People of ACM - Pieter Abbeel
August 23, 2022
In a recent interview, you noted that the Robot Learning Lab at UC Berkeley made significant breakthroughs around 2013–2014. Will you give us an example of an important advance from that time period and the insight(s) that led to it?
In late 2012 there was the big breakthrough from Geoff Hinton and his students, who showed it’s possible to train Deep Neural Networks on large amounts of annotated images to recognize what’s in them. This was quite the sea change for the computer vision field, which had traditionally (unsuccessfully) tried to solve image recognition with more hard-coded approaches, rather than this new pure learning, data-driven approach.
Ultimately this big breakthrough was about input-output pattern recognition. In my group we are interested in something beyond pattern recognition: robots learning behaviors, robots learning to complete tasks that can require many steps, e.g., getting up, running, assembling parts, cooking, cleaning, etc.
So, the question we asked ourselves at that time was: would deep neural nets also enable learning robot behaviors? Specifically, we investigated how deep neural nets could improve the performance of reinforcement learning—which learns by trial and error, rather than from annotated input-output examples. And, indeed, we were able to develop a Deep Reinforcement Learning algorithm called Trust Region Policy Optimization, which enabled training large neural networks with reinforcement learning. The early results included learning to walk, run, slither, and hop. In parallel with our work at Berkeley, DeepMind at that time was developing similar ideas to learn to play Atari games and Go. It was a really special time, when between Berkeley and DeepMind we were able to showcase very fundamental early advances in Deep Reinforcement Learning, a field that remains very active today, and that ultimately gets at a very core aspect of intelligence: the ability to improve from your own experiences.
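For the mathematically curious, the core idea of Trust Region Policy Optimization can be stated compactly. The following is a simplified rendering of the optimization problem from the original paper, with \(\pi_\theta\) the policy, \(\hat{A}_t\) an estimated advantage, and \(\delta\) the trust-region size:

```latex
\max_{\theta} \;\; \mathbb{E}_t\!\left[\frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}\,\hat{A}_t\right]
\quad \text{subject to} \quad
\mathbb{E}_t\!\left[D_{\mathrm{KL}}\!\left(\pi_{\theta_{\text{old}}}(\cdot \mid s_t)\,\big\|\,\pi_\theta(\cdot \mid s_t)\right)\right] \le \delta
```

The KL-divergence constraint keeps each policy update inside a "trust region," which is what makes it possible to train large neural network policies stably.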
What is an interesting challenge you and your colleagues are working on right now?
When I reflect on the last ten years of progress in Deep Reinforcement Learning, I observe tremendous progress on what I would call “Specialist Reinforcement Learning (RL) Agents.” These are agents that have learned through their own trial and error to solve very specific tasks, e.g., a specific Atari game, a specific robot locomotion skill, or the game of Go. If we care about such specific tasks, and if we are willing to spend enough compute to let the agent train for a very long time, then such an agent will often exceed human-level performance on its specific task.
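To make "learning by trial and error" concrete, here is a minimal, self-contained sketch of tabular Q-learning on a toy chain environment. The environment and hyperparameters are illustrative inventions for this example, not taken from Abbeel's work:

```python
import random

# Toy "chain" environment: states 0..N-1, actions 0 (left) / 1 (right).
# Reward 1.0 only for reaching the rightmost state; episodes are fixed-length.
N_STATES, N_ACTIONS, EPISODE_LEN = 10, 2, 20

def step(state, action):
    next_state = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

# Tabular Q-values: the agent improves purely from its own experience.
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.1, 0.95, 0.1  # learning rate, discount, exploration

for episode in range(500):
    state = 0
    for _ in range(EPISODE_LEN):
        # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
        if random.random() < epsilon:
            action = random.randrange(N_ACTIONS)
        else:
            action = max(range(N_ACTIONS), key=lambda a: Q[state][a])
        next_state, reward = step(state, action)
        # Temporal-difference update toward the bootstrapped target.
        target = reward + gamma * max(Q[next_state])
        Q[state][action] += alpha * (target - Q[state][action])
        state = next_state

print("Learned policy:", [max(range(N_ACTIONS), key=lambda a: Q[s][a]) for s in range(N_STATES)])
```

A specialist agent of the kind described above is this loop scaled up: a deep network replaces the table, and millions of environment steps replace the 500 toy episodes.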
However, while this can be useful for specific tasks, when we look at human intelligence, we see something quite different: humans are able to (i) learn faster and (ii) master a very wide range of skills. So, the challenge we are working on right now is: how do we achieve AI that makes for such Generalist RL Agents? And how can we do this without humans having to define a wide range of tasks for the agent to train on? I’d like to minimize human effort involved in training these Generalist RL Agents. So can these agents come up with their own task definitions, their own way of deciding what’s interesting to explore and try out in the world? And can this be done such that, when later we want to give this agent a specific task, it’s now capable of mastering that specific task very quickly, because it can build on the foundations it has learned on its own? That’s the most interesting academic research question to me right now.
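One family of ideas in this direction (offered as an illustrative sketch, not a description of Abbeel's specific research) is intrinsic motivation: the agent rewards itself for visiting states it has rarely seen, so exploration requires no human-defined task. A count-based novelty bonus is perhaps the simplest version:

```python
from collections import defaultdict
import math

# Count-based exploration bonus: reward inversely proportional to how often
# a state has been visited, so novel states look "interesting" by construction.
visit_counts = defaultdict(int)

def intrinsic_reward(state, beta=0.1):
    """Bonus that decays as a state becomes familiar (beta is illustrative)."""
    visit_counts[state] += 1
    return beta / math.sqrt(visit_counts[state])

# During reward-free pretraining, the agent maximizes only this bonus.
# Later, a task-specific reward can be layered on top, e.g.:
#   total_reward = extrinsic_reward + intrinsic_reward(state)
```

The hope, matching the question posed above, is that an agent trained this way builds reusable foundations, so a later task-specific reward can be mastered quickly.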
A core focus of Covariant’s work is improving logistics at warehouses. Why is this important to the global economy? Because of AI-infused robots, how might a shipping warehouse function differently in 20 years?
If you look at today’s warehouses, you’ll see most of the “legwork,” i.e., bringing (often bulk) goods to their storage locations and retrieving them from there, has already been automated—through the use of conveyors and mobile robots. However, the “hand work” of pick-and-place operations, i.e., locating the correct individual item on a shelf or inside a bin, grabbing it, scanning it, and placing it into the correct outbound bin, shelf location, or conveyor, largely hasn’t been automated. Why that difference? It turns out the legwork can be automated with traditional robotic automation approaches. But the hand work requires a new generation of AI Robotics capabilities—robots that can learn, see, and react to what they see to get a task done. At Covariant we are building what we call the Covariant Brain, enabling robots to perform these manual tasks for the first time. And, indeed, in some of the more advanced warehouse facilities today, you’ll now see Covariant robots perform pick-and-place operations, enabling more efficient and reliable operations. The Covariant Brain provides these robots with the ability to understand the visual scenes in front of them, reliably identify the item to be picked, and execute the appropriate motor control for the pick-and-place.
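Covariant has not published its internal APIs, but the perceive-identify-act loop described above might be sketched roughly as follows. Every class and function name here is a hypothetical placeholder, not Covariant's actual interface:

```python
from dataclasses import dataclass

@dataclass
class Grasp:
    item_id: str
    position: tuple   # (x, y, z) in the robot's frame -- illustrative
    confidence: float

def pick_and_place_cycle(camera, model, robot, target_sku, outbound_bin):
    """One hypothetical cycle: see the bin, pick the right item, place it.

    `camera`, `model`, and `robot` stand in for a depth camera, a learned
    perception/grasping model, and a motion controller, respectively.
    """
    image = camera.capture()                  # 1. understand the visual scene
    candidates = model.propose_grasps(image)  # 2. segment items, propose grasps
    # 3. reliably identify the item to be picked
    matches = [g for g in candidates if g.item_id == target_sku]
    if not matches:
        return False  # e.g., escalate to a human or re-image the bin
    best = max(matches, key=lambda g: g.confidence)
    # 4. execute the pick-and-place with learned motor control
    robot.pick(best.position)
    robot.scan(best.item_id)
    robot.place(outbound_bin)
    return True
```

The hard part, and the reason this "hand work" resisted traditional automation, is inside `model.propose_grasps`: generalizing across millions of item shapes, materials, and clutter configurations.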
Supply chains and logistics are probably not something most people gave much thought to—until the Covid-19 pandemic hit and we were (for a while) seeing all kinds of shortages in stores. Having robots help out with warehouse operations can not only make those operations more efficient, but also more robust to situations where human labor might not be available. Ensuring robustness has been top of mind for many sellers. Beyond that, the warehousing industry has seen tremendous growth over recent years, largely driven by the rise of e-commerce. And having robots and humans work together to serve the logistics behind all this seems the most effective way forward to support this growth and, really, to support ever-increasing consumer expectations.
Covariant is also interested in finding ways to make manufacturing more “flexible.” Will you explain what you mean by this and how using AI might achieve this goal?
Correct. And it’s actually worth thinking about this even more broadly than that. What we have focused on so far at Covariant in terms of commercialization has been pick-and-place operations in warehouses; this includes applications such as order fulfillment, parcel sortation and singulation, pick-to-light, palletization and depalletization, induction, etc. And our robots are capable of doing these across a very wide range of industries, including clothing and apparel, cosmetics, health and beauty, mechanical and electronic parts, and food.
The reason our robots are so capable across such a wide range of industries is the Covariant Brain, which at its core is a very large neural network trained on very large amounts of data to learn to perform these tasks. Now it turns out that the way we develop the Covariant Brain is not specific to warehouse operations. Our approach could also readily apply to flexible manufacturing, automation in agriculture, and, as far as I can tell, really any semi-structured environment. What we are developing is a very general technology, even if in terms of commercialization (and accordingly, where we collect our data for now) we currently have a more specific focus.
Given that building robots with AI draws on different specializations within computer science and engineering, what advice would you offer a younger colleague about what training they should take up to work in your field?
There is indeed a range of expertise required, but it’s still quite manageable. In terms of foundations, basic mathematics such as calculus, probability, and linear algebra is very important, as is optimization. Taking physics classes can be very helpful, as it teaches you the skill of abstracting real-world problem settings into equations. AI and Robotics progress remains very experimentally driven, so programming skills are critical. These days that often means Python as well as a deep learning framework (e.g., PyTorch or TensorFlow). If you are currently attending a university, you can probably find pretty good coverage of these topics in the course offerings. But if you are going to self-study, you’ll be fine all the same: many professors make their course materials available online for anyone to study. For example, at Berkeley you could check out CS188, CS182, CS285, CS287, and CS294-158 (all materials freely available). If you master all five of these, you’ll have a pretty deep AI expertise.
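As a small taste of the framework fluency he describes, here is a minimal PyTorch training loop: a generic supervised-learning sketch on random stand-in data, just to show the basic pattern rather than any particular course assignment:

```python
import torch
import torch.nn as nn

# A tiny two-layer network: 4 input features -> 3 output classes.
model = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 3))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Random placeholder data; in practice this would be a real dataset.
x = torch.randn(64, 4)
y = torch.randint(0, 3, (64,))

for step in range(100):
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(x), y)  # forward pass + loss
    loss.backward()              # backpropagation
    optimizer.step()             # gradient update

print(f"final loss: {loss.item():.4f}")
```

Nearly every deep learning project, from image recognition to the RL methods discussed earlier, is built on variations of this forward-backward-update pattern.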
Pieter Abbeel is a Professor at the University of California, Berkeley, where he is Director of the Berkeley Robot Learning Lab and Co-Director of the Berkeley Artificial Intelligence Research (BAIR) Lab. He is also Co-Founder, President, and Chief Scientist at Covariant, an AI robotics company. Abbeel pioneered teaching robots to learn from human demonstrations (“apprenticeship learning”) and through their own trial and error (“reinforcement learning”), approaches that have formed the foundation for the next generation of robotics. He is also recognized as an ambassador for his field: on his podcast, The Robot Brains, he interviews an array of leading researchers and practitioners working in AI and robotics.
Abbeel is the recipient of the 2021 ACM Prize in Computing for contributions to robot learning. The ACM Prize in Computing recognizes early-to-mid-career computer scientists whose research contributions have fundamental impact and broad implications.