DeepMind says it has created an AI model called RoboCat that can perform a variety of tasks across different models of robotic arms. That isn’t particularly novel on its own. However, DeepMind asserts that the model is the first to be able to solve and adapt to multiple tasks while doing so on different, real-world robots.
According to Alex Lee, a research scientist at DeepMind and a team member on the RoboCat project, “we demonstrate that a single large model can solve a diverse set of tasks on multiple real robotic embodiments and can quickly adapt to new tasks and embodiments.”
RoboCat, which drew inspiration from Gato, a DeepMind AI model capable of analyzing and acting on text, images, and events, was trained on image and action data collected from robots in both simulation and real life. The data, Lee says, came from a combination of other robot-controlling models inside of virtual environments, humans controlling robots and previous iterations of RoboCat itself.
Demonstrations of a Task
To train RoboCat, researchers at DeepMind first collected between 100 and 1,000 demonstrations of a task or robot using a robotic arm controlled by a human. (Think having a robot arm pick up gears or stack blocks.) Then, they fine-tuned RoboCat on the task, creating a specialized “spin-off” model that practiced on the task an average of 10,000 times.
Using both the data generated by the spin-off models and the demonstration data, the researchers continuously grew RoboCat’s training dataset and trained subsequent versions of RoboCat on it.
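The training cycle described above can be sketched in a few lines of Python. This is a toy illustration only, not DeepMind's code: the `Model` class and the functions `collect_demonstrations`, `fine_tune`, `self_practice`, `train` and `self_improvement_cycle` are hypothetical stand-ins, and "learning" here just means recording which tasks appear in the data. The default counts (100 demonstrations, 10,000 practice runs) are taken from the figures quoted in the article.

```python
from dataclasses import dataclass


@dataclass
class Model:
    """Hypothetical stand-in for a policy: it only tracks the tasks it has data for."""
    version: int
    known_tasks: frozenset = frozenset()


def collect_demonstrations(task, n):
    # Assumption: each human-controlled demonstration is one labeled episode.
    return [{"task": task, "source": "human"} for _ in range(n)]


def fine_tune(generalist, demos, task):
    # The "spin-off" keeps what the generalist knows and adds the new task.
    return Model(generalist.version, generalist.known_tasks | {task})


def self_practice(specialist, task, n):
    # The spin-off practices the task and logs every attempt as new data.
    return [{"task": task, "source": "spin-off"} for _ in range(n)]


def train(dataset, version):
    # Toy trainer: the new generalist "knows" every task present in the dataset.
    return Model(version, frozenset(ep["task"] for ep in dataset))


def self_improvement_cycle(generalist, dataset, task, n_demos=100, n_practice=10_000):
    demos = collect_demonstrations(task, n_demos)
    specialist = fine_tune(generalist, demos, task)
    practice = self_practice(specialist, task, n_practice)
    grown = dataset + demos + practice  # the training set only ever grows
    return train(grown, generalist.version + 1), grown


model, data = Model(version=0), []
for task in ["stack_blocks", "pick_up_gears"]:
    model, data = self_improvement_cycle(model, data, task)
```

Each pass through the loop adds one task's worth of human and self-generated episodes to the dataset and produces the next version of the generalist model, which is the feedback loop the researchers describe.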
The final version of RoboCat was trained on a total of 253 tasks and benchmarked on a set of 141 variations of these tasks, both in simulation and in the real world.
DeepMind claims that, after observing 1,000 human-controlled demonstrations collected over several hours, RoboCat learned to operate different robotic arms.
Although RoboCat had received training on four types of robots with two-pronged arms, the model demonstrated the ability to adapt to a more complex arm equipped with a three-fingered gripper and twice the number of controllable inputs.