Toyota Research Institute (TRI) has used generative AI to teach robots to make breakfast, or at least to perform the individual tasks involved. The robots pick up each task in a matter of hours, with minimal coding and debugging.
The robots are given a sense of touch and plugged into an AI model, which learns by observing a human demonstrating the task. The sense of touch is akin to the tactile sensitivity of human hands. It lets the AI model ‘perceive’ what its actions are doing, which enriches its understanding of the task. As a result, actions that are hard to learn from visual input alone become more manageable.
Ben Burchfiel, a manager at TRI, expressed enthusiasm about how the robots interact with their environments. Teaching begins with a human ‘teacher’ who demonstrates a set of skills. The AI model then learns from those demonstrations on its own over a span of hours, after which the robot can perform the new behaviors.
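For readers who want a feel for why touch helps when a robot learns from demonstrations, here is a deliberately tiny sketch. It is not TRI's method: it swaps in a simple nearest-neighbour lookup in place of a generative model, and every name and number in it (`DEMOS`, `imitate`, the heights and pressures) is a made-up assumption for illustration.

```python
# Toy illustration only: a nearest-neighbour "policy" that imitates demonstrations.
# All names and values are hypothetical; this is not TRI's model or data.
from math import dist

# Each demonstration pairs an observation with the action the human teacher took.
# Observation = (gripper height as seen by a camera, fingertip pressure).
DEMOS = [
    ((0.10, 0.0), "lower"),  # well above the pan, no contact
    ((0.02, 0.0), "lower"),  # looks close, but still no contact
    ((0.02, 0.8), "flip"),   # same camera view, spatula pressing on the pan
    ((0.02, 0.9), "flip"),
]

def imitate(observation, use_touch=True):
    """Return the action from the demonstration closest to the observation."""
    def features(obs):
        # Dropping the pressure reading simulates a vision-only robot.
        return obs if use_touch else obs[:1]

    nearest = min(DEMOS, key=lambda demo: dist(features(demo[0]), features(observation)))
    return nearest[1]
```

With touch, `imitate((0.02, 0.85))` and `imitate((0.02, 0.05))` pick different actions because the pressure reading separates "in contact" from "not yet in contact". With `use_touch=False`, the two observations look identical to the policy, so it cannot tell them apart and returns the same action for both.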
The researchers are also developing “Large Behavior Models” (LBMs) for robots, which are similar to the large language models (LLMs) that are used for natural language processing. LBMs learn by observation and can generalize to new tasks without being explicitly taught.
TRI has already trained robots to perform more than 60 challenging skills, such as pouring liquids, using tools, and manipulating deformable objects. It aims to increase this number to 1,000 by the end of 2024.
The sources for this piece include an article in The Verge.