Researchers at MIT’s Electrical Engineering and Computer Science (EECS) department are using large language models (LLMs) to equip robots with the “common sense knowledge” they need to be helpful around the house. The team has found a way to connect a robot’s physical motions with AI models that have so far been used mainly to generate content. The approach lets a robot split a task into subtasks and adjust to unexpected events without starting over, so designers don’t have to program in a fix for every possible eventuality.
The MIT study, led by graduate student Yanwei Wang, treats the steps of a household task the way an LLM treats words in a sentence. The system was tested with a robotic arm scooping marbles from one bowl and pouring them into another. The LLM produces a sequence of subtask labels such as “reach” and “pour,” and each label is mapped to the robot’s physical movements. That mapping is handled by a “grounding classifier,” an algorithm that learns to identify which subtask the robot is performing based on where the arm is in space.
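To make the idea concrete, here is a minimal sketch of what a grounding classifier could look like: a model that maps the arm’s continuous state to one of the LLM’s subtask labels. The subtask names, state features, and use of a simple logistic-regression classifier are illustrative assumptions, not the authors’ actual implementation.

```python
# Hypothetical grounding-classifier sketch: map the robot's continuous state
# (e.g., end-effector position and gripper angle) to a discrete subtask label
# drawn from the LLM-generated plan. All names and data here are stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression

SUBTASKS = ["reach", "scoop", "transport", "pour"]  # labels from the LLM plan

# Toy training data standing in for states recorded while a human guides the
# arm through the task, each tagged with the subtask being performed.
rng = np.random.default_rng(0)
states = rng.normal(size=(400, 4))            # [x, y, z, gripper_angle]
labels = rng.integers(0, len(SUBTASKS), 400)  # stand-in subtask annotations

classifier = LogisticRegression(max_iter=1000).fit(states, labels)

def current_subtask(state: np.ndarray) -> str:
    """Identify which subtask the arm appears to be executing right now."""
    return SUBTASKS[int(classifier.predict(state.reshape(1, -1))[0])]

print(current_subtask(np.array([0.1, -0.3, 0.5, 0.0])))
```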
The team started by guiding the arm through the scooping task and then used a pre-trained LLM to list the subtasks involved. The algorithm matched those subtasks to the robot’s physical movements. With that done, the researchers let the robot carry out the task on its own, scooping marbles and dumping them into another bowl. Once it could do so reliably, the team began interrupting the arm and nudging it out of alignment. Traditional control algorithms would have needed to go back to a known starting point, but the LLM-powered robot could work out which subtask it was in after each disruption and simply pick up where it left off. This work could lead to household helpers that adapt to their environment and cope better with external complications.
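A rough sketch of that recovery behavior, building on the classifier above, is shown below. Instead of restarting the plan after a nudge, the robot queries the grounding classifier to see which subtask the arm actually landed in and resumes from there. The `Robot` class and its methods are hypothetical placeholders, not the authors’ API.

```python
# Hypothetical recovery loop: after a disturbance, identify the subtask the
# arm is actually in and resume the LLM-generated plan from that point
# rather than going back to the beginning.
class Robot:
    def state(self):
        """Return the current arm state, e.g., [x, y, z, gripper_angle]."""
        raise NotImplementedError

    def run_subtask(self, name: str):
        """Execute the low-level motion associated with one subtask."""
        raise NotImplementedError

PLAN = ["reach", "scoop", "transport", "pour"]  # subtask sequence from the LLM

def execute_with_recovery(robot: Robot, current_subtask) -> None:
    step = 0
    while step < len(PLAN):
        robot.run_subtask(PLAN[step])
        observed = current_subtask(robot.state())  # grounding-classifier query
        if observed != PLAN[step]:
            # A nudge pushed the arm into another subtask's region of space;
            # move the plan pointer there instead of restarting the task.
            step = PLAN.index(observed)
        else:
            step += 1
```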
read more > www.extremetech.com