MIT researchers have developed a generative AI-driven approach for planning long-horizon visual tasks, such as robot navigation, that is about twice as effective as some existing techniques. Their method uses a specialized vision-language model to perceive the scene in an image and simulate the actions needed to reach a goal. A second model then translates those simulations into a standard programming language for planning problems and refines the solution.
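The article does not name the planning language, but the "standard programming language for planning problems" is commonly PDDL. Below is a minimal, hypothetical sketch of such a two-stage pipeline under that assumption: the vision-language model is mocked with hard-coded output (`mock_vlm_perceive` is an invented name, not the researchers' API), and a second stage emits a PDDL problem description from the perceived scene.

```python
# Hypothetical sketch of the two-stage pipeline described above.
# Stage 1 (a vision-language model) is mocked here: a real system would
# perceive the scene in an image and simulate candidate actions.
# Stage 2 translates the perceived scene into a PDDL problem file,
# assuming PDDL is the "standard planning language" the article means.

def mock_vlm_perceive(image_path: str, goal: str) -> dict:
    """Stand-in for the VLM: returns objects and facts it 'saw'."""
    return {
        "objects": ["robot", "door", "hallway", "kitchen"],
        "facts": ["(at robot hallway)", "(closed door)"],
        "goal": "(at robot kitchen)",
    }

def to_pddl_problem(scene: dict, name: str = "nav-task") -> str:
    """Stage 2: emit a PDDL problem description from the perceived scene."""
    objects = " ".join(scene["objects"])
    init = "\n    ".join(scene["facts"])
    return (
        f"(define (problem {name})\n"
        f"  (:domain navigation)\n"
        f"  (:objects {objects})\n"
        f"  (:init {init})\n"
        f"  (:goal {scene['goal']})\n"
        f")\n"
    )

scene = mock_vlm_perceive("kitchen.jpg", "go to the kitchen")
print(to_pddl_problem(scene))
```

The emitted problem file would then be handed to an off-the-shelf symbolic planner, which is where the "refines the solution" step would iterate.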
Originally published by Tech Xplore https://techxplore.com/rss-feed/robotics-news/