Waymo Launches AI Model for Autonomous Driving
How multimodal models can be applied to autonomous driving while also exploring an end-to-end approach
Waymo has launched a new AI research model for autonomous driving.
The End-to-End Multimodal Model for Autonomous Driving (EMMA) was specifically trained and fine-tuned for autonomous driving, leveraging Gemini's world knowledge to better understand complex road scenarios.
Waymo released a research paper on its new model, which the company said demonstrates how multimodal models can be applied to autonomous driving, while also exploring the pros and cons of the pure end-to-end approach.
“Building on top of Gemini and leveraging its capabilities, we created a model tailored for autonomous driving tasks such as motion planning and 3D object detection,” Waymo stated in the announcement.
The company said EMMA shows effective task transfer across key autonomous driving tasks. Trained jointly on planner trajectory prediction, object detection and road graph understanding, EMMA performed better than separate models trained for each task, Waymo said.
Waymo said the research suggests “a promising avenue of future research, where even more core autonomous driving tasks could be combined in a similar, scaled-up setup.”
“EMMA is research that demonstrates the power and relevance of multimodal models for autonomous driving,” said Drago Anguelov, Waymo vice president and head of research. “We are excited to continue exploring how multimodal methods and components can contribute towards building an even more generalizable and adaptable driving stack.”
Waymo’s research highlights three aspects of EMMA: its ability to process raw camera inputs and textual data to generate various driving outputs; a unified language space that lets EMMA make full use of Gemini’s world knowledge; and chain-of-thought reasoning that enhances decision-making and improves end-to-end planning.
Waymo said the significance of the research extends beyond autonomous vehicles. “By applying cutting-edge AI technologies to real-world tasks, we are expanding AI's capabilities in complex, dynamic environments.”