Gemini Robotics Models Unlock General-Purpose Dexterity in Real-World Robots

  • Google DeepMind’s new Gemini Robotics models enable robots to perform unseen, multi-step tasks with high dexterity.
  • The models integrate multimodal inputs with physical action, supporting embodied reasoning and vision-language-action control.

Google DeepMind has introduced a new family of Gemini Robotics models that enable robots to understand and perform complex physical tasks using natural language. These models are fine-tuned versions of Gemini 2.0, adapted with robot-specific data to allow real-time interactions with unfamiliar objects and environments. One test showed a robot executing a “slam dunk” with a toy hoop and ball it had never seen, based solely on the verbal prompt.

“Our mission is to build embodied AI to power robots that help you with everyday tasks in the real world,” said Carolina Parada, Head of Robotics at Google DeepMind, on the Google Blog. “Eventually, robots will be just another surface on which we interact with AI, like our phones or computers — agents in the physical world.”

The lineup includes Gemini Robotics and Gemini Robotics-ER. The former is a vision-language-action model that advances robotic dexterity by solving multi-step tasks with smooth execution. The latter, built on Gemini 2.0 Flash, focuses on embodied reasoning: recognizing objects, predicting trajectories, and generating executable actions. To foster generalization, the team trained on a wide range of tasks rather than drilling a single task repeatedly.

Robots powered by these models have packed lunchboxes, folded an origami fox, and cleaned whiteboards. They’ve been tested on various embodiments—from academic research arms to commercial humanoid systems—demonstrating versatility across tasks and platforms.

🌀 Tom's Take:

The rise of humanoid robots highlights how spatial intelligence, powered by sensors and perception systems, is essential to real-world AI. Google DeepMind showcases this with Gemini Robotics, demonstrating how multimodal models give embodied agents a richer, more actionable understanding of the world.


Source: Google Blog