DeepMind Launches Genie 3 with Dynamic Scene Generation from Text Prompts

  • Genie 3 generates interactive 3D scenes in real time, with consistent physics and visual memory that persists for minutes.
  • It builds each frame on the fly with no explicit 3D representation, diverging from approaches like NeRF and Gaussian splatting.

Google DeepMind has released Genie 3, a new world model that generates interactive 3D scenes from text prompts. Users can move through these AI-generated scenes in real time, with visuals running at 720p and staying consistent for minutes, preserving layout, objects, and environmental details even when revisiting earlier locations.

Rather than relying on explicit 3D representations such as NeRF or Gaussian splatting, Genie 3 generates each frame autoregressively, conditioning on its previous outputs and ongoing user actions. This lets the scene evolve and respond to input in real time.
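To make the frame-by-frame idea concrete: DeepMind has not published Genie 3's architecture or code, so the Python sketch below is purely illustrative. `WorldModel`, `predict_frame`, and the action encoding are hypothetical stand-ins for the general autoregressive pattern, in which each new frame is conditioned on a window of recent frames plus the latest user input.

```python
import numpy as np

class WorldModel:
    """Hypothetical stand-in for a learned frame predictor (not DeepMind's API).
    A real model would run a large neural network; this one just blends the
    previous frame with noise so the sketch stays runnable."""

    def __init__(self, context_len: int = 64):
        # Only a window of recent frames is kept as context, which is why
        # long-horizon consistency is the hard part for such models.
        self.context_len = context_len

    def predict_frame(self, context: list, action: np.ndarray) -> np.ndarray:
        last = context[-1]
        noise = np.random.rand(*last.shape).astype(np.float32)
        return 0.9 * last + 0.1 * noise  # placeholder for model inference


def interactive_loop(model: WorldModel, get_action, n_steps: int = 240) -> list:
    # Start from an initial frame (in Genie 3, produced from the text prompt).
    frames = [np.zeros((720, 1280, 3), dtype=np.float32)]
    for _ in range(n_steps):
        action = get_action()                  # e.g. movement or camera input
        context = frames[-model.context_len:]  # condition on recent frames only
        frames.append(model.predict_frame(context, action))
    return frames


frames = interactive_loop(WorldModel(), get_action=lambda: np.array([0.0, 1.0]), n_steps=10)
print(len(frames), frames[-1].shape)  # 11 (720, 1280, 3)
```

The property this loop illustrates is that there is no persistent 3D scene graph: everything the model "remembers" lives in its frame context, which is why minute-scale consistency is a notable result.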

The model introduces promptable world events, letting users change the weather, objects, or characters mid-interaction, and it maintains visual memory on the scale of a minute, so scenes stay consistent when revisited. DeepMind has tested the system for embodied-agent research and sees potential applications in education, simulation, and evaluating agent performance.

Genie 3 is currently available in limited preview for select researchers and creators, with broader testing planned as development progresses.



🌀 Tom’s Take:

Genie 3 represents the future of gaming and virtual worlds. But it goes beyond entertainment. It unlocks dynamic simulations that will be key for training agents, scenario planning, and robotics. Generative world models like these are shaping the next interface layer and moving us closer to AGI.


Source: Google DeepMind