Google DeepMind Unveils Genie 3 for Real-Time Interactive World Generation

Source: Google DeepMind
  • Genie 3 creates 720p interactive environments from text prompts, supporting real-time navigation and world changes.
  • The model is available in a limited research preview for use in agent training, media creation, and simulation research.

Google DeepMind has announced Genie 3, a general-purpose world model that simulates interactive environments in real time. Users can navigate these dynamic scenes and trigger changes using both movement and text-based inputs, while the environments remain visually and physically consistent for several minutes. The model is part of DeepMind’s broader research into world simulation and is being explored for applications in agent training, generative media, and open-ended learning.

Source: YouTube / Google DeepMind

Genie 3 generates environments from text prompts by building scenes one frame at a time while referencing previous frames to stay consistent. It runs at 720p resolution and 24 frames per second, allowing users to navigate in real time. Unlike earlier versions, Genie 3 supports both movement-based interaction and “promptable world events,” where users can change conditions mid-scene, such as adding objects or altering the weather. The model does not rely on pre-built 3D assets; instead, it builds each frame dynamically, using memory of the evolving environment to maintain visual and physical continuity over time.
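The frame-by-frame process described above can be illustrated with a toy loop. This is only a sketch under stated assumptions, not DeepMind's implementation: the `Frame` fields, the `next_frame` and `simulate` functions, the action names, and the one-minute memory window are all hypothetical stand-ins. The only figure taken from the announcement is the 24 frames per second rate, which leaves roughly 41.7 ms to produce each frame.

```python
from collections import deque
from dataclasses import dataclass

FPS = 24                      # frame rate reported for Genie 3
FRAME_BUDGET_MS = 1000 / FPS  # about 41.7 ms available to produce each frame


@dataclass(frozen=True)
class Frame:
    index: int
    position: int  # toy stand-in for the viewer's location
    weather: str   # toy stand-in for the state of the world


def next_frame(history, action, event=None):
    """Hypothetical stand-in for the model: derive the next frame from past frames."""
    last = history[-1]
    step = {"forward": 1, "back": -1}.get(action, 0)
    return Frame(
        index=last.index + 1,
        position=last.position + step,
        # World state carries over unless a prompted event changes it.
        weather=event if event is not None else last.weather,
    )


def simulate(actions, events, memory_frames=FPS * 60):
    """Generate frames one at a time, each conditioned on a bounded window of earlier frames.

    memory_frames is an assumed value (one minute of frames), chosen only for illustration.
    """
    history = deque([Frame(0, 0, "clear")], maxlen=memory_frames)
    for i, action in enumerate(actions):
        history.append(next_frame(history, action, events.get(i)))
    return list(history)


# Two steps forward, then a prompted weather change while stepping back.
frames = simulate(["forward", "forward", "back", "stay"], {2: "rain"})
```

In this toy version, the weather change introduced at the third step persists into later frames because each new frame is derived from the ones before it, which mirrors the continuity the article describes in a very simplified form.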

Genie 3 is being offered as a limited research preview to select academics and creators. Google cites virtual training, immersive media, and AI agent development as a few of the potential applications. DeepMind is using this controlled rollout to gather interdisciplinary feedback and ensure responsible development as the technology evolves.


🌀 Tom’s Take:

Genie 3 marks a shift from passive video generation to real-time, interactive world simulation. This is a key step toward using generative models as environments, not just outputs.

