Google Debuts Third-Gen AI for Turning Product Photos into 3D Shopping Models

Source: Midjourney (AI-generated image, not an official product representation)
  • Google's new Veo-powered system can generate 360° product spins from as few as three images.
  • The latest approach generalizes across categories like furniture, apparel, and electronics without requiring precise camera pose estimation.

Google has introduced a generative AI breakthrough that transforms flat product photos into interactive 3D shopping experiences. Built on Veo, the company’s advanced video generation model, this third-generation approach can produce high-fidelity, 360° spins of products from minimal image input. The technology is already live on Google Shopping, powering dynamic views of items such as shoes, furniture, and electronics.

Previous iterations relied on Neural Radiance Fields (NeRFs) and view-conditioned diffusion models, which required more input images and complex camera pose estimation. These methods struggled with thin or finely detailed objects like sandals and heels. Veo simplifies the process: trained on a curated dataset of synthetic 3D assets, it learns to generate realistic video views from product images, capturing nuanced material and lighting effects.

Veo’s approach works without precise camera pose estimation, which simplifies the process and improves reliability. With just a few well-chosen images, ideally three that together show most of the product, it can create realistic, interactive 3D models ready for online shopping. What distinguishes it is its capacity to capture how light reflects off various materials and textures, adding depth and realism to the result. For retailers, this makes it practical to create immersive shopping experiences at scale, bringing the sensation of in-store browsing to the screen.


🌀 Tom's Take:

3D is a fundamental component of spatial computing experiences, yet for many brands, creating 3D assets remains time-consuming and costly. Google’s use of generative AI to transform existing assets, like product photos, into shoppable 3D renderings represents a breakthrough that can enable more brands to participate in this next wave of computing.


Source: Google Research Blog