2D-to-3D Goes Mainstream
AI systems from Stability AI and Shutterstock transform 2D images into 3D meshes in seconds

Collection of various toys, including a raccoon, a bus, and a tree.

Traditionally, building 3D meshes for gaming, animation, product design, architecture, and the like has been labor-intensive. Now the ability to generate 3D meshes from a single image is widely available.

What’s new: Two companies launched systems that produce a 3D mesh from one image. Stability AI released SF3D. Its weights and code are freely available to users with annual revenue under $1 million. Meanwhile, Shutterstock launched a service that provides a similar capability.  

How it works: Stability AI’s SF3D generates output in a half-second, while Shutterstock’s service takes around 10 seconds.

  • SF3D has five components: (1) a transformer that produces an initial 3D representation of an input image; (2) a model based on CLIP that uses the image to estimate how metallic and rough the object’s surface texture is; (3) a convolutional neural network that, given the transformer’s output, estimates how light reflects off the surface; (4) a model based on Deep Marching Tetrahedra (DMTet) that smooths the transformer’s output; and (5) a custom algorithm, built by the authors, that separates the 3D mesh from the surface texture map.
  • Shutterstock’s service, developed by TurboSquid (which Shutterstock acquired in 2021) and Nvidia, is due to launch this month. The company hasn’t disclosed pricing or how the system works. Users can specify an object and surroundings including light sources via an image or text description.
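
To make the five-stage flow described for SF3D easier to follow, here is a minimal Python sketch of how data might pass between the stages. It is an illustration only, not Stability AI’s code: every function and class name is hypothetical, and each stage is a trivial placeholder (a fixed tetrahedron, simple averages) standing in for the trained models.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Stand-in for an image tensor: rows of grayscale pixel values in [0, 1].
Image = List[List[float]]


@dataclass
class Material:
    metallic: float   # 0 = non-metal, 1 = metal
    roughness: float  # 0 = smooth, 1 = rough


@dataclass
class Mesh:
    vertices: List[Tuple[float, float, float]]
    faces: List[Tuple[int, int, int]]


@dataclass
class Asset:
    mesh: Mesh
    texture: Image
    material: Material
    illumination: List[float]


def encode_image(image: Image) -> List[float]:
    """Stage 1 placeholder: stands in for the transformer's initial 3D representation."""
    return [sum(row) / len(row) for row in image]


def estimate_material(image: Image) -> Material:
    """Stage 2 placeholder: stands in for the CLIP-based metallic/roughness estimate."""
    mean = sum(sum(row) for row in image) / sum(len(row) for row in image)
    return Material(metallic=mean, roughness=1.0 - mean)


def estimate_illumination(features: List[float]) -> List[float]:
    """Stage 3 placeholder: stands in for the CNN that estimates how light reflects."""
    return [max(features), min(features)]


def extract_mesh(features: List[float]) -> Mesh:
    """Stage 4 placeholder: returns a fixed tetrahedron instead of a DMTet-style mesh."""
    vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
    return Mesh(vertices, faces)


def unwrap_texture(mesh: Mesh, image: Image) -> Image:
    """Stage 5 placeholder: copies the input image as the separate texture map."""
    return [row[:] for row in image]


def image_to_asset(image: Image) -> Asset:
    """Chains the five placeholder stages in the order the article lists them."""
    features = encode_image(image)                   # (1) initial 3D representation
    material = estimate_material(image)              # (2) metallic and roughness
    illumination = estimate_illumination(features)   # (3) light reflection
    mesh = extract_mesh(features)                    # (4) smoothed mesh
    texture = unwrap_texture(mesh, image)            # (5) mesh separated from texture
    return Asset(mesh, texture, material, illumination)


if __name__ == "__main__":
    asset = image_to_asset([[0.2, 0.4], [0.6, 0.8]])
    print(len(asset.mesh.vertices), "vertices,", len(asset.mesh.faces), "faces")
```

The point of the sketch is the shape of the output: a mesh, a separate texture map, and material and lighting estimates bundled into one asset.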

Behind the news: These releases arrived amid a flurry of recent work on similar problems. Most of it builds on the Large Reconstruction Model (LRM), proposed by Adobe in late 2023, which produces a 3D mesh and surface texture from a single image in less than 5 seconds. Follow-up work trained LRM on real-world images in addition to the images of synthetic 3D meshes used in the original work and then reproduced LRM’s capabilities in an open source model. Further research extended the model to learn from generated videos. Stability AI’s new system addresses issues in its own previous work that was based on LRM.

Why it matters: SF3D replaces NeRF, a 2D-to-3D approach proposed in 2020 that serves as the basis for LRM and several other methods, with DMTet, which incorporates surface properties to achieve smoother meshes and better account for light reflecting off object surfaces.

We’re thinking: 3D generation is advancing rapidly. To ignore this technology would be a mesh-take!
