3D for everyone? Nvidia’s Magic3D can create 3D models from text

A poison dart frog rendered as a 3D model by Magic3D.


On Friday, researchers at Nvidia announced Magic3D, an AI model that can create 3D models from textual descriptions. After entering a prompt like “A blue poison-dart frog is sitting on a water lily,” Magic3D creates a 3D mesh model, complete with colorful textures, in about 40 minutes. With modifications, the resulting model can be used in video games or CGI art scenes.

In its academic paper, Nvidia frames Magic3D as a response to DreamFusion, a text-to-3D model that Google researchers announced in September. In the same way that DreamFusion uses a text-to-image model to create a 2D image that is then optimized into volumetric NeRF (neural radiance field) data, Magic3D uses a two-stage process that takes a coarse model created at low resolution and optimizes it at high resolution. According to the paper’s authors, the resulting Magic3D method can create 3D objects two times faster than DreamFusion.

Magic3D can also perform prompt-based editing of 3D meshes. Given a low-resolution 3D model and a base prompt, it is possible to alter the text to change the resulting model. The authors of Magic3D also demonstrate preserving the same subject across several generations (a concept often called coherence) and applying the style of a 2D image (such as a Cubist painting) to a 3D model.

Nvidia did not publish any Magic3D code with its academic paper.

The ability to create 3D from text feels like a natural evolution of today’s diffusion models, which use neural networks to synthesize novel content after intensive training on a large body of data. In 2022 alone, we saw the emergence of capable text-to-image models such as DALL-E and Stable Diffusion, as well as rudimentary text-to-video generators from Google and Meta. Google also debuted the aforementioned text-to-3D model DreamFusion two months ago, and since then, people have adapted similar techniques to work as an open-source model based on Stable Diffusion.

As for Magic3D, the researchers behind it hope it will allow anyone to create 3D models without the need for special training. Once refined, the resulting technology could accelerate video game (and VR) development and perhaps eventually find applications in special effects for film and TV. At the end of their paper, they write, “We hope that with Magic3D, we can democratize 3D synthesis and unleash everyone’s creativity in creating 3D content.”