
Last January, renowned A.I. researcher Fei-Fei Li took a leave of absence from Stanford to trade academia for startup life. Nearly two years later, her venture World Labs has unveiled its first commercial product: a world model called Marble. Marble can create 3D virtual worlds from text, images, video or even rough layouts. It builds on an earlier World Labs prototype that created 3D scenes from 2D images but had limitations, such as restricted interactive areas.
So-called world models like Marble are central to Li’s vision of the future of A.I. Because these models can reason about and interact with complex environments, they are essential for building A.I. that understands not just language, but the physical world itself. World Labs aims to imbue its systems with spatial intelligence, teaching them physical concepts humans intuitively grasp, such as parking a car without bumping the curb, catching a tossed object, or pouring a drink without looking.
“Today, leading A.I. technology such as large language models (LLMs) have begun to transform how we access and work with abstract knowledge,” Li wrote in a Nov. 10 blog post. “Yet they remain wordsmiths in the dark; eloquent but inexperienced, knowledgeable but ungrounded.”
An emphasis on visual and spatial intelligence has long been Li’s “North Star,” said the researcher, who in 2006 played a role in the release of ImageNet, a database of 15 million images that spurred the rise of deep learning. Li also co-directs Stanford’s Institute for Human-Centered A.I. and serves as a United Nations advisor on A.I. policy.
These days, however, Li is focused on World Labs, which has raised $230 million to pursue its spatial intelligence vision. Its backers include Radical Ventures, Andreessen Horowitz and Nvidia, as well as prominent tech figures such as Geoffrey Hinton, Eric Schmidt, Marc Benioff and Reid Hoffman.
Marble has been in beta for a few months and is now publicly available. It can create a full 3D world from a single image or text prompt. Users can also merge multiple environments by uploading several images within a prompt. According to World Labs, the model can combine photos or short videos of real-world spaces to generate immersive, realistic virtual worlds.
The model includes a range of editing tools that let users customize their creations. A feature called Chisel allows users to sketch out a coarse 3D layout, while other tools make it possible to expand worlds or build entirely new scenes within the same environment. Looking ahead, World Labs plans to develop world models with more interactive capabilities for both humans and A.I. agents.
While Li may be the most prominent figure developing world models, she isn’t the only one in the field. Google DeepMind and Nvidia have explored similar technologies with their Genie and Cosmos models, respectively. Yann LeCun, Meta’s chief A.I. scientist, is reportedly in the early stages of fundraising for his own world model startup.
Li said the applications of spatial intelligence tools like Marble will “span varying timelines.” The model is already being used by filmmakers, game designers and architects to enhance creative workflows. In the medium term, Li expects such technology to advance robotics, while future applications in science, healthcare, and education could enable breakthroughs in experiment simulation, drug discovery and immersive learning.
“Spatial intelligence will transform how we create and interact with real and virtual worlds—revolutionizing storytelling, creativity, robotics, scientific discovery, and beyond,” said Li. “This is A.I.’s next frontier.”

