
What exactly is NVIDIA’s new “COSMOS” platform?

New AI models, plenty more besides, and Jensen Huang claiming that “the ChatGPT moment for robotics is coming”? We have the details...

Illustration credit: NVIDIA

NVIDIA has released several new “world foundation models” into the wild under a “Cosmos” banner. They aim to help simulate real-world environments and predict outcomes based on text, image, or video input.

The big idea: Improved support for robotics from the company. (NVIDIA CEO Jensen Huang has made no secret of his interest in the intersection of AI and robotics – saying on an early 2024 call that there’s “just a giant suite of robotics companies that are emerging… warehouse robotics to surgical robotics to humanoid robotics, agriculture robotics companies.”)

The new NVIDIA AI models are available in three sizes: “Nano: Optimized for real-time, low-latency inference and edge deployment; Super: Designed as performant baseline models; Ultra: Focused on maximum quality and fidelity, ideal for distilling custom models,” NVIDIA said, unveiling them amid a flurry of other product news releases at CES.

They’re part of a broader NVIDIA “Cosmos” platform which includes the models (available on Hugging Face), “advanced tokenizers, guardrails and an accelerated video processing pipeline… to advance the development of physical AI systems such as autonomous vehicles (AVs) and robots.”

Tokenisation breaks down complex data into manageable units so that models can process it more efficiently. NVIDIA claims (quelle surprise) that Cosmos features discrete technology that does this particularly well.

Its tokenisers are available as open neural networks on GitHub and Hugging Face.
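For readers who want a feel for what “discrete” tokenisation means, here is a toy sketch in Python. It is not NVIDIA's method and uses nothing from the Cosmos tokenisers: the frame, the codebook and the function names (`patchify`, `nearest_code`, `tokenize`) are all made up for illustration. The idea is simply that a grid of pixel values is cut into small patches, and each patch is swapped for the index of the closest entry in a fixed codebook.

```python
from typing import List, Sequence

Patch = List[float]


def patchify(frame: Sequence[Sequence[float]], size: int) -> List[Patch]:
    """Split a 2D grid of pixel values into flattened size x size patches."""
    rows, cols = len(frame), len(frame[0])
    patches: List[Patch] = []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            patches.append(
                [frame[r + dr][c + dc] for dr in range(size) for dc in range(size)]
            )
    return patches


def nearest_code(patch: Patch, codebook: Sequence[Patch]) -> int:
    """Return the index of the codebook entry closest to the patch."""
    def squared_distance(code: Patch) -> float:
        return sum((p - q) ** 2 for p, q in zip(patch, code))

    return min(range(len(codebook)), key=lambda i: squared_distance(codebook[i]))


def tokenize(
    frame: Sequence[Sequence[float]], codebook: Sequence[Patch], size: int = 2
) -> List[int]:
    """Turn a frame into a short list of integer tokens, one per patch."""
    return [nearest_code(patch, codebook) for patch in patchify(frame, size)]


# Hypothetical 4x4 greyscale "frame" with values between 0 and 1.
FRAME = [
    [0.0, 0.1, 0.9, 1.0],
    [0.1, 0.0, 1.0, 0.9],
    [0.5, 0.5, 0.0, 0.0],
    [0.4, 0.6, 0.1, 0.0],
]

# Hypothetical three-entry codebook: dark, mid-grey and bright patches.
CODEBOOK = [[0.0] * 4, [0.5] * 4, [1.0] * 4]

tokens = tokenize(FRAME, CODEBOOK)
print(tokens)
```

Sixteen pixel values become four integers here. A real tokeniser learns its codebook with a neural network and works on video rather than a single tiny grid, but the compression principle is the same.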

NVIDIA's take on the release.

Customers, NVIDIA says, can use Cosmos to “fine-tune generalist models using smaller, targeted datasets to create specialists tailored for specific applications, such as autonomous driving or humanoid robotics or they can generate customized synthetic scenarios, such as night scenes with emergency vehicles or high-fidelity industrial robotics environments.”

The NVIDIA Cosmos models were “trained on 9,000 trillion tokens, including 20 million hours of robotics and driving data,” the company said, adding that they would offer developers “an easy way to generate massive amounts of photoreal, physics-based synthetic data to train and evaluate their existing models” and support “physically based interactions, object permanence, and high-quality generation of simulated industrial environments — like warehouses or factories…”

“The ChatGPT moment for robotics is coming. Like large language models, world foundation models are fundamental to advancing robot and AV development, yet not all developers have the expertise and resources to train their own,” said Jensen Huang in a canned statement. “We created Cosmos to democratize physical AI and put general robotics in reach of every developer.”
