NVIDIA Open-Sources Cosmos3 Multimodal Model, Making Physical AI Accessible to All Developers
Source: AtomGit Open Source Community
NVIDIA today officially open-sourced Cosmos3, the world’s first fully multimodal large model built for the physical world. It bridges the gap between the digital and physical realms and sharply lowers the barrier to entry for developing embodied intelligence, autonomous driving and industrial robots.
Core Architecture
Built on a hybrid Transformer architecture, Cosmos3 natively integrates five modalities: text, image, video, ambient audio and motion. Handling all five in a single model is meant to eliminate the data latency and synchronization errors that arise when separate single-modality models are stitched together.
Two Versions for Diverse Scenarios
Super Version (64.6 billion parameters): Designed for high-precision use cases in industrial scenarios, autonomous driving and humanoid robots.
Nano Version (15.7 billion parameters): Optimized for lightweight edge deployment with ultra-low latency.
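To put the two parameter counts in deployment terms, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy. The parameter counts come from the figures above; the storage precisions (fp16/bf16, int8, int4) are assumptions for illustration, since the article does not say which formats Cosmos3 ships in, and the estimate ignores activations, caches and runtime overhead.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Parameter counts are taken from the article; the precisions are assumptions.

PARAMS = {
    "Super": 64.6e9,  # 64.6 billion parameters
    "Nano": 15.7e9,   # 15.7 billion parameters
}

# Common storage widths in bytes per parameter (assumed, not from the article).
BYTES_PER_PARAM = {
    "fp16/bf16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_gb(num_params: float, bytes_per_param: float) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, n in PARAMS.items():
        for precision, width in BYTES_PER_PARAM.items():
            print(f"{name:5s} {precision:9s} {weight_gb(n, width):6.1f} GB")
```

Under these assumptions the Super version needs about 129 GB for 16-bit weights, while the Nano version needs about 31 GB, or under 8 GB at 4 bits, which is consistent with positioning Nano for edge hardware.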
Industry Impact
Now that the model is open source, developers worldwide can use it commercially at no cost. Physical AI is no longer confined to tech giants’ labs and is opening up to collaborative development by everyone. The robotics, VR/AR and industrial automation sectors are poised for a burst of innovation.