This week, at NVIDIA‘s highly anticipated annual GTC event, Jensen Huang announced the launch of Dynamo, its new open source inference framework for generative AI. Dubbed the “operating system for AI,” Dynamo is designed to optimize the speed and efficiency of AI models in order to facilitate large-scale deployments. Dynamo is the successor to NVIDIA’s Triton Inference Server, which it introduced in 2018. Since then, the exponential growth and increasing complexity of AI necessitated the development of more versatile orchestration technology:
Since the launch of Triton, open-source model sizes have grown dramatically—by almost 2,000x—and are now increasingly integrated into agentic AI workflows that require interaction with multiple other models. Deploying these models and workflows in production environments involves distributing them across multiple nodes, which demands careful orchestration and coordination across large fleets of GPUs. The complexity intensifies with the introduction of new distributed inference optimization methods, such as disaggregated serving, that split the response to a single user request across different GPUs. This makes collaboration and efficient data transfer between them even more challenging.
To tackle the challenges of distributed generative AI inference serving, we are releasing NVIDIA Dynamo. NVIDIA Dynamo is the successor to Triton, building on its success and offering a new modular architecture designed to serve generative AI models in multinode distributed environments.
Huang also announced that NVIDIA is working with 3 year-old AI startup Perplexity on implementations of Dynamo.
These announcements are significant because foundational OSS technologies like Dynamo that are backed by tech heavyweights like NVIDIA often give rise to new COSS ecosystems. And at the risk of sounding like a broken record, the fact that the announcement of Dynamo took center stage at Jensen Huang’s keynote lends credence to our view that the most promising realm of opportunity for COSS entrepreneurs in the AI landscape is the ‘pick and shovel’ technology that powers AI.
NVIDIA has made Dynamo available under the permissive Apache License.


Leave a Reply