NVIDIA NeMo

NVIDIA NeMo is a modular suite of APIs and libraries that helps developers manage the AI agent lifecycle: building, deploying, and optimizing AI agents at scale.

Easy-to-use containerized APIs for data preparation, model customization, evaluation, guardrailing, and continuous optimization of AI agents.
Flexible open-source framework for end-to-end training and development of generative AI models, scaling seamlessly from a single GPU to multi-node clusters.
Open-source toolkit for evaluation-based development and optimization of agentic systems.

NVIDIA NeMo Microservices

A modular collection of containerized services exposed via intuitive APIs that enables developers to seamlessly integrate NeMo into existing platforms.

Build high-quality, use-case-specific datasets with fast previews, built-in evaluations, and scalable workflows.
Fine-tune language models with your proprietary data to build domain-specific AI agents (see the request sketch after this list).
Benchmark and monitor model and agent effectiveness with standard and custom metrics, including LLM-as-a-judge.
Build high-accuracy retrieval-augmented generation (RAG) pipelines with open-source models and privacy-preserving data access.
Add safety, policy, and topical control to model responses.
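
Because each microservice is exposed as a plain HTTP API, a customization job can be launched from any HTTP client. The sketch below is a minimal illustration in Python; the host, the /v1/customization/jobs route, and the payload fields are assumptions modeled on typical NeMo Customizer deployments, not a definitive contract, so check your deployment's API reference.

    # Minimal sketch: submit a fine-tuning job to a NeMo Customizer microservice.
    # The base URL, route, and payload fields are illustrative assumptions.
    import requests

    CUSTOMIZER_URL = "http://nemo-customizer.example.com"  # hypothetical host

    job = {
        "config": "meta/llama-3.1-8b-instruct",     # base model to customize (assumed field)
        "dataset": {"name": "support-tickets-v1"},  # previously uploaded dataset (assumed field)
        "hyperparameters": {                        # assumed field names
            "training_type": "sft",
            "finetuning_type": "lora",
            "epochs": 3,
        },
    }

    resp = requests.post(f"{CUSTOMIZER_URL}/v1/customization/jobs", json=job, timeout=30)
    resp.raise_for_status()
    print("job id:", resp.json().get("id"))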

Develop multimodal generative AI models with the open-source NeMo Framework.

NVIDIA NeMo Framework

A modular open-source Python framework for large-scale pretraining, post-training, and reinforcement learning of multimodal generative AI models.

Clean, filter, and prepare multimodal data with a GPU-accelerated Python library.
Align models with a scalable post-training library that integrates Hugging Face and Megatron optimizations.
Evaluate model performance with streamlined deployment, benchmark support, and advanced harnesses.
Train natively with accelerated PyTorch and fine-tune Hugging Face models on Day 0.
Train and fine-tune large models using Megatron-Core parallelism with a PyTorch-native training loop.
Add programmable safety, control, and compliance to LLM and agentic systems.
Configure, execute, and track training or evaluation jobs across local, on-prem, and cloud clusters.
Export and deploy models to production using TensorRT, TensorRT-LLM, vLLM engines, and Triton backends.
Develop vision foundation models with a PyTorch-native training loop powered by both Megatron-Core and PyTorch backends.
Extend LLM capabilities with reference pipelines for synthetic data generation, training, and benchmark evaluation.
Train and deploy speech AI models, including ASR and TTS, with export support to NVIDIA Riva (a minimal transcription sketch follows this list).
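
As a taste of the framework's Python surface, the sketch below loads a pretrained ASR checkpoint and transcribes an audio file. The audio path is a placeholder, and exact signatures vary across NeMo releases, so treat this as a shape rather than a guarantee.

    # Minimal sketch: transcribe speech with a pretrained NeMo ASR model.
    # The audio path is a placeholder; API details vary by release.
    import nemo.collections.asr as nemo_asr

    # Download a pretrained Conformer-CTC checkpoint (published model name).
    asr_model = nemo_asr.models.ASRModel.from_pretrained("stt_en_conformer_ctc_large")

    # Transcribe a local 16 kHz mono WAV file (hypothetical path).
    transcripts = asr_model.transcribe(["sample.wav"])
    print(transcripts[0])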

Monitor and optimize the performance of AI agents and multi-agent systems.

Build, profile, evaluate, and optimize agentic systems with an open-source, framework-agnostic observability toolkit, as illustrated below.
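
The toolkit's core job is tracing and timing every step an agent takes. As a framework-agnostic illustration of that idea (plain Python, not the toolkit's actual API), the sketch below wraps each step of a toy agent loop in a timing span and prints the collected trace.

    # Illustration only: time each step of a toy agent loop with nested spans.
    # This is plain Python, not the NeMo Agent Toolkit API.
    import time
    from contextlib import contextmanager

    trace = []  # collected spans: (name, seconds)

    @contextmanager
    def span(name):
        start = time.perf_counter()
        try:
            yield
        finally:
            trace.append((name, time.perf_counter() - start))

    def plan(query):        # hypothetical stand-in for a planner component
        time.sleep(0.01)
        return ["search", "summarize"]

    def execute(step):      # hypothetical stand-in for a tool call
        time.sleep(0.02)
        return f"result of {step}"

    with span("agent_run"):
        with span("plan"):
            steps = plan("quarterly revenue")
        for s in steps:
            with span(f"tool:{s}"):
                execute(s)

    for name, secs in trace:  # a profiler would aggregate spans like these
        print(f"{name:>12}: {secs * 1000:.1f} ms")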

Reference workflows with code, models, and deployment guides that help developers quickly build and scale AI solutions.

Build a custom deep researcher powered by state-of-the-art models that continuously process and synthesize multimodal enterprise data, enabling reasoning, planning, and refinement to generate comprehensive reports.
Build a data flywheel to continuously optimize AI agents for latency, cost, and accuracy using automated data curation, evaluation, and fine-tuning with NeMo microservices.
Continuously extract, embed, and index multimodal data for fast, accurate semantic search using NeMo Retriever models, as sketched below.
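
NeMo Retriever embedding models are typically served behind an OpenAI-compatible endpoint, so indexing reduces to batched embedding calls. A minimal sketch, assuming a locally hosted embedding NIM on localhost:8000 and the nvidia/nv-embedqa-e5-v5 model; base URL, API key, and model name are deployment-specific.

    # Minimal sketch: embed passages with a NeMo Retriever embedding NIM via
    # its OpenAI-compatible API. Endpoint and model name are assumptions.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used-locally")

    docs = [
        "NeMo is a suite of libraries for building AI agents.",
        "NeMo Retriever models power semantic search pipelines.",
    ]

    resp = client.embeddings.create(
        model="nvidia/nv-embedqa-e5-v5",       # assumed retriever model name
        input=docs,
        extra_body={"input_type": "passage"},  # passage vs. query embeddings
    )
    vectors = [d.embedding for d in resp.data]  # store these in a vector index
    print(len(vectors), "vectors of dimension", len(vectors[0]))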

Deploy and manage AI workloads as scalable, performance-optimized services to seamlessly power enterprise-grade AI agents in production.

Containerized microservices for secure, performant, and reliable deployment of AI models anywhere (see the client sketch after this list).
Kubernetes-native operator for automating deployment, scaling, and lifecycle management of NIM and NeMo microservices.
Reference architectures standardizing hardware, networking, and software to build scalable, secure, and high-performance AI infrastructure for production setups.
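
A deployed NIM exposes an OpenAI-compatible endpoint, so existing client code runs against it unchanged. A minimal sketch, assuming a chat NIM serving meta/llama-3.1-8b-instruct on localhost:8000; the port and model name depend on how the container was launched.

    # Minimal sketch: query a locally deployed NIM through its OpenAI-compatible
    # API. Port and model name are deployment-specific assumptions.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used-locally")

    reply = client.chat.completions.create(
        model="meta/llama-3.1-8b-instruct",  # model served by this NIM
        messages=[{"role": "user", "content": "Summarize NVIDIA NeMo in one line."}],
        max_tokens=64,
    )
    print(reply.choices[0].message.content)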