NVIDIA API Pricing
API token pricing for NVIDIA AI models, including Llama Nemotron and other NVIDIA-hosted models, refreshed hourly. Compare input and output costs via OpenRouter.
12 active models · Data updated hourly
About NVIDIA
NVIDIA hosts and fine-tunes AI models through NVIDIA NIM (NVIDIA Inference Microservices) and makes them available via API. Its offerings include fine-tuned variants of Meta Llama and other open-weight models optimized for NVIDIA hardware. NVIDIA also develops its own models, such as Nemotron, designed for enterprise workloads on NVIDIA GPU infrastructure.
| Model | Released | Context | Input $/1M | Output $/1M | Modalities / Tags |
|---|---|---|---|---|---|
| NVIDIA: Nemotron 3 Ultra | Jun 2026 | 1M | $0.500 | $2.50 | Text |
| NVIDIA: Nemotron 3 Super | Mar 2026 | 1M | $0.090 | $0.450 | Text |
| NVIDIA: Nemotron 3 Nano 30B A3B | Dec 2025 | 262K | $0.050 | $0.200 | Text |
| NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 | Oct 2025 | 131K | $0.100 | $0.400 | Open Source |
| NVIDIA: Nemotron Nano 9B V2 | Sep 2025 | 131K | $0.040 | $0.160 | Text |
| NVIDIA: Nemotron 3.5 Content Safety (free) | Jun 2026 | 128K | Free | Free | Vision, Free |
| NVIDIA: Nemotron 3 Ultra (free) | Jun 2026 | 1M | Free | Free | Free |
| NVIDIA: Nemotron 3 Nano Omni (free) | Apr 2026 | 256K | Free | Free | Vision, Audio, Free |
| NVIDIA: Nemotron 3 Super (free) | Mar 2026 | 1M | Free | Free | Free |
| NVIDIA: Nemotron 3 Nano 30B A3B (free) | Dec 2025 | 256K | Free | Free | Free |
| NVIDIA: Nemotron Nano 12B 2 VL (free) | Oct 2025 | 128K | Free | Free | Vision, Free |
| NVIDIA: Nemotron Nano 9B V2 (free) | Sep 2025 | 128K | Free | Free | Free |
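
To make the per-million-token prices concrete, here is a minimal sketch of how a single request's cost could be estimated from the Input $/1M and Output $/1M columns. The prices are copied from the paid rows of the table and may change; the names `PRICES_PER_MILLION` and `estimate_cost` are illustrative assumptions, not part of any NVIDIA or OpenRouter API.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the paid rows of the table.
# These values are a snapshot and may be out of date.
PRICES_PER_MILLION = {
    "Nemotron 3 Ultra": (Decimal("0.500"), Decimal("2.50")),
    "Nemotron 3 Super": (Decimal("0.090"), Decimal("0.450")),
    "Nemotron 3 Nano 30B A3B": (Decimal("0.050"), Decimal("0.200")),
    "Llama 3.3 Nemotron Super 49B V1.5": (Decimal("0.100"), Decimal("0.400")),
    "Nemotron Nano 9B V2": (Decimal("0.040"), Decimal("0.160")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of one request."""
    input_price, output_price = PRICES_PER_MILLION[model]
    # Prices are quoted per 1,000,000 tokens, so scale the total down.
    total = input_tokens * input_price + output_tokens * output_price
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # 200K input tokens and 50K output tokens on Nemotron 3 Super.
    cost = estimate_cost("Nemotron 3 Super", 200_000, 50_000)
    print(f"${cost:.4f}")  # prints "$0.0405"
```

`Decimal` is used instead of `float` so that small per-token amounts do not pick up binary rounding noise. The variants marked (free) would simply cost zero under this model.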