Skip to content

Add NVIDIA Nemotron 3 Super deployment guide example page#85

Open
wbrennan899 wants to merge 2 commits intomainfrom
examples/nemotron-3
Open

Add NVIDIA Nemotron 3 Super deployment guide example page#85
wbrennan899 wants to merge 2 commits intomainfrom
examples/nemotron-3

Conversation

@wbrennan899
Copy link
Collaborator

Summary

  • Add end-to-end guide for deploying NVIDIA Nemotron 3 Super (120B/12B active MoE) on Vast.ai using SGLang
  • Covers instance search, deployment with FP8 quantization, and querying via OpenAI-compatible API
  • Documents all three reasoning modes (on, off, low-effort) with Python and cURL examples
  • Includes note about SGLang's nano_v3 parser behavior where reasoning-off responses are returned in reasoning_content instead of content

The search command filtered for disk_space>=150 but the hardware
requirements and instance creation both specify 200GB.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant