One-command vLLM installation for NVIDIA DGX Spark with Blackwell GB10 GPUs (sm_121 architecture)
Updated Oct 28, 2025 · Shell
Serve the home! Inference stack for your NVIDIA DGX Spark, aka the Grace Blackwell AI supercomputer on your desk. Mostly vLLM-based for now, single-Spark only. For the not-so-rich among us.
Local diagnostic CLI for NVIDIA DGX Spark (GB10). Detects power caps, unified memory pressure, thermal risk, Docker/runtime issues, and validates vLLM/Ollama/llama.cpp/SGLang recipes.
Headless remote desktop to your DGX Spark in crystal-clear 4K
Single-file web UI for NVIDIA DGX Spark — pull Ollama models, browse and download from HuggingFace, manage LiteLLM routing, and control SGLang, vLLM, llama.cpp, LocalAI, and ComfyUI. All from one browser tab.
Turn any NVIDIA GPU into a local AI platform. Inference + fine-tuning in your browser. One command to start, automatic clustering.
Operator-grade GPU monitor for NVIDIA GPUs with native GB10 / DGX Spark coherent UMA support — PSI pressure, clock detection, ConnectX-7 network layer
(Experimental) A high-throughput and memory-efficient inference and serving engine for LLMs on DGX Spark / GB10
GPU/CUDA-accelerated voice control stack for Home Assistant. Runs on x86/x64 and ARM64 (including the NVIDIA DGX Spark). 100% Local - No Cloud, No Subscriptions.
SGLang optimizations for NVIDIA Spark (GB10) — SM121 Grace Blackwell
llama.cpp fork optimized for NVIDIA DGX Spark / GB10 (Blackwell, SM 12.1) — TurboQuant weights + KV, NVFP4, DFlash MTP
Enhanced GPU throttle diagnostic for DGX Spark (GB10): NVML direct telemetry, throttle cause decoder, PCIe link monitoring, baseline drift detection, timeline capture.
DGX Spark (GB10/SM121) platform support for Meta's KernelAgent — auto-detect, hardware constraints, safe Triton configs
Pre-built PyTorch wheels and build scripts for NVIDIA DGX Spark (GB10, sm_121, Blackwell, CUDA 13.0, ARM64)
Run GPT-OSS 120B on NVIDIA DGX Spark with vLLM, build an API server, and create a local AI coding assistant
An ARM64 port of Unsloth for fine-tuning on DGX Spark-class hardware.
Solo-built agentic AI ecosystem from Switzerland on a 100W NVIDIA GB10 Blackwell desktop supercomputer. Cognitive robotics (Unitree Go2 + Isaac Sim 5.1 + RL PPO + GR00T N1.7), local-first BI (DuckDB + LLM NL→SQL), and LLM-reasoning EDR cybersecurity. Showcase: articles, technical docs, demo videos.