High-performance LLM inference engine in C++/CUDA for NVIDIA Blackwell GeForce / RTX PRO (RTX 5090/5080/5070 Ti, RTX PRO 6000; sm_120). 200 tok/s decode on Qwen3.6-35B-A3B-NVFP4 MoE (RTX 5090).
Updated May 14, 2026 - Cuda
Complete installation guide for ComfyUI-Hunyuan3DWrapper on NVIDIA Blackwell GPUs (RTX 5070 Ti, 5080, 5090). Covers manual compilation of custom_rasterizer for the sm_120 / compute_120 architecture.
Pop!_OS fixes for ASUS ROG Zephyrus G16 (2025) — boot, sleep, NVIDIA, speakers, OEM kernel