I tried quantizing the model: the Qwen3 backbone to GPTQ and the audio tokenizer to FP16.

GPTQ backbone: https://huggingface.co/blazingbhavneek/MOSS-TTS-GPTQ
FP16 tokenizer: https://huggingface.co/blazingbhavneek/MOSS-Audio-Tokenizer-FP16

My repo with the quantization scripts (credit to Claude too :) ): https://github.com/blazingbhavneek/MOSS-TTS

@gaoyang07