MODEL · OPENMOSS / MOSI.AI · 100M (0.1B)
MOSS-TTS-Nano (100M)
A 100M streaming-TTS that closes the multilingual gap Kokoro doesn't cover — 20 languages including English, Japanese, Korean, Spanish, French, Arabic, Mandarin, plus voice cloning from a short audio reference. 48 kHz stereo output, neural-audio-tokenizer + autoregressive LLM pipeline, runs real-time on 4 CPU cores. The ONNX build drops PyTorch entirely and gets ~2× the inference efficiency of the original.
License: Apache 2.0 · Context: n/a · Released: April 10, 2026 (PyTorch); April 17, 2026 (ONNX-CPU port)
The decision in five lines
- The call
- Consider — runnable locally, family reference
- Best for
- Local evaluation and family reference
- Runs on
- 23 hardware picks fit (cheapest: Intel Arc B580 12 GB · $249)
- Watch out
- English-only narration where you don't need cloning — Kokoro-82M is the proven default, ranks #1 on TTS Arena, and is even smaller.
- Evidence
- Estimated
- 100M (0.1B)
- PARAMETERS
- TTS + MULTILINGUAL VOICE CLONE
- TYPE
- —
- CONTEXT
- <400 MB (CPU-only, 4 cores enough)
- VRAM AT Q4
Where we recommend this
This model isn’t currently in an active planner slot. See the runner notes below if you’re running it anyway.
The call
A 100M streaming-TTS that closes the multilingual gap Kokoro doesn't cover — 20 languages including English, Japanese, Korean, Spanish, French, Arabic, Mandarin, plus voice cloning from a short audio reference. 48 kHz stereo output, neural-audio-tokenizer + autoregressive LLM pipeline, runs real-time on 4 CPU cores. The ONNX build drops PyTorch entirely and gets ~2× the inference efficiency of the original.
When not to use: English-only narration where you don't need cloning — Kokoro-82M is the proven default, ranks #1 on TTS Arena, and is even smaller. Use MOSS-TTS-Nano when you actually need multilingual coverage or voice cloning on hardware too small for Chatterbox/VoxCPM2.
Runner notes
GitHub `OpenMOSS/MOSS-TTS-Nano` for PyTorch path; HuggingFace `OpenMOSS-Team/MOSS-TTS-Nano-100M-ONNX` for the CPU-friendly route. Companion `MOSS-Audio-Tokenizer-Nano-ONNX` handles the audio tokenizer. No Ollama path (non-LM). Sibling MOSS-TTSD-v0.5 (2B, ZH/EN dialogue) covers multi-speaker if you outgrow Nano.
Hardware that fits
Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first — the top row is your best-value fit. Click through for the full buyer’s guide.
- Intel Arc B580 12 GBPerfect · 8.5× 12 GB · $249–$299
- NVIDIA RTX 3060 12 GBPerfect · 8.5× 12 GB · $280–$400
- Minisforum UM890 ProPerfect · 17.1× 32 GB DDR5 (shared) · $463–$580 all-in
- RTX 5060 Ti 16 GBPerfect · 11.4× 16 GB · $560–$610
- AMD Radeon RX 9070 XTPerfect · 11.4× 16 GB · $649–$779
- Mac Mini M4 16 GBPerfect · 7.6× 16 GB unified · $799 (new floor) / $499–$599 (eBay/residuals)
- AMD Radeon RX 7900 XTXPerfect · 17.1× 24 GB · $810 used / ~$1,340 new
- NVIDIA RTX 3090 (used, single)Perfect · 17.1× 24 GB · $950–$1,200
- NVIDIA RTX 5070 TiPerfect · 11.4× 16 GB · $980–$1,300
- NVIDIA RTX 5080Perfect · 11.4× 16 GB · $999–$1,400
- MacBook Air M5 24 GBPerfect · 11.4× 24 GB unified · $1,299–$1,699
- Mac Mini M4 Pro 24 GBPerfect · 11.4× 24 GB unified · $1,399
- Dual RTX 3090 (used)Perfect · 34.2× 48 GB · $1,800–$2,500 all-in
- Framework Desktop (Ryzen AI Max+ 395)Perfect · 61.0× 128 GB unified · $1,999–$2,851
- NVIDIA RTX 4090Perfect · 17.1× 24 GB · $2,200–$2,800
- M5 Pro MacBook Pro 48 GBPerfect · 22.9× 48 GB unified · $2,599–$3,099
- Mac Studio M4 Max 64 GBPerfect · 30.5× 64 GB unified · $3,199
- NVIDIA RTX 5090Perfect · 22.8× 32 GB · $3,500–$4,300
- NVIDIA RTX A6000 (48 GB, used)Perfect · 34.2× 48 GB ECC · $3,500–$4,500
- Mac Studio M3 Ultra 96 GBPerfect · 45.8× 96 GB unified · $3,999
- M5 Max MacBook Pro 64 GBPerfect · 30.5× 64 GB unified · $4,499
- NVIDIA DGX SparkPerfect · 61.0× 128 GB unified · $4,699
- Dual RTX 5090Perfect · 45.5× 64 GB (2×32) · $8,500–$10,500
Next step
Find-by-model — see what hardware runs this→