MOSS-TTS-Nano (100M)

A 100M-parameter streaming TTS model that closes the multilingual gap Kokoro doesn't cover: 20 languages including English, Japanese, Korean, Spanish, French, Arabic, and Mandarin, plus voice cloning from a short audio reference. It outputs 48 kHz stereo through a neural audio tokenizer + autoregressive LLM pipeline and runs in real time on 4 CPU cores. The ONNX build drops PyTorch entirely and gets ~2× the inference efficiency of the original.

License: Apache 2.0 · Context: n/a · Released: April 10, 2026 (PyTorch); April 17, 2026 (ONNX-CPU port)

The decision in five lines

The call: Consider — runnable locally, family reference
Best for: Local evaluation and family reference
Runs on: 23 hardware picks fit (cheapest: Intel Arc B580 12 GB · $249)
Watch out: English-only narration where you don't need cloning — Kokoro-82M is the proven default, ranks #1 on TTS Arena, and is even smaller.
Evidence: Estimated · last verified July 2026

Parameters: 100M (0.1B)
Type: TTS + multilingual voice clone
Context: n/a
VRAM at Q4: <400 MB (CPU-only, 4 cores enough)

Where we recommend this

This model isn’t currently in an active planner slot. See the runner notes below if you’re running it anyway.

The call

When not to use: English-only narration where you don't need cloning — Kokoro-82M is the proven default, ranks #1 on TTS Arena, and is even smaller. Use MOSS-TTS-Nano when you actually need multilingual coverage or voice cloning on hardware too small for Chatterbox/VoxCPM2.
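One way to check the "real-time on 4 CPU cores" claim on your own machine is to measure the real-time factor (RTF): wall-clock synthesis time divided by seconds of audio produced. The sketch below is a generic timing harness, not MOSS-TTS-Nano's API. `synthesize` is a hypothetical callable you would wrap around whatever inference call you use, and `fake_synthesize` is only a stand-in so the example runs without a model.

```python
import time
from typing import Callable, Sequence

SAMPLE_RATE_HZ = 48_000  # output sample rate quoted on this page


def real_time_factor(
    synthesize: Callable[[str], Sequence[float]],
    text: str,
    sample_rate_hz: int = SAMPLE_RATE_HZ,
) -> float:
    """Return wall-clock synthesis time divided by seconds of audio produced.

    `synthesize` is a hypothetical callable that returns the samples of ONE
    channel; for stereo output, pass a wrapper that returns a single channel.
    """
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start

    audio_seconds = len(samples) / sample_rate_hz
    if audio_seconds == 0:
        raise ValueError("synthesize() returned no audio")
    return elapsed / audio_seconds


def fake_synthesize(text: str) -> list[float]:
    # Stand-in for a real model call: one second of silence per request.
    return [0.0] * SAMPLE_RATE_HZ


if __name__ == "__main__":
    rtf = real_time_factor(fake_synthesize, "hello")
    print(f"RTF = {rtf:.4f} (below 1.0 means faster than real time)")
```

An RTF below 1.0 means audio is generated faster than it plays back. For a streaming use case, time-to-first-chunk matters as well, and this harness does not measure it.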

Runner notes

PyTorch path: GitHub `OpenMOSS/MOSS-TTS-Nano`. CPU-friendly path: Hugging Face `OpenMOSS-Team/MOSS-TTS-Nano-100M-ONNX`, with the companion `MOSS-Audio-Tokenizer-Nano-ONNX` handling the audio tokenizer. There is no Ollama path, since this is not a text language model. If you outgrow Nano, the sibling MOSS-TTSD-v0.5 (2B, ZH/EN dialogue) covers multi-speaker output. Step-up: `OpenMOSS-Team/MOSS-TTS-v1.5` (8B, Apache 2.0, May 2026) is the flagship voice-cloning tier. It is community-reported to beat Fish Audio S2 Pro and Qwen3-TTS on English cloning, and runs at ~11 GB at Q4 on a 24 GB GPU. Use v1.5 when you have the VRAM and want top quality; use Nano for CPU and edge deployments.
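For the ONNX route, a minimal sketch of fetching the model files with the `huggingface_hub` package (`pip install huggingface_hub`) might look like this. The repo id is the one named in the notes above. The helper name `fetch_repo` and the target directory are illustrative assumptions. The companion audio tokenizer would be fetched the same way using its full repo id from the model card, which these notes do not spell out.

```python
# Repo id taken from the runner notes; everything else here is illustrative.
MODEL_REPO = "OpenMOSS-Team/MOSS-TTS-Nano-100M-ONNX"


def fetch_repo(repo_id: str, target_dir: str) -> str:
    """Download all files of a Hugging Face repo and return the local path.

    `fetch_repo` is a hypothetical helper name, not part of any MOSS tooling.
    """
    # Imported lazily so this module loads even if huggingface_hub is missing.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=target_dir)
```

Calling `fetch_repo(MODEL_REPO, "./moss-tts-nano-onnx")` downloads the repo into that folder. How the downloaded files are then loaded and run is defined by the model card, not by this sketch.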

License: Apache 2.0
Released: April 10, 2026 (PyTorch); April 17, 2026 (ONNX-CPU port)
Maker: OpenMOSS / MOSI.AI
Model card: huggingface.co/OpenMOSS-Team/MOSS-TTS-Nano-100M

Hardware that fits

Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first — the top row is your best-value fit. Click through for the full buyer’s guide.
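As a rough illustration of what "memory fits" means for a model this small, weight memory can be estimated as parameter count times bytes per weight. This is a back-of-envelope sketch only. The byte widths are generic assumptions, not confirmed details of any MOSS-TTS-Nano build, and the estimate ignores the audio tokenizer, activations, and runtime overhead.

```python
# Back-of-envelope weight memory: parameters x bytes per weight.
# Byte widths are generic assumptions, not confirmed details of any MOSS build.
BYTES_PER_WEIGHT = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "q4": 0.5}


def weight_megabytes(num_params: int, precision: str) -> float:
    """Return approximate weight size in MB (1 MB = 1,000,000 bytes)."""
    return num_params * BYTES_PER_WEIGHT[precision] / 1_000_000


if __name__ == "__main__":
    params = 100_000_000  # 100M, the parameter count listed on this page
    for precision in BYTES_PER_WEIGHT:
        print(f"{precision}: ~{weight_megabytes(params, precision):.0f} MB")
```

Under these assumptions the weights alone come to about 400 MB at fp32 and about 50 MB at 4-bit, so fit is rarely the constraint here. Sustained CPU throughput is the more likely limit.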
