the AI bench
VERIFIED JUNE 2026

MODEL · OPENBMB · 2B (DIFFUSION-AUTOREGRESSIVE, TOKENIZER-FREE; MINICPM-4 BACKBONE)

VoxCPM2 (2B)

Apache 2.0 TTS with 48 kHz output, short-clip zero-shot voice cloning, and natural-language "voice design" (describe a voice, get one — no reference audio required) across 30 languages.

License: Apache 2.0 · Context: n/a · Released: April 2026

The decision in five lines

The call: Buy, for voice
Best for: voice
Runs on: 23 hardware picks fit (cheapest: Intel Arc B580 12 GB · $249)
Watch out: Strictly English narration on minimal hardware. Kokoro-82M is 25× smaller and equally good for that case.
Evidence: Estimated · last verified June 2026

Parameters: 2B (diffusion-autoregressive, tokenizer-free; MiniCPM-4 backbone)
Type: TTS + voice clone + voice design
Context: n/a
VRAM at Q4: ~6–8 GB

Where we recommend this

Every tier slot in the planner where this model is a top or alternate pick. Pulled live from planner.js — when the planner refreshes, this table stays current.

VOICE · HIGH
VoxCPM2 (2B, Apache 2.0): 30 languages, 48 kHz, tokenizer-free diffusion AR; voice design from text. April 2026 release.

Runner notes

The GitHub repo `OpenBMB/VoxCPM` (single repo, 18.9k stars) covers all three family members. There is no llama.cpp or Ollama route yet (non-LM architecture). Step-down options: `VoxCPM1.5` (0.6B, 44.1 kHz, January 2026) for mid-VRAM setups; `VoxCPM-0.5B` (0.5B, 16 kHz, EN/ZH only, September 2025) for low-VRAM setups. Primary references: the Hugging Face model card, the GitHub repo, and the technical report (arXiv 2509.24650).
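The step-down logic above can be sketched as a small lookup. This is an illustrative sketch, not part of the VoxCPM tooling: the `FamilyMember` class and `smallest_member` function are hypothetical names, and using parameter count as a stand-in for VRAM demand is an assumption. Language coverage is left out because this page states it only for VoxCPM2 and `VoxCPM-0.5B`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FamilyMember:
    """One VoxCPM family member, using only figures quoted on this page."""
    name: str
    params_b: float        # parameter count, in billions
    sample_rate_hz: int    # output sample rate
    released: str


FAMILY = [
    FamilyMember("VoxCPM2", 2.0, 48_000, "April 2026"),
    FamilyMember("VoxCPM1.5", 0.6, 44_100, "January 2026"),
    FamilyMember("VoxCPM-0.5B", 0.5, 16_000, "September 2025"),
]


def smallest_member(min_sample_rate_hz: int) -> Optional[FamilyMember]:
    """Return the smallest model (by parameters) that meets the sample rate.

    Assumption: fewer parameters means less VRAM. Returns None if no
    family member reaches the requested rate.
    """
    candidates = [m for m in FAMILY if m.sample_rate_hz >= min_sample_rate_hz]
    return min(candidates, key=lambda m: m.params_b, default=None)


# Example: the lightest option that still produces at least 44.1 kHz audio.
choice = smallest_member(44_100)
print(choice.name if choice else "no family member fits")
```

Asking for 48 kHz leaves only VoxCPM2, while a 16 kHz requirement falls through to `VoxCPM-0.5B`, which mirrors the step-down order given above.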

License: Apache 2.0 · Released: April 2026 · Maker: OpenBMB

Hardware that fits

Every hardware pick whose memory fits this model at the quant we recommend, sorted cheapest-first, so the top row is the best-value fit. 23 picks currently fit; the cheapest is the Intel Arc B580 12 GB at $249.
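The fit-and-sort rule described here can be sketched in a few lines. This is a minimal illustration, not the site's planner code: `HardwarePick` and `picks_that_fit` are hypothetical names, the two placeholder catalog entries are invented purely to exercise the filter, and taking the upper end of the ~6–8 GB estimate as the memory floor is an assumption. Only the Intel Arc B580 figures come from this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HardwarePick:
    name: str
    memory_gb: float
    price_usd: int


# Assumption: use the top of the page's "~6-8 GB at Q4" estimate as the floor.
REQUIRED_GB = 8.0


def picks_that_fit(picks: List[HardwarePick],
                   required_gb: float = REQUIRED_GB) -> List[HardwarePick]:
    """Keep picks with enough memory, then sort cheapest-first."""
    fitting = [p for p in picks if p.memory_gb >= required_gb]
    # Name is a tie-breaker so equal prices sort deterministically.
    return sorted(fitting, key=lambda p: (p.price_usd, p.name))


CATALOG = [
    HardwarePick("Placeholder GPU A (hypothetical)", 24.0, 899),
    HardwarePick("Intel Arc B580", 12.0, 249),                    # from this page
    HardwarePick("Placeholder GPU B (hypothetical)", 6.0, 179),   # below the floor
]

for pick in picks_that_fit(CATALOG):
    print(f"{pick.name}: {pick.memory_gb:g} GB, ${pick.price_usd}")
```

With this sample catalog the 6 GB placeholder is filtered out despite being cheapest, so the Intel Arc B580 lands in the top row.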
