Orpheus-TTS 3B

Llama-backbone TTS tuned for naturalness and emotion. Multilingual FTs (Spanish / Italian / French / Hindi) released as research artifacts.

License: Apache 2.0 · Context: n/a · Released: March 18, 2025 (multilingual FTs April 10, 2025)

The decision in five lines

The call: Consider — for voice
Best for: voice
Runs on: 23 hardware picks fit (cheapest: Intel Arc B580 12 GB · $249)
Watch out: On tight-VRAM or CPU-only deployments, Kokoro-82M is more than 30× smaller and competitive for narration.
Evidence: Estimated · last verified July 2026

Parameters: 3B (Llama-backbone)
Type: TTS + zero-shot voice clone
Context: n/a
VRAM: ~3 GB (Q4) / ~6–8 GB (FP16)
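
The VRAM figures can be sanity-checked with back-of-the-envelope arithmetic: parameter count times bits per weight. The sketch below is a rough estimate, not the planner's actual method. The 4.5 bits-per-weight value for a 4-bit quant is an assumption (quant formats carry some per-block overhead), and `weight_memory_gb` is a hypothetical helper name.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


PARAMS_B = 3.0  # parameter count listed for this model

fp16_gb = weight_memory_gb(PARAMS_B, 16)   # FP16: 16 bits per weight
q4_gb = weight_memory_gb(PARAMS_B, 4.5)    # ASSUMPTION: ~4.5 bits/weight for a 4-bit quant

print(f"FP16 weights: ~{fp16_gb:.1f} GB")
print(f"Q4 weights:   ~{q4_gb:.1f} GB")
```

These are weight-only numbers. The listed ~3 GB (Q4) and ~6–8 GB (FP16) are higher, presumably because they also budget for the KV cache, the audio decoder, and runtime buffers.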

Where we recommend this

Every tier slot in the planner where this model is a top or alternate pick. Pulled live from planner.js — when the planner refreshes, this table stays current.

VOICE · MID

Orpheus-TTS 3B: Apache 2.0; Llama-3 backbone; ~200 ms streaming latency; 8 English voices + zero-shot clone.

The call

Llama-backbone TTS tuned for naturalness and emotion. Multilingual FTs (Spanish / Italian / French / Hindi) released as research artifacts.
When not to use: Tight-VRAM or CPU-only deployments. Kokoro-82M is more than 30× smaller and competitive for narration, while Orpheus needs a real GPU.

Runner notes

Python via `orpheus-tts` GitHub repo. GGUF quants exist (`QuantFactory/orpheus-3b-0.1-ft-GGUF`) for llama.cpp.
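
Whichever runner is used, a streaming TTS model hands back audio in chunks, and those chunks can be written to disk as they arrive. The sketch below uses only the Python standard library and does not call the `orpheus-tts` package. It assumes 24 kHz, 16-bit, mono PCM output (an assumption to verify against the repo), and `fake_audio_chunks` is a hypothetical stand-in that yields silence in place of real model output.

```python
import io
import wave
from typing import Iterable, Iterator

SAMPLE_RATE_HZ = 24_000   # ASSUMPTION: check the runner's actual output rate
SAMPLE_WIDTH_BYTES = 2    # ASSUMPTION: 16-bit PCM
CHANNELS = 1              # ASSUMPTION: mono


def fake_audio_chunks(n_chunks: int, chunk_ms: int = 100) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS generator: yields silent PCM."""
    frames_per_chunk = SAMPLE_RATE_HZ * chunk_ms // 1000
    for _ in range(n_chunks):
        yield b"\x00" * (frames_per_chunk * SAMPLE_WIDTH_BYTES * CHANNELS)


def write_stream_to_wav(chunks: Iterable[bytes], target) -> float:
    """Write PCM chunks to a WAV target as they arrive; return seconds written."""
    total_frames = 0
    with wave.open(target, "wb") as wf:
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(SAMPLE_WIDTH_BYTES)
        wf.setframerate(SAMPLE_RATE_HZ)
        for chunk in chunks:
            wf.writeframes(chunk)
            total_frames += len(chunk) // (SAMPLE_WIDTH_BYTES * CHANNELS)
    return total_frames / SAMPLE_RATE_HZ


# Write ten 100 ms chunks into an in-memory buffer (a file path works too).
buffer = io.BytesIO()
seconds = write_stream_to_wav(fake_audio_chunks(10), buffer)
print(f"wrote {seconds:.1f} s of audio, {len(buffer.getvalue())} bytes")
```

In real use, the stand-in generator would be replaced by whatever iterator of audio bytes the chosen runner exposes.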

License: Apache 2.0
Released: March 18, 2025 (multilingual FTs April 10, 2025)
Maker: Canopy Labs
Model card: huggingface.co/canopylabs/orpheus-3b-0.1-ft

Hardware that fits

Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first, so the top row is the best-value fit.
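
The fit rule described here (keep picks with enough memory, then sort by price) can be sketched in a few lines. This is an illustration, not the planner's code. `HardwarePick` and `picks_that_fit` are hypothetical names, and only the Intel Arc B580 entry comes from this page; the other two catalog rows are made-up placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwarePick:
    name: str
    memory_gb: float
    price_usd: int


def picks_that_fit(picks, required_gb):
    """Keep picks with enough memory for the model, cheapest first."""
    fitting = [p for p in picks if p.memory_gb >= required_gb]
    return sorted(fitting, key=lambda p: p.price_usd)


CATALOG = [
    HardwarePick("Hypothetical card A", 24, 899),  # placeholder, not real data
    HardwarePick("Intel Arc B580", 12, 249),       # listed on this page
    HardwarePick("Hypothetical card B", 2, 99),    # placeholder, too small to fit
]

# ~3 GB is the Q4 VRAM figure listed for this model.
fits = picks_that_fit(CATALOG, required_gb=3.0)
for pick in fits:
    print(f"{pick.name}: {pick.memory_gb:g} GB, ${pick.price_usd}")
```

With this placeholder catalog, the 2 GB card is filtered out and the Arc B580 sorts to the top as the cheapest fit.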
