the AI bench
VERIFIED JUNE 2026
All models

MODEL · SYSTRAN · ~809M (WHISPER LARGE-V3-TURBO DISTILLED)

faster-whisper large-v3-turbo

CTranslate2 reimplementation of OpenAI Whisper — 4× faster with int8 quantization, matches reference accuracy. The practical STT default.

License: MIT (library + weights) · Context: n/a · Released: Turbo support October 2024

The decision in five lines

The call
Skip for local — for voice
Best for
voice
Runs on
23 hardware picks fit (cheapest: Intel Arc B580 12 GB · $249)
Watch out
Speaker diarization or per-word timestamps at scale — WhisperX wraps faster-whisper with alignment + pyannote for that.
Evidence
Estimated · last verified April 2026

~809M (Whisper large-v3-turbo distilled)
PARAMETERS
STT
TYPE
CONTEXT
~1.5–2 GB (int8)
VRAM AT Q4

Where we recommend this

Every tier slot in the planner where this model is a top or alternate pick. Pulled live from planner.js — when the planner refreshes, this table stays current.

VOICE · LOW
faster-whisper large-v3-turbo (int8)MIT; 99 languages; 4× faster than vanilla Whisper; the STT default at this tier.

The call

CTranslate2 reimplementation of OpenAI Whisper — 4× faster with int8 quantization, matches reference accuracy. The practical STT default.

When not to use: Speaker diarization or per-word timestamps at scale — WhisperX wraps faster-whisper with alignment + pyannote for that.

Runner notes

`pip install faster-whisper`. `WhisperModel("large-v3-turbo", compute_type="int8")` for CPU, `"int8_float16"` for GPU. Runs reasonably on ARM / Apple Silicon via CTranslate2.

License
MIT (library + weights)
Released
Turbo support October 2024
Maker
SYSTRAN

Hardware that fits

Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first — the top row is your best-value fit. Click through for the full buyer’s guide.

Next step

Find-by-model — see what hardware runs this