WhisperX + pyannote 3.1

The canonical open-source pipeline for diarized transcription — wraps faster-whisper for ASR, wav2vec2 for alignment, pyannote for speaker segmentation.

License: WhisperX: BSD-4 · pyannote 3.1: CC-BY-4.0 (gated, HF token required) · Context: n/a · Released: WhisperX 3.x active 2024–2025 · pyannote 3.1 released November 2023 (pyannote 4.0 + community-1 shipped early 2026 as the newer iteration)

The decision in five lines

The call: Buy — for voice
Best for: voice
Runs on: 23 hardware picks fit (cheapest: Intel Arc B580 12 GB · $249)
Watch out: Real-time transcription — WhisperX batch-processes; latency is seconds, not milliseconds.
Evidence: Estimated · last verified July 2026

~1.5B (Whisper large-v3): PARAMETERS
STT + SPEAKER DIARIZATION: TYPE
—: CONTEXT
~6 GB with large-v3: VRAM AT Q4

Where we recommend this

Every tier slot in the planner where this model is a top or alternate pick. Pulled live from planner.js — when the planner refreshes, this table stays current.

VOICE · HIGH

WhisperX + pyannote 3.1Whisper large-v3 alignment + speaker diarization in one pipeline; the only pick for multi-speaker transcripts.

The call

The canonical open-source pipeline for diarized transcription — wraps faster-whisper for ASR, wav2vec2 for alignment, pyannote for speaker segmentation.
When not to use: Real-time transcription — WhisperX batch-processes; latency is seconds, not milliseconds. For live use Parakeet or faster-whisper.

Runner notes

`pip install whisperx`. Requires HuggingFace token + explicit user-agreement acceptance on the pyannote Segmentation + Speaker-Diarization-3.1 gated repos before first run.

License: WhisperX: BSD-4 · pyannote 3.1: CC-BY-4.0 (gated, HF token required)
Released: WhisperX 3.x active 2024–2025 · pyannote 3.1 released November 2023 (pyannote 4.0 + community-1 shipped early 2026 as the newer iteration)
Maker: m-bain (WhisperX) · pyannote (pipeline)

Hardware that fits

Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first — the top row is your best-value fit. Click through for the full buyer’s guide.

Next step

Find-by-model — see what hardware runs this→