the AI bench
VERIFIED JULY 2026

MODEL · QWEN (ALIBABA) · 1.7B AND 0.6B (BUILT ON THE QWEN3-OMNI AUDIO STACK)

Qwen3-ASR (1.7B / 0.6B)

Qwen’s first dedicated open-weight ASR family — language identification plus speech recognition across 52 languages and dialects (30 languages + 22 Chinese dialects), built on the Qwen3-Omni audio foundation. Qwen claims the 1.7B is state-of-the-art among open-source ASR and competitive with the strongest proprietary commercial APIs. Apache 2.0, transformers-native, and small enough to run on CPU or any consumer GPU.

License: Apache 2.0 · Context: n/a · Released: June 26, 2026

The decision in five lines

The call: Consider (runnable locally, family reference)
Best for: Local evaluation and family reference
Runs on: 23 hardware picks fit (cheapest: Intel Arc B580 12 GB · $249)
Watch out: Not the pick if you need speaker diarization or per-word timestamps. For those, pair it with WhisperX/pyannote, or stay on the Canary-Qwen pipeline.
Evidence: Estimated · last verified July 2026

Parameters: 1.7B and 0.6B (built on the Qwen3-Omni audio stack)
Type: STT / ASR
Context: n/a
VRAM (fp16): ~4 GB (1.7B) / ~1.5–2 GB (0.6B)
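
The fp16 memory figures above can be sanity-checked with simple arithmetic. The sketch below estimates weight memory only, assuming the nominal parameter counts (1.7B and 0.6B) and standard bytes-per-parameter values; the function name and the precision table are illustrative and not part of any Qwen tooling. Weights alone come out below the figures quoted on this page, and the gap is presumably runtime overhead (activations, audio buffers, framework allocations), which this sketch does not model.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: nominal parameter counts; real checkpoints may differ slightly.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "q4": 0.5}


def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Return the approximate size of the weights in GB (10^9 bytes).

    Billions of parameters times bytes per parameter gives gigabytes directly.
    Runtime overhead is deliberately NOT included.
    """
    if precision not in BYTES_PER_PARAM:
        raise ValueError(f"unknown precision: {precision!r}")
    return params_billions * BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    for name, size in [("1.7B", 1.7), ("0.6B", 0.6)]:
        for precision in ("fp16", "q4"):
            gb = weight_memory_gb(size, precision)
            print(f"{name} @ {precision}: ~{gb:.2f} GB of weights")
```

Treat the result as a lower bound when matching the model to a card: leave headroom on top of the weights for whatever runner you use.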

Where we recommend this

This model isn’t currently in an active planner slot. See the runner notes below if you’re running it anyway.

The call

When not to use: when you need speaker diarization or per-word timestamps. For those, pair it with WhisperX/pyannote, or stay on the Canary-Qwen pipeline. This is recognition only, not TTS.

Runner notes

Runs natively in transformers (`Qwen/Qwen3-ASR-1.7B-hf` / `Qwen3-ASR-0.6B-hf`) and also under vLLM/SGLang. The weights shipped under Apache 2.0 on release day. Pick the 0.6B for edge/CPU and the 1.7B for accuracy. Measure WER on your own audio before swapping out a production STT pipeline.
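
To check WER on your own audio, you need a reference transcript and the model's output for each clip. The sketch below is a minimal, dependency-free word error rate based on word-level edit distance; the function name and the lowercase-and-split normalization are assumptions for illustration, and it does not call the model itself. Production evaluations usually apply stricter text normalization (punctuation, numerals), so treat these numbers as a rough comparison between candidate models, not a benchmark score.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "the cat sat on mat"  # one word dropped
    print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

Run the same clips through your current STT system and through each Qwen3-ASR size, then compare the averages. For Chinese audio, error rates are usually computed per character, since whitespace splitting does not segment the text into words.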

License: Apache 2.0
Released: June 26, 2026
Maker: Qwen (Alibaba)

Hardware that fits

Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first — the top row is your best-value fit. Click through for the full buyer’s guide.
