the AI bench
VERIFIED JULY 2026

MODEL · ALIBABA (QWEN) · 0.6B / 4B / 8B

Qwen3-Embedding (0.6B / 4B / 8B)

Qwen's embedding family — the 8B ranks #1 overall on MTEB Multilingual as of 2026, making the line the current best-quality open retrieval pick and displacing BGE-M3 at the top. Apache 2.0, three sizes so you can trade quality for footprint, with a matching Qwen3-Reranker family for two-stage retrieval.

License: Apache 2.0 · Context: 32K · Released: June 2025

The decision in five lines

The call
Consider — for docs
Best for
docs
Runs on
23 hardware picks fit (cheapest: Intel Arc B580 12 GB · $249)
Watch out
Ultra-cheap or CPU-only retrieval at scale, or maximum language breadth per byte — BGE-M3 (568M, 170+ languages) or nomic-embed stay lighter.
Evidence
Estimated · last verified July 2026

0.6B / 4B / 8B
PARAMETERS
EMBEDDING
TYPE
32K
CONTEXT
~1 GB (0.6B) / ~4 GB (4B) / ~8 GB (8B) at 8-bit (about 1 byte per parameter, weights only)
VRAM (ESTIMATED)
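
The VRAM figures above appear to be weights-only estimates, and the arithmetic behind them is parameter count times bytes per parameter. A minimal sketch, assuming nominal sizes of 0.6, 4, and 8 billion parameters (real checkpoints differ slightly) and idealized byte widths; `weight_gb`, `estimate_table`, and both lookup tables are illustrative names, not part of any library.

```python
# Rough weights-only memory estimate: parameters x bytes per parameter.
# Byte widths are idealized assumptions; real quantized formats carry
# extra overhead (scales, zero points), so treat results as lower bounds.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "q4": 0.5}

# Nominal sizes in billions of parameters (assumed from the model names).
NOMINAL_SIZES_B = {"0.6B": 0.6, "4B": 4.0, "8B": 8.0}


def weight_gb(params_billion: float, precision: str) -> float:
    """Return approximate weight memory in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * BYTES_PER_PARAM[precision]
    return total_bytes / 1e9


def estimate_table() -> dict:
    """Map each nominal size to its estimate at every precision."""
    return {
        size: {p: round(weight_gb(n, p), 2) for p in BYTES_PER_PARAM}
        for size, n in NOMINAL_SIZES_B.items()
    }


if __name__ == "__main__":
    for size, row in estimate_table().items():
        print(size, row)
```

Under these assumptions fp16 weights alone come to about 1.2, 8, and 16 GB, while the ~1 / ~4 / ~8 GB figures line up with roughly one byte per parameter. Activations and long inputs near the 32K context add memory on top of the weights.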

Where we recommend this

Every tier slot in the planner where this model is a top or alternate pick. Pulled live from planner.js — when the planner refreshes, this table stays current.

DOCS · MID
Qwen3-Embedding-8B (Apache 2.0): #1 overall on MTEB Multilingual as of 2026; the current best-quality open retrieval pick. Use BGE-M3 (568M) instead when you need cheap, broad multilingual coverage or CPU-only inference.

The call

Qwen's embedding family — the 8B ranks #1 overall on MTEB Multilingual as of 2026, making the line the current best-quality open retrieval pick and displacing BGE-M3 at the top. Apache 2.0, three sizes so you can trade quality for footprint, with a matching Qwen3-Reranker family for two-stage retrieval.

When not to use: Ultra-cheap or CPU-only retrieval at scale, or maximum language breadth per byte — BGE-M3 (568M, 170+ languages) or nomic-embed stay lighter. Choose by whether you need top MTEB quality (Qwen3-Embedding) or minimum footprint (BGE-M3).
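
Two-stage retrieval, as mentioned above, means a fast embedding pass to shortlist candidates followed by a slower reranker over that shortlist only. A minimal sketch of the control flow, assuming you supply the two model calls yourself: `embed` and `rerank_score` are hypothetical stand-ins for a Qwen3-Embedding encoder and a Qwen3-Reranker scorer, and `two_stage_search` is an illustrative name.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def two_stage_search(
    query: str,
    docs: List[str],
    embed: Callable[[str], Vector],             # stand-in for the embedder
    rerank_score: Callable[[str, str], float],  # stand-in for the reranker
    recall_k: int = 20,
    final_k: int = 5,
) -> List[str]:
    """Shortlist by embedding similarity, then reorder with the reranker."""
    query_vec = embed(query)
    # Stage 1: cheap vector similarity over every document.
    by_similarity = sorted(
        docs, key=lambda d: cosine(query_vec, embed(d)), reverse=True
    )
    candidates = by_similarity[:recall_k]
    # Stage 2: the costlier pairwise scorer runs on the shortlist only.
    reranked = sorted(
        candidates, key=lambda d: rerank_score(query, d), reverse=True
    )
    return reranked[:final_k]
```

In a real pipeline the document vectors would be computed once and stored in an index rather than re-embedded per query; the sketch skips that to keep the two stages visible.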

Runner notes

sentence-transformers / FlagEmbedding / vLLM. `Qwen/Qwen3-Embedding-8B` for top quality, `-4B` / `-0.6B` for lighter rigs. Pair with `Qwen/Qwen3-Reranker-*` for a rerank stage. Instruction-aware — prefix queries with a task instruction for best results.
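
A minimal sketch of the instruction prefix with sentence-transformers. The `Instruct: ...\nQuery:` template and the example task string are assumptions based on our reading of the Qwen model card, so check the card for your checkpoint before relying on them; `build_query_prompt`, `format_query`, and `embed_queries` are illustrative helper names. The library import is deferred so the helpers load without it installed.

```python
from typing import List

# Example task description (assumed; swap in one that matches your use case).
DEFAULT_TASK = (
    "Given a web search query, retrieve relevant passages that answer the query"
)


def build_query_prompt(task: str = DEFAULT_TASK) -> str:
    """Prefix placed before each query (assumed template, see lead-in)."""
    return f"Instruct: {task}\nQuery:"


def format_query(query: str, task: str = DEFAULT_TASK) -> str:
    """Full text the model would see for one query."""
    return build_query_prompt(task) + query


def embed_queries(
    queries: List[str],
    task: str = DEFAULT_TASK,
    model_name: str = "Qwen/Qwen3-Embedding-0.6B",
):
    """Encode queries with the instruction prefix; downloads the model."""
    from sentence_transformers import SentenceTransformer  # deferred import

    model = SentenceTransformer(model_name)
    # `prompt` is prepended to every input text by encode().
    return model.encode(
        queries, prompt=build_query_prompt(task), normalize_embeddings=True
    )
```

Only queries get the prefix in this sketch; documents would go through `encode` without a prompt, so both sides of the comparison stay consistent with how the index was built.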

License
Apache 2.0
Released
June 2025
Maker
Alibaba (Qwen)

Hardware that fits

Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first — the top row is your best-value fit. Click through for the full buyer’s guide.
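
The fit rule described above is a filter on memory followed by a sort on price. A minimal sketch, assuming each pick carries a memory size and a price; `HardwarePick` and `picks_that_fit` are illustrative names, and apart from the Intel Arc B580 entry taken from this page, the sample rows in the usage note are hypothetical placeholders rather than real listings.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HardwarePick:
    name: str
    memory_gb: float
    price_usd: float


def picks_that_fit(
    picks: List[HardwarePick],
    required_gb: float,
    headroom_gb: float = 0.0,  # assumed knob for activations and runtime overhead
) -> List[HardwarePick]:
    """Keep picks with enough memory, cheapest first."""
    fitting = [p for p in picks if p.memory_gb >= required_gb + headroom_gb]
    return sorted(fitting, key=lambda p: p.price_usd)


if __name__ == "__main__":
    sample = [
        HardwarePick("Hypothetical 24 GB card", 24.0, 900.0),  # placeholder
        HardwarePick("Intel Arc B580 12 GB", 12.0, 249.0),     # from this page
        HardwarePick("Hypothetical 6 GB card", 6.0, 150.0),    # placeholder
    ]
    for pick in picks_that_fit(sample, required_gb=8.0):
        print(pick.name, pick.price_usd)
```

With an 8 GB requirement the 6 GB placeholder drops out and the remaining picks come back ordered by price, which is the cheapest-first ordering the table describes.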
