the AI bench
VERIFIED JUNE 2026

MODEL · ALIBABA · 27B DENSE

Qwen 3.5 27B

A dense, natively multimodal (text + image + video input) mid-large generalist, and the biggest non-MoE model in the Qwen 3.5 medium line. The best realistic pick for long-context documents on 24 GB of VRAM or a Mac with 48 GB+ of memory.

License: Apache 2.0 · Context: 262K native, extendable to ~1M via YaRN · Released: February 24, 2026

The decision in five lines

The call: Consider (runnable locally, family reference)
Best for: Local evaluation and family reference
Runs on: 16 hardware picks fit (cheapest: Minisforum UM890 Pro · $463)
Watch out: If you need the fastest possible tok/s on 24 GB VRAM, the 35B-A3B MoE sibling runs ~6× faster at comparable quality because only 3B parameters activate per token.
Evidence: Measured · last verified April 2026

Parameters: 27B · Type: Dense · Context: 262K · VRAM at Q4: ~17 GB

Where we recommend this

This model isn’t currently in an active planner slot. See the runner notes below if you’re running it anyway.

The call

A dense, natively multimodal (text + image + video input) mid-large generalist, and the biggest non-MoE model in the Qwen 3.5 medium line. The best realistic pick for long-context documents on 24 GB of VRAM or a Mac with 48 GB+ of memory.

When not to use: When you need the fastest possible tok/s on 24 GB VRAM. The 35B-A3B MoE sibling runs ~6× faster at comparable quality because only 3B parameters activate per token. Prefill is also slow on Mac for prompts of 32K+ tokens.
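For a rough sense of why the MoE sibling decodes so much faster, here is a minimal back-of-envelope sketch. It assumes decode is memory-bandwidth bound (every active weight is read once per generated token), an average of about 4.85 bits per weight for Q4_K_M, and a hypothetical 1,000 GB/s GPU. None of these figures are measurements from this page, and the function names are ours.

```python
# Back-of-envelope decode model: tok/s ceiling = memory bandwidth / bytes read per token.
# Assumption: Q4_K_M averages roughly 4.85 bits per weight (approximate, not from this page).
Q4_K_M_BITS_PER_WEIGHT = 4.85


def weight_bytes(params: float, bits_per_weight: float = Q4_K_M_BITS_PER_WEIGHT) -> float:
    """Bytes occupied by `params` weights at the given quantization."""
    return params * bits_per_weight / 8


def decode_ceiling_tok_s(active_params: float, bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed if reading the active weights is the only cost."""
    return bandwidth_gb_s * 1e9 / weight_bytes(active_params)


DENSE_ACTIVE = 27e9   # dense: all 27B parameters are read for every token
MOE_ACTIVE = 3e9      # 35B-A3B: only ~3B parameters are active per token
BANDWIDTH_GB_S = 1000.0  # hypothetical GPU memory bandwidth

dense_ceiling = decode_ceiling_tok_s(DENSE_ACTIVE, BANDWIDTH_GB_S)
moe_ceiling = decode_ceiling_tok_s(MOE_ACTIVE, BANDWIDTH_GB_S)

print(f"dense weights at Q4_K_M: {weight_bytes(DENSE_ACTIVE) / 1e9:.1f} GB")
print(f"dense ceiling: {dense_ceiling:.0f} tok/s, MoE ceiling: {moe_ceiling:.0f} tok/s")
print(f"theoretical ratio: {moe_ceiling / dense_ceiling:.1f}x")
```

The weight-only ratio is 27B / 3B = 9×, which is an upper bound. The ~6× we quote is lower, likely because attention, KV-cache reads, and expert routing cost the same regardless of how many weights are active. Note that the MoE model still has to hold all 35B parameters in memory; it only reads fewer of them per token.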

Runner notes

Ollama tag: `qwen3.5:27b`. The Q4_K_M quant fits in ~17 GB. MLX support is first-class on Apple Silicon; on CUDA, use vLLM or llama.cpp for long-context work.
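If you are running the Ollama tag above, a minimal sketch of calling it from Python through Ollama's local REST endpoint might look like this. It assumes an Ollama server on the default port with the model already pulled. The `num_ctx` value of 32768 is an illustrative choice, not a recommendation from our testing, and the helper names are ours.

```python
import json
import urllib.request

# Ollama's default local endpoint; change this if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(
    prompt: str,
    model: str = "qwen3.5:27b",   # the Ollama tag from the runner notes
    num_ctx: int = 32768,         # illustrative context size, an assumption
) -> urllib.request.Request:
    """Build a non-streaming /api/generate request for a local Ollama server."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,                  # return one JSON object instead of chunks
        "options": {"num_ctx": num_ctx},  # context window to allocate for this call
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, **kwargs) -> str:
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read())["response"]


# Example (needs a running Ollama server with the model pulled):
# print(generate("Summarize the key points of this contract: ..."))
```

Setting `num_ctx` explicitly matters for long documents, because Ollama's default context window is far smaller than the model's native 262K. A larger value also grows the KV cache, so raise it gradually and watch memory use on a 24 GB card.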

License: Apache 2.0 · Released: February 24, 2026 · Maker: Alibaba

Hardware that fits

Every hardware pick whose memory fits this model at the quant we recommend, sorted cheapest-first: the top row is the best-value fit.
