Qwen 3.5 27B

A dense, natively multimodal (text, image, and video input) mid-large generalist, and the largest non-MoE model in the Qwen 3.5 medium line. Best realistic pick for long-context document work on 24 GB of VRAM or a Mac with 48 GB+ of memory.

License: Apache 2.0 · Context: 262K native, extendable to ~1M via YaRN · Released: February 24, 2026

The decision in five lines

The call: Consider — runnable locally, family reference
Best for: Local evaluation and family reference
Runs on: 16 hardware picks fit (cheapest: Minisforum UM890 Pro · $463)
Watch out: If you need the fastest possible tok/s on 24 GB VRAM, the 35B-A3B MoE sibling runs ~6× faster at comparable quality because only 3B parameters activate per token.
Evidence: Estimated · last verified July 2026

Parameters: 27B dense
Type: Dense
Context: 262K
VRAM at Q4: ~17 GB
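
The ~17 GB figure can be sanity-checked with back-of-envelope arithmetic. The sketch below is an estimate, not a measurement: it assumes Q4_K_M averages roughly 4.85 bits per weight (a commonly quoted ballpark, not a number from this page), and the helper name `weight_footprint_gb` is made up for illustration.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in decimal gigabytes."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Assumption: Q4_K_M mixes block types and averages about 4.85 bits per weight.
Q4_K_M_BITS = 4.85

weights_gb = weight_footprint_gb(27, Q4_K_M_BITS)
print(f"weights only: ~{weights_gb:.1f} GB")
```

Under that assumption the weights alone come to a little over 16 GB. The KV cache and runtime buffers sit on top of that and grow with context length, which is consistent with the ~17 GB quoted here and with why 24 GB cards are the practical floor for long prompts.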

Where we recommend this

This model isn’t currently in an active planner slot. See the runner notes below if you’re running it anyway.

The call

When not to use: If you need the fastest possible tok/s on 24 GB VRAM, pick the 35B-A3B MoE sibling instead; it runs ~6× faster at comparable quality because only 3B parameters activate per token. Prefill is also slow on Mac for prompts of 32K+ tokens.
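
To see why the active-parameter count dominates decode speed, here is a rough sketch that treats token generation as memory-bandwidth-bound: each new token requires reading every active weight once. The bandwidth figure, the 4.85 bits-per-weight average, and the function name `decode_ceiling_tok_s` are all assumptions for illustration, not values from this page.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float,
                         active_params_billion: float,
                         bits_per_weight: float = 4.85) -> float:
    """Upper bound on decode tok/s if every active weight is read once per token."""
    gb_read_per_token = active_params_billion * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


BANDWIDTH_GB_S = 1000.0  # hypothetical GPU memory bandwidth, GB/s

dense = decode_ceiling_tok_s(BANDWIDTH_GB_S, 27)  # all 27B weights active
moe = decode_ceiling_tok_s(BANDWIDTH_GB_S, 3)     # only ~3B active per token
ratio = moe / dense

print(f"dense ceiling: {dense:.0f} tok/s")
print(f"MoE ceiling:   {moe:.0f} tok/s")
print(f"ceiling ratio: {ratio:.1f}x")
```

The ceiling ratio is simply 27 / 3 = 9×, whatever bandwidth you plug in. The ~6× quoted above is lower, plausibly because routing, attention, and KV-cache reads do not shrink with the active-parameter count.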

Runner notes

The Ollama tag is `qwen3.5:27b`. The Q4_K_M quant needs ~17 GB. MLX support is first-class on Apple Silicon; on CUDA, use vLLM or llama.cpp for long-context work.
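
For a quick local smoke test, a minimal sketch against Ollama's local HTTP API might look like the following. It assumes an Ollama server is already running on its default port (11434) and that the tag above has been pulled; the function names and the `num_ctx` value of 8192 are illustrative choices, not recommendations from this page.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "qwen3.5:27b"  # tag from the runner notes


def build_request(prompt: str, num_ctx: int = 8192) -> urllib.request.Request:
    """Build a non-streaming generate request; num_ctx sets the context window."""
    payload = {
        "model": MODEL_TAG,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, num_ctx: int = 8192) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, num_ctx)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Summarize this paragraph: ...")` returns the completion as a string. Raising `num_ctx` grows the KV cache, so on a 24 GB card it is worth stepping it up gradually rather than jumping straight to the full 262K window.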

Maker: Alibaba
Model card: huggingface.co/Qwen/Qwen3.5-27B

Hardware that fits

Every hardware pick whose memory fits this model at the recommended quant, sorted cheapest-first so the top row is the best-value fit.
