Qwen 3.6-27B — a dense Q4 model that claims to beat the prior 397B MoE flagship

A week after Qwen 3.6-35B-A3B, Alibaba shipped the dense 27B. Apache 2.0, 262K native context, multimodal, ~17 GB at Q4. The community claim worth weighing: it beats the prior generation's 397B MoE on coding while staying single-card.

Verdict: Dense 27B that supersedes 3.5-27B; the new single-card top pick

The take

Qwen 3.6-27B landed on Hugging Face on April 22, a clean dense follow-up to the 35B-A3B MoE released April 16. It ships under Apache 2.0 with a 262K native context (extensible to ~1M via YaRN) and full multimodal support. Its Q4 footprint is the same ~17 GB as the 35B-A3B, but every parameter is active on each token rather than a routed subset of experts.
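To make the "262K native, ~1M via YaRN" figure concrete, here is a minimal sketch of the scaling arithmetic. It assumes "262K" means 262,144 tokens (256 × 1024) and that the config keys match the rope_scaling block used in earlier Qwen releases; neither has been confirmed for Qwen 3.6, and the helper names are illustrative only.

```python
import math

# Assumption: "262K native context" means 256 * 1024 = 262,144 tokens.
NATIVE_CONTEXT = 262_144


def yarn_factor(target_tokens: int, native_tokens: int = NATIVE_CONTEXT) -> float:
    """Smallest whole-number scaling factor whose window covers target_tokens."""
    return float(math.ceil(target_tokens / native_tokens))


def rope_scaling_config(target_tokens: int) -> dict:
    """Build a rope_scaling-style dict for the requested context length.

    Key names follow earlier Qwen config.json files; whether Qwen 3.6
    uses the same keys is an assumption.
    """
    return {
        "rope_type": "yarn",
        "factor": yarn_factor(target_tokens),
        "original_max_position_embeddings": NATIVE_CONTEXT,
    }


if __name__ == "__main__":
    cfg = rope_scaling_config(1_000_000)
    print(cfg)
    print("extended window:", int(cfg["factor"]) * NATIVE_CONTEXT, "tokens")
```

Under these assumptions a factor of 4 gives 1,048,576 tokens, which is where the "~1M" figure would come from.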

The community claim deserves careful weighing: at Q4, Qwen 3.6-27B reportedly beats the prior generation's 397B MoE on coding benchmarks. That is a non-trivial assertion. The 397B was a hosted-tier model, so a single-card-deployable model matching or beating it would mark a real generational shift in the dense-vs-MoE tradeoff at this scale.

What's different from 3.5-27B: an architecture refresh, native multimodality (in the 3.5 line, vision was limited to the multimodal variants), tool-use stability that the 3.5 line lacked at this size, and slightly faster prefill in early community testing. The context window and VRAM footprint are unchanged, so it is functionally a drop-in upgrade.
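The ~17 GB Q4 footprint can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes an effective rate of about 5 bits per weight for a Q4-class GGUF (4-bit values plus scales and a few higher-precision tensors); that rate is an assumption, not a published figure for this model. The VRAM numbers are the standard specs of the cards listed in this article, and the result ignores KV cache and runtime overhead, which grow with context length.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate quantized weight size in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


# VRAM in GB for the cards this article lists.
CARDS = {
    "NVIDIA RTX 5090": 32,
    "NVIDIA RTX 4090": 24,
    "NVIDIA RTX 3090": 24,
    "AMD Radeon RX 7900 XTX": 24,
}


def headroom_gb(vram_gb: float, weights_gb: float) -> float:
    """VRAM left for KV cache, activations, and runtime overhead."""
    return vram_gb - weights_gb


if __name__ == "__main__":
    # Assumption: ~5 bits per weight effective for a Q4-class quantization.
    weights = weight_gb(27, 5.0)
    print(f"estimated weights: {weights:.1f} GB")
    for card, vram in CARDS.items():
        print(f"{card}: {headroom_gb(vram, weights):.1f} GB headroom")
```

With these inputs the estimate comes to roughly 16.9 GB, in line with the ~17 GB quoted, leaving about 7 GB of headroom on the 24 GB cards before any long-context KV cache is allocated.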

Practical move: if you're running Qwen 3.5 27B, swap to 3.6-27B today via the unsloth or bartowski GGUFs. Native Ollama tags should land within 1–2 weeks. If you're on Qwen 3.5 35B-A3B for chat, the MoE sibling Qwen 3.6-35B-A3B is probably the right swap; if you're on the dense 27B for long-context docs work, this is the upgrade. Both 3.6 models already lead this site's chat.top and docs.top picks as of April 28.
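For readers swapping in a GGUF by hand, one way to serve it is llama.cpp's llama-server. The article does not name a runtime, so this is an assumption. The sketch only builds the command line; the GGUF filename is hypothetical, so substitute whatever file you actually downloaded, and size the context to what your card's headroom allows.

```python
import shlex
from pathlib import Path


def llama_server_command(
    gguf_path: Path,
    ctx_size: int = 32_768,
    gpu_layers: int = 99,
    port: int = 8080,
) -> list[str]:
    """Build a llama-server invocation for a local GGUF file.

    -m    path to the model file
    -c    context size in tokens
    -ngl  number of layers to offload to the GPU (a large value offloads all)
    """
    return [
        "llama-server",
        "-m", str(gguf_path),
        "-c", str(ctx_size),
        "-ngl", str(gpu_layers),
        "--port", str(port),
    ]


if __name__ == "__main__":
    # Hypothetical filename; use the name of the GGUF you downloaded.
    cmd = llama_server_command(Path("Qwen3.6-27B-Q4_K_M.gguf"))
    print(shlex.join(cmd))
```

Starting with a modest context and raising it once you have confirmed VRAM usage is the safer order, since the KV cache grows with the context size.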

Where this fits

Models: Qwen 3.6-27B · Qwen 3.6-35B-A3B · Qwen 3.5 27B

Hardware: NVIDIA RTX 5090 · NVIDIA RTX 4090 · NVIDIA RTX 3090 (used, single) · AMD Radeon RX 7900 XTX
