the AI bench
VERIFIED JUNE 2026

FAST TAKE · 2026-04-16 · QWEN 3.6-35B-A3B

Qwen 3.6-35B-A3B — the MoE sibling that pairs with the dense 27B

Alibaba shipped the 35B-A3B MoE first, then the dense 27B six days later. The MoE has 35B total parameters with 3B active per token, an Apache 2.0 license, and 262K context. At Q4 it takes ~17 GB, which fits 24 GB cards comfortably. Choosing between the two siblings comes down to dense-vs-MoE tradeoffs.

Verdict: Agentic-coding-focused MoE; 24 GB single-card at Q4


The take

Qwen 3.6-35B-A3B landed on Hugging Face on April 16, three days before the original publication snapshot of this site. It is an MoE model: 35B total parameters, 3B activated per token, Apache 2.0, 262K native context. The post-3.5 refresh specifically targets function calling and agentic reasoning.

Native Ollama tags landed within ~1 week of release; MLX builds for Apple Silicon followed shortly after. In the gap, the community stopgap was unsloth's dynamic GGUFs (`unsloth/Qwen3.6-35B-A3B-GGUF`), which gave the best-quality Q4 access from day one.
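For a sense of where the ~17 GB Q4 figure comes from, here is a minimal back-of-envelope sketch. It assumes a flat 4.0 effective bits per weight (real Q4 GGUF variants mix precisions, so actual file sizes differ) and a hypothetical 4 GB allowance for KV cache and runtime buffers. Both numbers are illustrative assumptions, not measurements. Note that an MoE must keep all 35B parameters resident in memory even though only 3B are active per token.

```python
def weight_footprint_gb(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights, in decimal GB.

    total_params_b: total parameter count in billions (all experts count,
                    since every expert must be resident in memory).
    bits_per_weight: assumed effective bits per weight after quantization.
    """
    total_bytes = total_params_b * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits_on_card(card_vram_gb: float, total_params_b: float,
                 bits_per_weight: float, overhead_gb: float) -> bool:
    """True if weights plus an assumed overhead fit in the card's VRAM.

    overhead_gb is a hypothetical allowance for KV cache and buffers.
    """
    needed = weight_footprint_gb(total_params_b, bits_per_weight) + overhead_gb
    return needed <= card_vram_gb


if __name__ == "__main__":
    # 35B total parameters at an assumed flat 4.0 bits per weight.
    print(weight_footprint_gb(35, 4.0))        # 17.5
    # 24 GB card with an assumed 4 GB of overhead.
    print(fits_on_card(24, 35, 4.0, 4.0))      # True
```

The weights-only estimate lands near the ~17 GB quoted above; how much headroom remains on a 24 GB card depends on context length and runtime, which this sketch folds into a single assumed overhead number.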

The interesting comparison is the sibling decision. With Qwen 3.6-27B (dense) shipping six days later (April 22), users on 24 GB cards have a real choice between the two architectures at the same VRAM footprint:

**Pick 35B-A3B if:** you want raw throughput on chat / general reasoning. Only 3B parameters are active per token, so token generation runs ~6× faster than a 27B dense model at comparable quality. Best for agent loops where wall-clock time per step matters.

**Pick 27B dense if:** you're doing long-context document work or want the most predictable behavior. Dense models tend to be more stable on edge cases that confuse MoE routing. Community usage is settling on 27B for serious document work and 35B-A3B for chat + coding.

Both ship under Apache 2.0 and both come in at ~17 GB at Q4, so both fit 24 GB cards. They're not competitors; they're a matched pair, and the right answer depends on whether your workload values speed or stability.
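The speed side of that tradeoff can be sketched with simple arithmetic. The sketch assumes single-stream decoding is memory-bandwidth bound, so time per token scales with the parameters read per token. Under that assumption the ratio of active parameters gives an upper bound on the MoE speedup. The `shared_cost_b` term is a hypothetical stand-in for per-token work both models pay (attention, KV cache reads, routing), expressed in billions of parameter-equivalents. Its value here is purely illustrative, not a measurement.

```python
def speedup_upper_bound(dense_params_b: float, moe_active_params_b: float) -> float:
    """Ideal decode speedup if per-token time scales only with weights read."""
    return dense_params_b / moe_active_params_b


def speedup_with_shared_cost(dense_params_b: float, moe_active_params_b: float,
                             shared_cost_b: float) -> float:
    """Speedup when both models also pay the same fixed per-token cost.

    shared_cost_b is a hypothetical overhead in parameter-equivalents
    (billions); it narrows the gap between the two models.
    """
    return (dense_params_b + shared_cost_b) / (moe_active_params_b + shared_cost_b)


if __name__ == "__main__":
    # 27B dense vs 3B active: the ideal ceiling.
    print(speedup_upper_bound(27, 3))              # 9.0
    # Any shared per-token cost pulls the ratio below that ceiling.
    print(speedup_with_shared_cost(27, 3, 1.8))
```

The ceiling from parameter counts alone is 9×, so a real-world figure of ~6× is consistent with some fixed per-token overhead eating into the ideal ratio.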

Where this fits

Models: Qwen 3.6-35B-A3B · Qwen 3.6-27B · Qwen 3.5 35B-A3B

Hardware: NVIDIA RTX 5090 · NVIDIA RTX 4090 · NVIDIA RTX 3090 (used, single) · AMD Radeon RX 7900 XTX · Mac Studio M4 Max 64 GB
