MODEL · COHERE / COHERE LABS · 30B TOTAL / 3B ACTIVE (SPARSE MOE, 128 EXPERTS / 8 ACTIVE PER TOKEN)
North Mini Code (30B-A3B)
Cohere's open-weight 30B-A3B coder, tuned for code generation, agentic software engineering, and terminal tasks. Same 3B-active MoE shape and 24 GB-tier fit as Qwen3-Coder-30B-A3B, but from a Western lab under a clean Apache 2.0 license, with a larger 256K context. Cohere reports strong SWE-Bench Verified/Pro and Terminal-Bench v2 numbers — vendor figures, not yet independently reproduced.
License: Apache 2.0 · Context: 256K input / 64K max output · Released: June 5, 2026
The decision in five lines
- The call: Buy (for coding)
- Best for: coding
- Runs on: 16 hardware picks fit (cheapest: Minisforum UM890 Pro · $463)
- Watch out: General chat or non-coding work; it is specialized for software engineering, not a daily-driver assistant.
- Evidence: Estimated
- Parameters: 30B total
- Type: MoE
- Context: 256K
- VRAM at Q4: ~17 GB (Q4 GGUF), fits a single 24 GB GPU
The call
When not to use: General chat or non-coding work — it is specialized for software engineering, not a daily-driver assistant. Because the headline benchmarks are vendor-run, verify it against your own repo before displacing a proven local coder like Qwen3-Coder-30B-A3B.
Runner notes
Official BF16 and FP8 weights (plus W4A16 and NVFP4) are on Hugging Face. Community GGUF quants (unsloth, bartowski) and an mlx-community 4-bit build fit it on a 24 GB GPU or Apple Silicon at ~Q4. Try it first in OpenCode or the CohereLabs Hugging Face Space before downloading.
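The ~Q4 fit on a 24 GB card follows from simple arithmetic on the 30B total parameter count. Below is a minimal sketch of that estimate. The bits-per-weight values are assumptions for illustration (roughly 4.5 bits is typical of 4-bit GGUF quants once scales are included), not figures published for this model, and the function name is hypothetical. It counts weights only; KV cache and runtime overhead come on top.

```python
def weight_footprint_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return total_params * bits_per_weight / 8 / 1e9


# 30B total parameters. In an MoE every expert has to be resident in memory,
# even though only ~3B parameters are active per token.
TOTAL_PARAMS = 30e9

# Assumed average bits per weight for each format (illustrative values).
ASSUMED_BITS = {"BF16": 16.0, "FP8": 8.0, "Q4 (approx.)": 4.5}

for name, bits in ASSUMED_BITS.items():
    print(f"{name}: ~{weight_footprint_gb(TOTAL_PARAMS, bits):.1f} GB")
```

Under these assumptions the Q4 estimate comes out close to the ~17 GB figure quoted above, which is why BF16 and FP8 need larger machines while a ~Q4 build leaves some room for context on a 24 GB GPU.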
Hardware that fits
Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first — the top row is your best-value fit. Click through for the full buyer’s guide.
- Minisforum UM890 Pro · Good · 1.3× · 32 GB DDR5 (shared) · $463–$580 all-in
- AMD Radeon RX 7900 XTX · Good · 1.3× · 24 GB · $810 used / ~$1,340 new
- NVIDIA RTX 3090 (used, single) · Good · 1.3× · 24 GB · $950–$1,200
- MacBook Air M5 24 GB · Requires tweak · 1.2× · 24 GB unified · $1,299–$1,699
- Mac Mini M4 Pro 24 GB · Requires tweak · 1.2× · 24 GB unified · $1,399
- Dual RTX 3090 (used) · Perfect · 2.6× · 48 GB · $1,800–$2,500 all-in
- Framework Desktop (Ryzen AI Max+ 395) · Perfect · 4.6× · 128 GB unified · $1,999–$2,851
- NVIDIA RTX 4090 · Good · 1.3× · 24 GB · $2,200–$2,800
- M5 Pro MacBook Pro 48 GB · Perfect · 1.7× · 48 GB unified · $2,599–$3,099
- Mac Studio M4 Max 64 GB · Perfect · 2.3× · 64 GB unified · $3,199
- NVIDIA RTX 5090 · Perfect · 1.7× · 32 GB · $3,500–$4,300
- NVIDIA RTX A6000 (48 GB, used) · Perfect · 2.6× · 48 GB ECC · $3,500–$4,500
- Mac Studio M3 Ultra 96 GB · Perfect · 3.5× · 96 GB unified · $3,999
- M5 Max MacBook Pro 64 GB · Perfect · 2.3× · 64 GB unified · $4,499
- NVIDIA DGX Spark · Perfect · 4.6× · 128 GB unified · $4,699
- Dual RTX 5090 · Perfect · 3.5× · 64 GB (2×32) · $8,500–$10,500
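The "fits, sorted cheapest-first" rule behind the list above can be sketched in a few lines. This is an illustration only, not the planner's actual logic: the class and function names are hypothetical, the rows are a small sample copied from the list (lowest listed price), and the fit test is a plain memory comparison against the ~17 GB Q4 figure that ignores the headroom ratings shown per row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwarePick:
    name: str
    memory_gb: int      # total GPU or unified/shared memory
    price_low_usd: int  # low end of the listed price range


# ~17 GB at Q4 GGUF, taken from the spec summary on this page.
MODEL_Q4_GB = 17

# A sample of rows from the list above, deliberately out of order.
PICKS = [
    HardwarePick("NVIDIA RTX 4090", 24, 2200),
    HardwarePick("Minisforum UM890 Pro", 32, 463),
    HardwarePick("Dual RTX 3090 (used)", 48, 1800),
    HardwarePick("AMD Radeon RX 7900 XTX", 24, 810),
]


def fitting_picks(picks, required_gb):
    """Keep picks with enough memory, then sort by lowest listed price."""
    fits = [p for p in picks if p.memory_gb >= required_gb]
    return sorted(fits, key=lambda p: p.price_low_usd)


for pick in fitting_picks(PICKS, MODEL_Q4_GB):
    print(f"{pick.name}: {pick.memory_gb} GB, from ${pick.price_low_usd}")
```

Raising `required_gb` (for example to budget for a long context or a higher-precision quant) drops the 24 GB cards from the result while keeping the same cheapest-first order.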