the AI bench
VERIFIED JUNE 2026
All models

MODEL · COHERE / COHERE LABS · 30B TOTAL / 3B ACTIVE (SPARSE MOE, 128 EXPERTS / 8 ACTIVE PER TOKEN)

North Mini Code (30B-A3B)

Cohere's open-weight 30B-A3B coder, tuned for code generation, agentic software engineering, and terminal tasks. Same 3B-active MoE shape and 24 GB-tier fit as Qwen3-Coder-30B-A3B, but from a Western lab under a clean Apache 2.0 license, with a larger 256K context. Cohere reports strong SWE-Bench Verified/Pro and Terminal-Bench v2 numbers — vendor figures, not yet independently reproduced.

License: Apache 2.0 · Context: 256K input / 64K max output · Released: June 5, 2026

The decision in five lines

The call
Buy — for coding
Best for
coding
Runs on
16 hardware picks fit (cheapest: Minisforum UM890 Pro · $463)
Watch out
General chat or non-coding work — it is specialized for software engineering, not a daily-driver assistant.
Evidence
Estimated · last verified June 2026

30B total
PARAMETERS
MOE
TYPE
256K
CONTEXT
~17 GB (Q4 GGUF) — fits a single 24 GB GPU
VRAM AT Q4

Where we recommend this

Every tier slot in the planner where this model is a top or alternate pick. Pulled live from planner.js — when the planner refreshes, this table stays current.

CODING · TOP
North Mini Code (30B-A3B, Apache 2.0)Cohere's June 2026 Apache-2.0 30B-A3B agentic coder — 256K context, tuned for SWE + terminal tasks. A fresh alternative to Qwen3-Coder; vendor benchmarks, so verify on your own repo.
CODING · HIGH
North Mini Code (30B-A3B, Apache 2.0)Apache-2.0 30B-A3B from Cohere; ~17 GB at Q4 fits a 24 GB card. New alternative to Qwen3-Coder-30B-A3B with a larger 256K context — verify vendor benchmarks on your repo.

The call

Cohere's open-weight 30B-A3B coder, tuned for code generation, agentic software engineering, and terminal tasks. Same 3B-active MoE shape and 24 GB-tier fit as Qwen3-Coder-30B-A3B, but from a Western lab under a clean Apache 2.0 license, with a larger 256K context. Cohere reports strong SWE-Bench Verified/Pro and Terminal-Bench v2 numbers — vendor figures, not yet independently reproduced.

When not to use: General chat or non-coding work — it is specialized for software engineering, not a daily-driver assistant. Because the headline benchmarks are vendor-run, verify it against your own repo before displacing a proven local coder like Qwen3-Coder-30B-A3B.

Runner notes

Official BF16 + FP8 (plus W4A16 / NVFP4) on Hugging Face; community GGUF quants (unsloth, bartowski) and an mlx-community 4-bit build land it on a 24 GB GPU or Apple Silicon at ~Q4. Try it first in OpenCode or the CohereLabs Hugging Face Space before downloading.

License
Apache 2.0
Released
June 5, 2026
Maker
Cohere / Cohere Labs

Hardware that fits

Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first — the top row is your best-value fit. Click through for the full buyer’s guide.

Next step

Find-by-model — see what hardware runs this