VERIFIED JULY 2026

MODEL · MOONSHOT AI · 1T TOTAL / 32B ACTIVE (384 EXPERTS; 8 ROUTED + 1 SHARED PER TOKEN)

Kimi K2.6

Moonshot's 1T MoE agentic coder — keeps the K2 architecture (61 layers, 64 attention heads, MLA) and extends context to 256K. Tops SWE-Bench Pro at 58.6 (vs GPT-5.4 xhigh 57.7, Opus 4.6 max 53.4) and lands #4 on the Artificial Analysis Intelligence Index. The real differentiator is Agent Swarm — 300 sub-agents over 4,000 coordinated steps, a different shape of capability than single-step quality.

License: Modified MIT (commercial OK; attribution required only above 100M MAU or $20M MRR) · Context: 256K (across all variants) · Released: April 20, 2026

The decision in five lines

The call: Hosted only
Best for: coding · agents
Runs on: Hosted or workstation-class only · ~500 GB (not consumer-local; hosted or multi-GPU only)
Watch out: At ~500 GB Q4 this is a hosted-API or workstation-cluster model, not a Mac Studio or 5090 pick.
Evidence: Estimated · last verified July 2026

1T total: PARAMETERS
MOE: TYPE
256K: CONTEXT
~500 GB (not consumer-local; hosted or multi-GPU only): VRAM AT Q4

Where we recommend this

Every tier slot in the planner where this model is a top or alternate pick. Pulled live from planner.js — when the planner refreshes, this table stays current.

CODING · TOP

Kimi K2.6 (1T MoE, frontier hosted)April 20 2026 release; tops SWE-Bench Pro at 58.6 (vs GPT-5.4 xhigh 57.7) and #4 on Artificial Analysis Index. Modified MIT — no commercial restrictions below 100M MAU. Hosted-only realistic at 1T params.

AGENTS · TOP

Kimi K2.6 (1T MoE, frontier hosted)Agent Swarm scales to 300 sub-agents over 4,000 coordinated steps; 256K context across all variants. April 20 2026 release; Modified MIT. Hosted-only realistic at 1T params.

The call

Moonshot's 1T MoE agentic coder — keeps the K2 architecture (61 layers, 64 attention heads, MLA) and extends context to 256K. Tops SWE-Bench Pro at 58.6 (vs GPT-5.4 xhigh 57.7, Opus 4.6 max 53.4) and lands #4 on the Artificial Analysis Intelligence Index. The real differentiator is Agent Swarm — 300 sub-agents over 4,000 coordinated steps, a different shape of capability than single-step quality.
When not to use: Anything that fits on a single consumer card. At ~500 GB Q4 this is a hosted-API or workstation-cluster model, not a Mac Studio or 5090 pick. Also skip if you need plain chat — K2.6 is tuned for long-horizon agent loops, the chat experience trails Qwen 3.6-27B at half the active-param budget.

Runner notes

Hosted via Moonshot API, OpenRouter, or self-deploy on multi-H100/H200 clusters. Unsloth dynamic GGUFs likely 1–2 weeks out; until then, the practical path is the API. The Modified MIT clause means attribution kicks in only at 100M MAU / $20M MRR, so most builders treat it as effectively MIT. Newer: Moonshot shipped Kimi-K2.7-Code (June 2026), a coding-tuned variant on the same 1T MoE, and then Kimi K3 (July 16 2026) — a 2.8T-A50B flagship with 1M context, API-first at $3/$15 with open weights promised by July 27; see the K3 fast take. K2.6 stays the base reference here.

License: Modified MIT (commercial OK; attribution required only above 100M MAU or $20M MRR)
Released: April 20, 2026
Maker: Moonshot AI
Model card: huggingface.co/moonshotai/Kimi-K2.6 →

Next step

Find-by-model — see what hardware runs this→