MODEL · Z.AI (FORMERLY ZHIPU AI) · 744B TOTAL / 40B ACTIVE
GLM-5.1
Current #1 open-weight on SWE-Bench Pro (58.4) — a long-horizon agentic coding flagship that narrowly beats GPT-5.4 and Claude Opus 4.6 on that benchmark. MIT license means no commercial restrictions, unlike many frontier opens.
License: MIT · Context: 200K tokens (131K max output) · Released: April 7, 2026
The decision in five lines
- The call
- Hosted only
- Best for
- agents
- Runs on
- Hosted or workstation-class only · ~466 GB (not consumer-local)
- Watch out
- At Q4_K_XL it's 466 GB on disk and needs a data-center GPU partition or a Mac Studio 512 GB.
- Evidence
- Estimated
- 744B total
- PARAMETERS
- MOE
- TYPE
- 200K
- CONTEXT
- ~466 GB (not consumer-local)
- VRAM AT Q4
Where we recommend this
Every tier slot in the planner where this model is a top or alternate pick. Pulled live from planner.js — when the planner refreshes, this table stays current.
The call
Current #1 open-weight on SWE-Bench Pro (58.4) — a long-horizon agentic coding flagship that narrowly beats GPT-5.4 and Claude Opus 4.6 on that benchmark. MIT license means no commercial restrictions, unlike many frontier opens.
When not to use: Any single-card local setup. At Q4_K_XL it's 466 GB on disk and needs a data-center GPU partition or a Mac Studio 512 GB. For most users, the honest path is `glm-5.1:cloud` via Ollama rather than pretending it runs locally.
Runner notes
Unsloth dynamic 2-bit (220 GB) or 1-bit (200 GB) GGUFs make workstation-class local runs *possible* but slow. Ollama has `glm-5.1` + `glm-5.1:cloud` tags — `:cloud` is the honest choice for consumer hardware.
Next step
Find-by-model — see what hardware runs this→