GLM-5.2 — Z.AI's flagship gets a solid 1M context and an architecture rework

Z.AI shipped GLM-5.2 on June 16 — the successor to GLM-5.1, still MIT, still ~753B total in the GLM-5 sparse-MoE line. The headline is a stable 1M-token context (a first for the family) plus an architecture rework (IndexShare) that cuts per-token FLOPs ~2.9× at 1M, and on Z.AI's own card it beats GLM-5.1 across reasoning, coding, and long-horizon agent suites. Like 5.1, the size keeps it hosted / big-iron, so it changes no local planner pick — it's a frontier comparator the way DeepSeek V4-Pro and Kimi K2.6 are.

Verdict: Z.AI's new MIT flagship — a real jump over GLM-5.1 with a solid 1M context, but 753B keeps it hosted / big-iron

The take

The facts, verified against the Hugging Face model card (`zai-org/GLM-5.2`, created 2026-06-16, MIT) and the Z.AI blog: GLM-5.2 is Z.AI's new flagship for long-horizon tasks, ~753B total parameters in the same GLM-5 sparse-MoE lineage as GLM-5.1. New this release: a "solid 1M context" the card says "stably sustains long-horizon work"; IndexShare, which reuses one indexer across every four sparse-attention layers to cut per-token FLOPs ~2.9× at 1M context; an improved MTP layer for speculative decoding (up to ~20% longer acceptance); and flexible coding "effort" levels. The license stays a clean MIT — "no regional limits," in Z.AI's framing.

Why it matters: the open-weight frontier keeps moving, and GLM is one of the few frontier-class labs shipping under a genuinely permissive license. On Z.AI's own benchmark table GLM-5.2 posts clear gains over GLM-5.1 — SWE-bench Pro 62.1 vs 58.4, Terminal-Bench 2.1 81.0 vs 63.5, HLE 40.5 vs 31 — and trades blows with Opus 4.8 / GPT-5.5 / Gemini 3.1 Pro on several. Treat those as vendor numbers until third parties reproduce them, but the direction is real: a 1M-context MIT model that's competitive at the frontier is a meaningful option for anyone who can host it.

Our call: no planner-pick change. At ~753B total, GLM-5.2 is hosted / multi-GPU class regardless of the permissive license — even Unsloth dynamic low-bit GGUFs are workstation-cluster territory, not a single consumer rig. We track it where GLM-5.1 already sits: a frontier comparator, hosted via Z.ai or the `glm-5.2` Ollama cloud tag, not a local recommendation. If you host GLM-5.1 today for long-horizon agentic work, 5.2 is the obvious upgrade to evaluate — the 1M context is the reason to bother. Local picks are unchanged: Qwen3-Coder-30B-A3B, North Mini Code, and Qwen 3.5 35B-A3B at 24 GB+.

Where this fits

Models: GLM-5.1 · Kimi K2.6 · DeepSeek V4-Pro · Qwen3-Coder-30B-A3B

Hardware: NVIDIA DGX Spark · Dual RTX 5090 · Mac Studio M3 Ultra 96 GB

Sources

Next step

Try this in the planner→