Qwen-AgentWorld — an open, consumer-runnable model for simulating agent environments

Qwen released Qwen-AgentWorld-35B-A3B on June 22 under Apache 2.0. It is built on Qwen3.5-35B-A3B-Base (a mixture-of-experts model with 35B total and 3B active parameters), and it is an unusual release: a "language world model" that predicts the next environment state given an agent's action, across seven interaction domains (MCP/tool-calling, Search, Terminal, SWE, Android, Web, OS). It is consumer-runnable (3B active parameters, with GGUF and MLX quants already available), and it gained traction quickly (~28K downloads in days). But it is a simulator for building and testing agents, not a general chat or coding model, so we cover it as a fast take rather than a planner pick.

Verdict: The first open agent "world model" — Apache 2.0 and consumer-runnable, but a specialist environment simulator, not a chat or coding pick

The take

The facts, verified against the Hugging Face card (`Qwen/Qwen-AgentWorld-35B-A3B`, created 2026-06-22, Apache 2.0): it is the first single model to cover seven agent interaction domains, and it is trained as a native world model (environment modeling from the continued-pretraining stage onward, not bolted on afterward). The pitch is zero-shot generalization to out-of-distribution environments, plus controllable perturbations and fictional-world construction, which makes it useful as a scalable simulator for agent training and evaluation. There is a technical report and a GitHub repo. It runs under transformers, vLLM, and SGLang, and at 4-bit quantization the 35B weights fit on a single 24–32 GB machine.
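The 24–32 GB figure follows from simple arithmetic on the total parameter count. The sketch below is a rough estimate, not a measurement: the function names and the 4 GB overhead allowance (KV cache, activations, runtime) are assumptions for illustration, and real 4-bit quant formats store scaling metadata, so their effective bits per weight run somewhat above 4. Note that a mixture-of-experts model must hold all 35B weights in memory; the 3B active count affects speed, not footprint.

```python
def weight_memory_gb(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = total_params_b * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# From the article: 35B total parameters (all experts must be resident).
TOTAL_PARAMS_B = 35.0

# Assumption: a flat allowance for KV cache, activations, and runtime overhead.
OVERHEAD_GB = 4.0


def fits(budget_gb: float, bits_per_weight: float) -> bool:
    """True if weights plus the assumed overhead fit in the memory budget."""
    needed = weight_memory_gb(TOTAL_PARAMS_B, bits_per_weight) + OVERHEAD_GB
    return needed <= budget_gb


if __name__ == "__main__":
    for bits in (4.0, 8.0, 16.0):
        gb = weight_memory_gb(TOTAL_PARAMS_B, bits)
        print(f"{bits:>4.0f}-bit weights: {gb:.1f} GB, fits in 24 GB: {fits(24.0, bits)}")
```

At exactly 4 bits per weight the weights come to 17.5 GB, which leaves headroom on a 24 GB card under the overhead assumed here; at 8-bit or 16-bit the same model no longer fits that budget.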

Why it matters: it is the one genuinely new, Apache-licensed, consumer-runnable open-weight model that landed in this coverage window, and it is directly relevant to people building agentic tooling, the audience our /for-agents and agents use-case pages speak to. Simulating an environment cheaply, so you can stress-test an agent without touching real terminals, APIs, or devices, is a real need, and an open model that does it is a notable first.
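To make the idea of a simulation backend concrete, here is a minimal sketch of an agent rollout against a simulated environment. Everything in it is hypothetical: `stub_world_model` is a toy stand-in that fakes a terminal, and the `(state, action) -> next state` signature is an assumption about how one might wrap such a model, not the prompt format or API of Qwen-AgentWorld. In practice the stub would be replaced by a call to the real model through whichever runtime you use.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical interface: a world model maps (current state, agent action)
# to a predicted next environment state, all as plain text.
WorldModel = Callable[[str, str], str]
Policy = Callable[[str], str]


def stub_world_model(state: str, action: str) -> str:
    """Toy stand-in for a real model call: fakes a terminal that only knows `echo`."""
    if action.startswith("echo "):
        return action[len("echo "):]
    parts = action.split()
    name = parts[0] if parts else ""
    return f"sh: command not found: {name}"


def scripted_policy(actions: Iterable[str]) -> Policy:
    """Build a policy that replays a fixed list of actions, ignoring the state."""
    remaining = iter(actions)

    def policy(state: str) -> str:
        return next(remaining)

    return policy


def rollout(
    world_model: WorldModel, policy: Policy, initial_state: str, max_steps: int
) -> List[Tuple[str, str]]:
    """Run the agent against the simulator and record (action, predicted state) pairs."""
    state = initial_state
    trace: List[Tuple[str, str]] = []
    for _ in range(max_steps):
        action = policy(state)
        state = world_model(state, action)  # no real terminal is touched
        trace.append((action, state))
    return trace


if __name__ == "__main__":
    actions = ["echo hi", "rm -rf /"]
    for action, observed in rollout(stub_world_model, scripted_policy(actions), "$ ", len(actions)):
        print(f"{action!r} -> {observed!r}")
```

The point of the design is that a destructive action such as `rm -rf /` only ever reaches the simulator, so an agent can be tested on failure cases that would be unsafe or expensive to run for real.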

Our call: no planner-pick change and no model detail page. This is a specialist research and tooling model, and listing it next to chat and coding picks would overstate its relevance. We track it as a fast take so the release is on record. If you are building or evaluating agents, it is worth a look as a simulation backend; if you want a model that actually does coding or chat locally, the picks are unchanged (Qwen3-Coder-30B-A3B, North Mini Code, Qwen 3.5 35B-A3B at 24 GB+).

Where this fits

Models: Qwen 3.5 35B-A3B · Qwen3-Coder-30B-A3B · North Mini Code (30B-A3B) · GLM-5.1

Hardware: NVIDIA RTX 5090 · NVIDIA RTX 4090 · Mac Studio M4 Max 64 GB
