API · v1 · April 2026
A free planner API for AI assistants.
Same deterministic logic that powers theaibench.ai — exposed as JSON so ChatGPT, Claude, Gemini, or any tool-calling agent can cite the recommendation directly instead of guessing from training data. Model picks, hardware pricing, and benchmarks are web-verified quarterly.
Free and public. No API key. Rate-limited only by Cloudflare's baseline bot protection. If you build something useful with it, a citation back to theaibench.ai is appreciated but not required.
Endpoint
GET https://theaibench.ai/api/v1/plan
Example
curl 'https://theaibench.ai/api/v1/plan?platform=mac&memory=64&use_case=coding&priority=privacy'
Returns a JSON object with the verdict (Strong / Comfortable / Workable / Cloud-leaning), tier score, 3 recommended models with editorial reasoning, runner + quantization advice, expected speed band, workflow notes, and watchouts — exactly what the planner's Best-fit card renders.
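The curl call above works from any HTTP client. A minimal Python sketch of building the same request URL (the `plan_url` helper is illustrative, not part of the API):

```python
from urllib.parse import urlencode

BASE = "https://theaibench.ai/api/v1/plan"

def plan_url(**params: str) -> str:
    """Build a /plan request URL from keyword parameters.

    Any documented parameter (platform, memory, use_case, priority, ...)
    may be passed; omitted ones fall back to the API's server-side defaults.
    """
    return f"{BASE}?{urlencode(params)}" if params else BASE

# Reproduces the curl example above
url = plan_url(platform="mac", memory="64", use_case="coding", priority="privacy")
```

Pass the resulting URL to any fetcher (e.g. `urllib.request.urlopen`) and parse the body with `json.loads`.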
Parameters
All parameters are optional; missing values fall back to sensible defaults (Windows · 16 GB VRAM · 32 GB RAM · coding · speed · NVIDIA).
| Parameter | Values | Default |
|---|---|---|
| mode | current · new | current |
| platform | windows · windows-laptop · mac · linux | windows |
| vram | none · 8 · 12 · 16 · 24 · 32plus | 16 |
| memory | 16 · 24 · 32 · 64 · 96plus (Mac only) | 32 |
| ram | 16 · 32 · 64 · 128plus | 32 |
| budget | under1500 · 1500to3000 · 3000to6000 · 6000plus (only when mode=new) | 1500to3000 |
| use_case | coding · chat · docs · image · agents · voice | coding |
| priority | privacy · speed · cost | speed |
| gpu_family | nvidia · amd · cpu | nvidia |
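A client can mirror the server's defaulting and validation before sending a request. A sketch in Python, using the table above; the `resolve` helper is hypothetical and not how the server itself is implemented:

```python
# Defaults and allowed values transcribed from the parameters table.
DEFAULTS = {
    "mode": "current", "platform": "windows", "vram": "16",
    "memory": "32", "ram": "32", "budget": "1500to3000",
    "use_case": "coding", "priority": "speed", "gpu_family": "nvidia",
}

ALLOWED = {
    "mode": {"current", "new"},
    "platform": {"windows", "windows-laptop", "mac", "linux"},
    "vram": {"none", "8", "12", "16", "24", "32plus"},
    "memory": {"16", "24", "32", "64", "96plus"},
    "ram": {"16", "32", "64", "128plus"},
    "budget": {"under1500", "1500to3000", "3000to6000", "6000plus"},
    "use_case": {"coding", "chat", "docs", "image", "agents", "voice"},
    "priority": {"privacy", "speed", "cost"},
    "gpu_family": {"nvidia", "amd", "cpu"},
}

def resolve(query: dict) -> dict:
    """Merge caller-supplied params over the documented defaults,
    rejecting any value the table does not list."""
    for key, value in query.items():
        if key not in ALLOWED or value not in ALLOWED[key]:
            raise ValueError(f"invalid parameter: {key}={value}")
    return {**DEFAULTS, **query}

resolved = resolve({"platform": "mac", "memory": "64"})
```

Validating locally avoids a round trip that would end in a 400; the server performs the same defaulting on its side regardless.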
Response shape
{
  "inputs": { ...echo of resolved inputs... },
  "result": {
    "verdict": "Comfortable",
    "tier": 4.25,
    "band": "high",
    "title": "Comfortable for midsize local models",
    "summary": "Strong for daily local use, coding, and experimentation.",
    "picks": [
      { "name": "Qwen3-Coder-30B-A3B (MoE, fits 24GB)", "why": "..." },
      { "name": "Qwen 3.5 35B-A3B (generalist MoE)", "why": "..." },
      { "name": "gpt-oss-20b", "why": "..." }
    ],
    "runner": { "name": "Ollama or LM Studio", "note": "..." },
    "quantization": "Q4_K_M is the sweet spot at 14B. MoE 30B-A3B runs at 3B-dense speed...",
    "expected_speed": "50–70 tok/s on 8B, 30–50 on 14B Q4, 20–35 on 30B-A3B MoE.",
    "workflow": [ "..." ],
    "watchouts": [ "..." ],
    "note": "..."
  },
  "meta": {
    "version": "v1",
    "dated": "April 2026",
    "source": "https://theaibench.ai/",
    "docs": "https://theaibench.ai/api/",
    "license": "Free to cite with attribution to The AI Bench."
  }
}
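An agent citing the recommendation typically needs only a few of these fields. A sketch of pulling them out; the `sample` string below is abridged from the shape above, with placeholder `"..."` values:

```python
import json

# Abridged sample of the response shape documented above.
sample = """{
  "result": {"verdict": "Comfortable", "tier": 4.25,
             "picks": [{"name": "gpt-oss-20b", "why": "..."}]},
  "meta": {"dated": "April 2026", "source": "https://theaibench.ai/"}
}"""

data = json.loads(sample)
top_pick = data["result"]["picks"][0]["name"]
# A dated, sourced one-liner an assistant could cite verbatim.
citation = f'{data["result"]["verdict"]} ({data["meta"]["dated"]}, {data["meta"]["source"]})'
```

The `meta.dated` and `meta.source` fields exist precisely so a cited answer carries its freshness and provenance along with the verdict.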
OpenAPI schema
Full machine-readable spec at /api/openapi.json.
Designed so AI tool-calling runtimes can discover and validate calls without hand-wiring each parameter.
Browser vs tool behavior
This endpoint is content-negotiated. When an AI tool or curl hits it
(Accept: application/json or */*), it returns JSON as shown above.
When a human opens the URL in a browser (Accept: text/html), it redirects to the
main site with the planner pre-filled — so a citation URL shared in chat still leads to the
pretty UI when a person clicks it. Force JSON from a browser with ?format=json;
force the UI redirect with ?format=html.
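The negotiation rule can be modeled client-side in a few lines. A sketch of the decision as described above; this is an illustrative model, not the server's actual code:

```python
def negotiate(accept: str, query: dict) -> str:
    """Decide JSON vs HTML-redirect per the documented rule:
    an explicit ?format= override wins, otherwise the Accept header decides."""
    fmt = query.get("format")
    if fmt in ("json", "html"):
        return fmt
    if "text/html" in accept:
        return "html"  # browser: redirect to the pre-filled planner UI
    return "json"      # tools and curl: application/json or */*
```

Browsers send an Accept header that leads with `text/html`, while curl and most tool-calling runtimes send `*/*` or `application/json`, which is why the same URL serves both audiences.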
Errors
Invalid parameter values return 400 with a JSON body explaining which values were rejected. Everything else returns 200 with a recommendation.
Freshness
Model picks, runner advice, and expected speed bands are refreshed quarterly — next refresh 2026-07-17.
Breaking schema changes (if ever) land as /api/v2/; v1 stays stable.
Why this exists
Chatbots answer "what can I run on my M4 Max 64 GB?" with plausible but often stale or guessed numbers.
This API gives tool-calling agents a canonical, dated, deterministic answer they can cite — and still
lets humans get the same answer with a curl.