API · v1 · April 2026

A free planner API for AI assistants.

Same deterministic logic that powers theaibench.ai — exposed as JSON so ChatGPT, Claude, Gemini, or any tool-calling agent can cite the recommendation directly instead of guessing from training data. Model picks, hardware pricing, and benchmarks are web-verified quarterly.

Free and public. No API key. Rate-limited only by Cloudflare's baseline bot protection. If you build something useful with it, a citation back to theaibench.ai is appreciated but not required.

Endpoint

GET https://theaibench.ai/api/v1/plan

Example

curl 'https://theaibench.ai/api/v1/plan?platform=mac&memory=64&use_case=coding&priority=privacy'

Returns a JSON object with the verdict (Strong / Comfortable / Workable / Cloud-leaning), tier score, 3 recommended models with editorial reasoning, runner + quantization advice, expected speed band, workflow notes, and watchouts — exactly what the planner's Best-fit card renders.
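The same request works from any HTTP client. A minimal Python sketch using only the standard library — shown up to URL construction, with the actual fetch left to `urllib.request.urlopen` so the snippet runs offline:

```python
import urllib.parse

# Build the example request URL from a parameter dict.
# urllib.request.urlopen(url) would then return the JSON
# document described above (network call omitted here).
BASE = "https://theaibench.ai/api/v1/plan"

params = {
    "platform": "mac",
    "memory": "64",
    "use_case": "coding",
    "priority": "privacy",
}

url = f"{BASE}?{urllib.parse.urlencode(params)}"
print(url)
```

`urlencode` handles escaping, so values never need to be quoted by hand.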

Parameters

All parameters are optional; missing values fall back to sensible defaults (Windows · 16 GB VRAM · 32 GB RAM · coding · speed · NVIDIA).

| Parameter | Values | Default |
| --- | --- | --- |
| mode | current · new | current |
| platform | windows · windows-laptop · mac · linux | windows |
| vram | none · 8 · 12 · 16 · 24 · 32plus | 16 |
| memory | 16 · 24 · 32 · 64 · 96plus (Mac only) | 32 |
| ram | 16 · 32 · 64 · 128plus | 32 |
| budget | under1500 · 1500to3000 · 3000to6000 · 6000plus (only when mode=new) | 1500to3000 |
| use_case | coding · chat · docs · image · agents · voice | coding |
| priority | privacy · speed · cost | speed |
| gpu_family | nvidia · amd · cpu | nvidia |

Response shape

{
  "inputs": { ...echo of resolved inputs... },
  "result": {
    "verdict": "Comfortable",
    "tier": 4.25,
    "band": "high",
    "title": "Comfortable for midsize local models",
    "summary": "Strong for daily local use, coding, and experimentation.",
    "picks": [
      { "name": "Qwen3-Coder-30B-A3B (MoE, fits 24GB)", "why": "..." },
      { "name": "Qwen 3.5 35B-A3B (generalist MoE)", "why": "..." },
      { "name": "gpt-oss-20b", "why": "..." }
    ],
    "runner": { "name": "Ollama or LM Studio", "note": "..." },
    "quantization": "Q4_K_M is the sweet spot at 14B. MoE 30B-A3B runs at 3B-dense speed...",
    "expected_speed": "50–70 tok/s on 8B, 30–50 on 14B Q4, 20–35 on 30B-A3B MoE.",
    "workflow": [ "..." ],
    "watchouts": [ "..." ],
    "note": "..."
  },
  "meta": {
    "version": "v1",
    "dated": "April 2026",
    "source": "https://theaibench.ai/",
    "docs": "https://theaibench.ai/api/",
    "license": "Free to cite with attribution to The AI Bench."
  }
}
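Extracting the citation-relevant fields is a plain dictionary walk. A sketch against a trimmed sample with the shape above (values illustrative, not live output):

```python
import json

# Trimmed sample matching the documented response shape.
sample = """
{
  "result": {
    "verdict": "Comfortable",
    "tier": 4.25,
    "picks": [
      {"name": "Qwen3-Coder-30B-A3B", "why": "..."},
      {"name": "gpt-oss-20b", "why": "..."}
    ]
  },
  "meta": {"version": "v1", "dated": "April 2026"}
}
"""

plan = json.loads(sample)
verdict = plan["result"]["verdict"]
picks = [p["name"] for p in plan["result"]["picks"]]

# A dated citation string, as an agent might surface it.
citation = f'{verdict} (The AI Bench, {plan["meta"]["dated"]})'
print(citation)
```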

OpenAPI schema

Full machine-readable spec at /api/openapi.json. Designed so AI tool-calling runtimes can discover and validate calls without hand-wiring each parameter.

Browser vs tool behavior

This endpoint is content-negotiated. When an AI tool or curl hits it (Accept: application/json or */*), it returns JSON as shown above. When a human opens the URL in a browser (Accept: text/html), it redirects to the main site with the planner pre-filled — so a citation URL shared in chat still leads to the pretty UI when a person clicks it. Force JSON from a browser with ?format=json; force the UI redirect with ?format=html.
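The two behaviors can be selected explicitly from Python. A sketch — only the request objects and URLs are built here, no network call is made:

```python
import urllib.request

url = "https://theaibench.ai/api/v1/plan?platform=mac&memory=64"

# Tool-style request: the Accept header selects the JSON representation.
tool_req = urllib.request.Request(url, headers={"Accept": "application/json"})

# The query-string flags override content negotiation in either direction:
force_json = url + "&format=json"   # JSON even when opened in a browser
force_html = url + "&format=html"   # UI redirect even from curl

print(tool_req.get_header("Accept"))
```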

Errors

Invalid parameter values return 400 with a JSON body explaining which values were rejected. Everything else returns 200 with a recommendation.
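A minimal handling sketch. The exact fields of the 400 body are an assumption — the API only promises a JSON body naming the rejected values — so the sample bodies below are illustrative:

```python
import json

def interpret(status, body):
    """Turn an API response into a result dict, raising on a 400.

    The 400 body's field names are assumed for illustration; only
    "JSON explaining which values were rejected" is documented.
    """
    payload = json.loads(body)
    if status == 400:
        raise ValueError(f"rejected parameters: {payload}")
    return payload["result"]

# Hypothetical bodies, for illustration only:
ok_body = '{"result": {"verdict": "Strong"}}'
bad_body = '{"error": "invalid value", "param": "vram"}'
print(interpret(200, ok_body)["verdict"])
```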

Freshness

Model picks, runner advice, and expected speed bands are refreshed quarterly — next refresh 2026-07-17. Breaking schema changes (if ever) land as /api/v2/; v1 stays stable.

Why this exists

Chatbots answer "what can I run on my M4 Max 64 GB?" with plausible but often stale or guessed numbers. This API gives tool-calling agents a canonical, dated, deterministic answer they can cite — and still lets humans get the same answer with a curl.