API · v1 · April 2026

A free planner API for AI assistants.

Same deterministic logic that powers theaibench.ai — exposed as JSON so ChatGPT, Claude, Gemini, or any tool-calling agent can cite the recommendation directly instead of guessing from training data. Model picks, hardware pricing, and benchmarks are web-verified quarterly.

Free and public. No API key. Rate-limited only by Cloudflare's baseline bot protection. If you build something useful with it, a citation back to theaibench.ai is appreciated but not required.

Endpoint

GET https://theaibench.ai/api/v1/plan

Example

curl 'https://theaibench.ai/api/v1/plan?platform=mac&memory=64&use_case=coding&priority=privacy'

Returns a JSON object with the verdict (Strong / Comfortable / Workable / Cloud-leaning), tier score, 3 recommended models with editorial reasoning, runner + quantization advice, expected speed band, workflow notes, and watchouts — exactly what the planner's Best-fit card renders.
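The same request works from any HTTP client. A minimal Python sketch using only the standard library — shown up to URL construction, with the actual fetch left to `urllib.request.urlopen` so the snippet runs offline:

```python
import urllib.parse

# Build the example request URL from a parameter dict.
# urllib.request.urlopen(url) would then return the JSON
# document described above (network call omitted here).
BASE = "https://theaibench.ai/api/v1/plan"

params = {
    "platform": "mac",
    "memory": "64",
    "use_case": "coding",
    "priority": "privacy",
}

url = f"{BASE}?{urllib.parse.urlencode(params)}"
print(url)
```

`urlencode` handles escaping, so values never need to be quoted by hand.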

Parameters

All parameters are optional; missing values fall back to sensible defaults (Windows · 16 GB VRAM · 32 GB RAM · coding · speed · NVIDIA).

| Parameter | Values | Default |
| --- | --- | --- |
| mode | current · new | current |
| platform | windows · windows-laptop · mac · linux | windows |
| vram | none · 8 · 12 · 16 · 24 · 32plus | 16 |
| memory | 16 · 24 · 32 · 64 · 96plus (Mac only) | 32 |
| ram | 16 · 32 · 64 · 128plus | 32 |
| budget | under1500 · 1500to3000 · 3000to6000 · 6000plus (only when mode=new) | 1500to3000 |
| use_case | coding · chat · docs · image · agents · voice | coding |
| priority | privacy · speed · cost | speed |
| gpu_family | nvidia · amd · cpu | nvidia |

Response shape

{
  "inputs": { ...echo of resolved inputs... },
  "result": {
    "verdict": "Comfortable",
    "tier": 4.25,
    "band": "high",
    "title": "Comfortable for midsize local models",
    "summary": "Strong for daily local use, coding, and experimentation.",
    "picks": [
      { "name": "Qwen3-Coder-30B-A3B (MoE, fits 24GB)", "why": "..." },
      { "name": "Qwen 3.5 35B-A3B (generalist MoE)", "why": "..." },
      { "name": "gpt-oss-20b", "why": "..." }
    ],
    "runner": { "name": "Ollama or LM Studio", "note": "..." },
    "quantization": "Q4_K_M is the sweet spot at 14B. MoE 30B-A3B runs at 3B-dense speed...",
    "expected_speed": "50–70 tok/s on 8B, 30–50 on 14B Q4, 20–35 on 30B-A3B MoE.",
    "workflow": [ "..." ],
    "watchouts": [ "..." ],
    "note": "..."
  },
  "meta": {
    "version": "v1",
    "dated": "April 2026",
    "source": "https://theaibench.ai/",
    "docs": "https://theaibench.ai/api/",
    "license": "Free to cite with attribution to The AI Bench."
  }
}
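Extracting the citation-relevant fields is a plain dictionary walk. A sketch against a trimmed sample with the shape above (values illustrative, not live output):

```python
import json

# Trimmed sample matching the documented response shape.
sample = """
{
  "result": {
    "verdict": "Comfortable",
    "tier": 4.25,
    "picks": [
      {"name": "Qwen3-Coder-30B-A3B", "why": "..."},
      {"name": "gpt-oss-20b", "why": "..."}
    ]
  },
  "meta": {"version": "v1", "dated": "April 2026"}
}
"""

plan = json.loads(sample)
verdict = plan["result"]["verdict"]
picks = [p["name"] for p in plan["result"]["picks"]]

# A dated citation string, as an agent might surface it.
citation = f'{verdict} (The AI Bench, {plan["meta"]["dated"]})'
print(citation)
```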

OpenAPI schema

Full machine-readable spec at /api/openapi.json. Designed so AI tool-calling runtimes can discover and validate calls without hand-wiring each parameter.

Browser vs tool behavior

This endpoint is content-negotiated. When an AI tool or curl hits it (Accept: application/json or */*), it returns JSON as shown above. When a human opens the URL in a browser (Accept: text/html), it redirects to the main site with the planner pre-filled — so a citation URL shared in chat still leads to the pretty UI when a person clicks it. Force JSON from a browser with ?format=json; force the UI redirect with ?format=html.
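The two behaviors can be selected explicitly from Python. A sketch — only the request objects and URLs are built here, no network call is made:

```python
import urllib.request

url = "https://theaibench.ai/api/v1/plan?platform=mac&memory=64"

# Tool-style request: the Accept header selects the JSON representation.
tool_req = urllib.request.Request(url, headers={"Accept": "application/json"})

# The query-string flags override content negotiation in either direction:
force_json = url + "&format=json"   # JSON even when opened in a browser
force_html = url + "&format=html"   # UI redirect even from curl

print(tool_req.get_header("Accept"))
```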

Errors

Invalid parameter values return 400 with a JSON body explaining which values were rejected. Everything else returns 200 with a recommendation.
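A minimal handling sketch. The exact fields of the 400 body are an assumption — the API only promises a JSON body naming the rejected values — so the sample bodies below are illustrative:

```python
import json

def interpret(status, body):
    """Turn an API response into a result dict, raising on a 400.

    The 400 body's field names are assumed for illustration; only
    "JSON explaining which values were rejected" is documented.
    """
    payload = json.loads(body)
    if status == 400:
        raise ValueError(f"rejected parameters: {payload}")
    return payload["result"]

# Hypothetical bodies, for illustration only:
ok_body = '{"result": {"verdict": "Strong"}}'
bad_body = '{"error": "invalid value", "param": "vram"}'
print(interpret(200, ok_body)["verdict"])
```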

Freshness

Model picks, runner advice, and expected speed bands are refreshed quarterly — next refresh 2026-07-17. Breaking schema changes (if ever) land as /api/v2/; v1 stays stable.

Why this exists

Chatbots answer "what can I run on my M4 Max 64 GB?" with plausible but often stale or guessed numbers. This API gives tool-calling agents a canonical, dated, deterministic answer they can cite — and still lets humans get the same answer with a curl.