the AI bench
VERIFIED JUNE 2026

HARDWARE · 23 CURATED PICKS

Hardware we'd actually buy to run local AI.

Curated, opinionated, dated. Every pick reviewed quarterly against real street prices and current model weight classes. No affiliate links. No listicle bloat. Just the hardware that solves a specific problem — and an honest take on who should pass.


Frontier tier — 48 GB+ serious rigs

FRONTIER CONSUMER · 64 GB (2×32)

Dual RTX 5090

Two RTX 5090s give you 64 GB of total VRAM, with 1,792 GB/s of bandwidth per card. This is the first consumer configuration that fits a 122B-A10B MoE with room to spare and generates tokens fast enough to use interactively. The tradeoff is 1,500 W sustained draw, dual 12VHPWR connectors, and a case that fits 9-slot cards side-by-side.

$8,500–$10,500 · Estimated

PROSUMER SINGLE-CARD · 48 GB ECC

NVIDIA RTX A6000 (48 GB, used)

The only consumer-reachable single-card path to 48 GB VRAM under $5,000. Ampere-generation workstation silicon with ECC memory, 768 GB/s bandwidth, and a dual-slot blower that tolerates sustained load. Used-market prices make it an Ada-tax-avoidance play.

$3,500–$4,500 · Estimated

AI WORKSTATION · 128 GB unified

NVIDIA DGX Spark

128 GB of unified memory via NVIDIA's GB10 Grace Blackwell Superchip — 4× what any consumer GPU gives you. The catch: 273 GB/s bandwidth is ~27% of an RTX 4090, so you trade raw speed for fit. A capacity-first machine, not a speed-first machine.

$4,699 · Measured
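
The capacity-versus-speed trade above can be put into rough numbers. The sketch below assumes decode is memory-bandwidth-bound, so each generated token streams the active weights once; that gives a ceiling, not a benchmark. The bandwidth figures are the ones quoted on this page. The 40 GB weight size for a 70B dense model at Q4 and the function name `decode_ceiling_tok_s` are our assumptions for illustration.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Upper bound on tokens/s if every token reads active_weights_gb from memory."""
    if active_weights_gb <= 0:
        raise ValueError("active_weights_gb must be positive")
    return bandwidth_gb_s / active_weights_gb


# Memory bandwidth figures quoted on this page, in GB/s.
BANDWIDTH_GB_S = {
    "DGX Spark": 273,
    "RTX A6000": 768,
    "Mac Studio M3 Ultra": 819,
    "RTX 3090": 936,
}

# ASSUMPTION: a 70B dense model at Q4 occupies roughly 40 GB of weights.
WEIGHTS_70B_Q4_GB = 40.0

ceilings = {
    name: decode_ceiling_tok_s(bw, WEIGHTS_70B_Q4_GB)
    for name, bw in BANDWIDTH_GB_S.items()
}

for name, ceiling in ceilings.items():
    print(f"{name:20s} <= {ceiling:5.1f} tok/s")
```

Real throughput lands below these ceilings because of compute, KV-cache reads, and runtime overhead. MoE models read only their active experts per token, so the same arithmetic applies with a much smaller `active_weights_gb`.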
SILENT WORKSTATION · 96 GB unified

Mac Studio M3 Ultra 96 GB

819 GB/s unified memory bandwidth — the highest in any shipping Mac — plus 96 GB capacity puts Llama 3.3 70B Q4 at 12–18 tok/s on a box that runs at ~70 W idle and fits on a bookshelf. Dual M3 Max dies under one heatsink, no GPU tower, no fan noise.

$3,999 · Estimated

SMART MONEY · 48 GB

Dual RTX 3090 (used)

Two used 3090s give you 48 GB of VRAM for roughly $1,800–$2,500 all-in, enough for 70B dense at Q4 with room for context. llama.cpp and Ollama split across PCIe automatically; no NVLink needed. The compromise is noise, heat, and finding honest used cards.

$1,800–$2,500 all-in · Estimated
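
A back-of-envelope check of the "70B dense at Q4 with room for context" claim, as a sketch rather than a measurement. The assumptions are ours: about 4.5 bits per weight for a Q4-style quantization, Llama-70B-like dimensions (80 layers, 8 KV heads, head size 128), and a 16-bit KV cache. The helper names are hypothetical.

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                context_tokens: int, bytes_per_value: int = 2) -> float:
    """Approximate KV-cache size in GB: K and V tensors for every layer and token."""
    total_bytes = 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value
    return total_bytes / 1e9


TOTAL_VRAM_GB = 48.0  # two 24 GB cards, as in the pick above

# ASSUMPTIONS: ~4.5 bits/weight for Q4, Llama-70B-like shape, 16K context.
model_gb = weights_gb(70, 4.5)
cache_gb = kv_cache_gb(n_layers=80, n_kv_heads=8, head_dim=128, context_tokens=16_384)
needed_gb = model_gb + cache_gb
fits = needed_gb <= TOTAL_VRAM_GB

print(f"weights  {model_gb:6.2f} GB")
print(f"KV cache {cache_gb:6.2f} GB")
print(f"total    {needed_gb:6.2f} GB of {TOTAL_VRAM_GB:.0f} GB, fits: {fits}")
```

The estimate ignores runtime buffers and the uneven split between two cards, so treat a margin of a few GB as tight rather than comfortable.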

Top tier — 32-64 GB new-gen

TOP TIER · 32 GB

NVIDIA RTX 5090

A 32 GB Blackwell card that runs every modern coding, chat, and agent model at Q4 with headroom, at speeds a used dual-3090 rig can match only with a power bill and a compromise. AIB allocation has thawed enough that the entry floor came back down to ~$2,910 in late May.

$2,910–$4,300 · Measured

DESKTOP MAC · 64 GB unified

Mac Studio M4 Max 64 GB

64 GB unified memory at 546 GB/s. Runs 30B-A3B MoE at 70–100 tok/s, Qwen 3.5 27B dense at ~20 tok/s, and FLUX.2 klein pipelines cleanly, in a silent box that idles at 6 W. 70B dense Q4 fits with a `sudo sysctl iogpu.wired_limit_mb` tweak and runs at 8–15 tok/s: workable, but not silent under sustained load. The M4 Max is now previous-gen silicon, but Bloomberg (April 19, 2026) reported that the M5 Mac Studio refresh slipped to October 2026 over supply-chain issues, so the buy-now case is stronger than it was before the slip.

$3,199 · Estimated
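
The `iogpu.wired_limit_mb` tweak mentioned above takes a value in megabytes. A minimal sketch for picking one follows. The 8 GB left for macOS and other apps is our assumption, not an Apple recommendation, and `wired_limit_mb` is a hypothetical helper name.

```python
def wired_limit_mb(total_ram_gb: int, reserve_for_os_gb: int = 8) -> int:
    """GPU wired-memory limit in MB, leaving reserve_for_os_gb for the rest of the system."""
    if reserve_for_os_gb >= total_ram_gb:
        raise ValueError("reserve must be smaller than total RAM")
    return (total_ram_gb - reserve_for_os_gb) * 1024


# 64 GB Mac Studio, keeping an assumed 8 GB back for macOS.
limit = wired_limit_mb(64)
command = f"sudo sysctl iogpu.wired_limit_mb={limit}"
print(command)
```

A setting applied this way typically does not survive a reboot, so it is easy to back out if the machine starts swapping.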
ALL-ROUNDER · MAC · 64 GB unified

M5 Max MacBook Pro 64 GB

64 GB of unified memory at 614 GB/s on the 40-core GPU M5 Max. Runs every modern model up to 35B-A3B MoE at reasonable speed, in a silent chassis that sustains load on battery. The compromise: prefill on long prompts is noticeably slower than on NVIDIA hardware, and you pay Apple's memory tax to go beyond 48 GB.

$4,499 · Measured

PORTABLE PRO · 48 GB unified

M5 Pro MacBook Pro 48 GB

48 GB unified at 307 GB/s, about 12% more bandwidth than the M4 Pro's 273 GB/s and enough to run Qwen 3.5 35B-A3B MoE at 70–90 tok/s on battery, in a laptop. The honest step-up from the Mac mini M4 Pro 24 GB without going to the $4,499 M5 Max 64 GB.

$2,599–$3,099 · Estimated

MOE MINI-PC · 128 GB unified

Framework Desktop (Ryzen AI Max+ 395)

Strix Halo's 40-CU Radeon 8060S iGPU plus 128 GB LPDDR5X unified memory runs Qwen 3 30B-A3B MoE at ~72 tok/s — 4× the bandwidth of the Minisforum UM890 Pro, 4× the memory. A genuine local-AI mini-PC, not a CPU box that happens to boot.

$1,999–$2,851 · Estimated

Smart money — 24 GB done right

PREVIOUS GEN · 24 GB

NVIDIA RTX 4090

Same 24 GB VRAM ceiling as the new generation's sweet spot, 1 TB/s bandwidth, mature CUDA stack, no 12VHPWR drama if you buy a unit with the updated 12V-2x6 connector. Buy used from a trusted seller — new retail at scalper prices is not the right move.

$2,200–$2,800 · Measured

USED VALUE · 24 GB

NVIDIA RTX 3090 (used, single)

24 GB of GDDR6X at 936 GB/s for ~$1,050 on the used market in June 2026 — every dollar you spend on a 3090 still buys more usable VRAM than any other card in the lineup, even after the used-market floor lifted ~$200 since April as buyers priced out of 5090 scarcity moved a tier down. The tradeoff is age, heat, and a GDDR6X memory package that runs hot after half a decade.

$950–$1,200 · Estimated

TEAM RED · 24 GB

AMD Radeon RX 7900 XTX

24 GB at roughly 85–90% of a 4090's throughput under ROCm. The hardware is fine; the software ecosystem is the tax. Plan 5–10 hours on first-time ROCm setup, plus the ongoing friction of Ollama being patchy on AMD. New-market pricing has split sharply from used since the DRAM crunch — used 3090s and used 7900 XTXs are now the same $760 band.

$760 used / ~$1,500 new · Measured

MAC MID-TIER · 24 GB unified

Mac Mini M4 Pro 24 GB

At 273 GB/s — 2.3× the base Mac mini M4's bandwidth — the M4 Pro in its base 12-core CPU / 16-core GPU bin is the first Apple silicon SKU where 14B dense Q4 feels responsive, not ponderous. Silent, 4 W idle, $1,399 from Apple.

$1,399 · Estimated

FANLESS PORTABLE · 24 GB unified

MacBook Air M5 24 GB

The M5 Air at 24 GB is the first Apple laptop where 8B dense Q4 inference feels responsive without a fan ramping up — because there is no fan. 153 GB/s bandwidth is the honest limiting factor; this is not a 14B-comfortable machine.

$1,299–$1,699 · Estimated

Affordable entry — 12-16 GB and under

BLACKWELL SWEET SPOT · 16 GB

NVIDIA RTX 5070 Ti

16 GB GDDR7 at 896 GB/s — 93% of the 5080's bandwidth for ~15% less money at street price. Hardware Corner measured 185 tok/s on Qwen 2.5 14B Q4 short-context, which is the honest sweet spot for this card.

$980–$1,300 · Estimated

BLACKWELL MID-HIGH · 16 GB

NVIDIA RTX 5080

Blackwell architecture + GDDR7 at 960 GB/s buys you ~30–40% more tok/s than the 5060 Ti 16 GB, but the VRAM ceiling is identical. If your work lives in the 8B–14B dense band, this is the honest Blackwell pick; if you need 30B-A3B MoE with headroom, you need more memory.

$999–$1,400 · Estimated

RDNA4 ENTRY · 16 GB

AMD Radeon RX 9070 XT

16 GB GDDR6 at 640 GB/s on AMD's RDNA4 architecture. AI throughput per compute unit doubled vs RDNA3; paired with ROCm 7 or later, this is the AMD card to buy if you're entering local AI on Team Red today. The 7900 XTX retains a 24 GB lead but commands scarcity premiums on the new market.

$649–$779 · Estimated

BUDGET · 16 GB

RTX 5060 Ti 16 GB

16 GB GDDR7 at $559 on Amazon. Runs 14B dense at Q4 at ~33 tok/s with room for 16K context; 30B-A3B MoE fits cleanly at Q3 (~13 GB), or at Q4 (~17 GB) with partial CPU offload. The honest entry point for local AI if you want new hardware with a warranty.

$560–$610 · Measured
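
"Partial CPU offload" means some layers stay in VRAM and the rest run from system RAM. The sketch below estimates the split for the Q3 and Q4 sizes quoted above. The assumptions are ours: 48 equal-sized layers, and 2 GB of VRAM held back for KV cache and runtime buffers. The result roughly corresponds to what you would pass to llama.cpp's `--n-gpu-layers` option; `gpu_layers_that_fit` is a hypothetical helper.

```python
import math


def gpu_layers_that_fit(model_gb: float, n_layers: int,
                        vram_gb: float, reserved_gb: float) -> int:
    """Number of equal-sized layers that fit in VRAM after reserving reserved_gb."""
    if n_layers <= 0 or model_gb <= 0:
        raise ValueError("model_gb and n_layers must be positive")
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - reserved_gb, 0.0)
    return min(n_layers, math.floor(usable_gb / per_layer_gb))


VRAM_GB = 16.0     # RTX 5060 Ti 16 GB
RESERVED_GB = 2.0  # ASSUMPTION: KV cache plus runtime buffers
N_LAYERS = 48      # ASSUMPTION: layer count for a 30B-A3B-class model

q3_layers = gpu_layers_that_fit(13.0, N_LAYERS, VRAM_GB, RESERVED_GB)
q4_layers = gpu_layers_that_fit(17.0, N_LAYERS, VRAM_GB, RESERVED_GB)

print(f"Q3 (~13 GB): {q3_layers}/{N_LAYERS} layers on GPU")
print(f"Q4 (~17 GB): {q4_layers}/{N_LAYERS} layers on GPU, "
      f"{N_LAYERS - q4_layers} on CPU")
```

Under these assumptions Q3 fits entirely on the card, while Q4 leaves a handful of layers on the CPU, which is where the speed penalty comes from.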
ENTRY MAC · 16 GB unified

Mac Mini M4 16 GB

Apple discontinued the $599 Mac mini base config on May 1, 2026 and raised the floor to $799 with 512 GB. The 16 GB / 256 GB SKU only survives on Amazon residuals and eBay. If you can find one near $499, the 8B-class story still holds; otherwise the math has shifted toward the 24 GB M4 Pro.

$799 (new floor) / $499–$599 (eBay/residuals) · Measured

BUDGET ENTRY · 12 GB

NVIDIA RTX 3060 12 GB

A 5-year-old card that still runs Llama 3.1 8B Q4 at 52 tok/s. Amazon street is ~$354, eBay used floors around $230, and NVIDIA is restarting 8 nm production with Samsung in June 2026 — supply should ease through summer. 12 GB is the minimum VRAM that matters for 8B-class models with any context, and CUDA just works everywhere.

$280–$400 · Estimated

Cautionary picks — read the watchouts first

Specific rig checks

Comparing a specific rig? Start with Mac Studio M3 Ultra 96 GB, M5 Max MacBook Pro 64 GB, RTX 5060 Ti 16 GB, and the local LLM benchmark table.

How we pick

Every pick has to answer one question — "is this the honest best answer at this price point for someone running local AI?" We verify prices against Newegg, Amazon, BestValueGPU, and eBay sold-listings each quarter. We update or retire picks the day a new model or a price swing makes them wrong. No sponsored entries. No affiliate links under Path 2.

Picks are updated as the market shifts — when a new GPU lands, a price moves materially, or a model release changes the fit calculus.