HARDWARE · TOP TIER · 32 GB
NVIDIA RTX 5090
The single-GPU answer for people who refuse to wait.
A 32 GB Blackwell card that runs every modern coding, chat, and agent model at Q4 with headroom, at speeds a used dual-3090 rig can match only with a power bill and a compromise. AIB allocation has thawed enough that the entry floor came back down to ~$2,910 in late May.
The decision in five lines
- The call
- Buy — The single-GPU answer for people who refuse to wait.
- Best for
- Top tier
- Runs well
- Qwen3-Coder-30B-A3B (MoE, fits 24GB) · Qwen 3.6-27B · Z-Image-Turbo (Apache 2.0)
- Watch out
- Street price still sits well above $1,999 MSRP but AIB allocation has thawed. Floor moved $3,000 (April) → $3,799 (early May) → $3,799–$4,299 (May 22) → $2,910–$4,300 (May 28, ASUS TUF AIBs landing at $2,909.99, Gigabyte/MSI Gaming OC at ~$3,299). DRAM shortage is still squeezing supply industry-wide; IDC pegs the shortage through Q4 2027.
- Evidence
- Measured
- 32
- GB GDDR7
- 1,792
- GB/S BANDWIDTH
- 575
- W TDP
- ~$2,910
- FROM (STREET)
What fits at this tier
Unlocks the top band across every use case. MoE 30B-A3B and 35B-A3B at Q4 with room for 32K+ context; 32B dense Q4 fits with headroom; FLUX.2 dev fits without quant pain. 70B dense Q4 (~40 GB) does NOT fit 32 GB — needs 48 GB+ (dual-3090 or dual-5090).
The call
Buy it if you want one card that runs every modern MoE at Q4 without thinking about quant or context.
Skip it if you can stomach used hardware — a pair of RTX 3090s gives you 48 GB for roughly $1,600 all-in, at the cost of noise, heat, and PCIe splitting.
Watchouts
- Street price still sits well above $1,999 MSRP but AIB allocation has thawed. Floor moved $3,000 (April) → $3,799 (early May) → $3,799–$4,299 (May 22) → $2,910–$4,300 (May 28, ASUS TUF AIBs landing at $2,909.99, Gigabyte/MSI Gaming OC at ~$3,299). DRAM shortage is still squeezing supply industry-wide; IDC pegs the shortage through Q4 2027.
- 575 W TDP needs a 1,000 W+ PSU and honest cooling. Check case airflow and PSU rail headroom before buying.
- 12VHPWR connector — follow Nvidia's re-seat guidance. Melted-cable stories are rare but real at this power draw.
- FE cards run hot under sustained load. AIB partner cards (ASUS, MSI, Gigabyte) with larger heatsinks are worth the $100–$200 premium if you'll keep it loaded for hours.
Local vs cloud at this tier
● LOCAL WINS
Unrestricted context, fully offline, no per-token cost for heavy daily use (>50M tok/mo), and you own the hardware for 3–5 years.
● CLOUD WINS
Frontier reasoning (Claude Opus 4.8, GPT-5.4), first-day access to new models, zero upfront cost, and no electricity or cooling to think about.
At this tier, the break-even vs a $100/mo ChatGPT Pro plan is ~24–30 months at regular usage. Heavy API users break even in under a year.
Next step
Load this setup into the planner→