Gemini 3.5 Flash — Google's new daily-driver API price-shifts the cloud-vs-local math

Google launched Gemini 3.5 Flash GA at I/O 2026 (May 19) at $1.50/$9.00 per 1M tokens with 1M context. Reported to beat Gemini 3.1 Pro on Terminal-Bench 2.1 / MCP Atlas / CharXiv at roughly 60% of Pro's blended cost. Alongside it: a tiered consumer subscription restructure — new $100/mo Google AI Ultra tier (5× Pro limits, Antigravity 2.0 access) and the top Ultra cut from $250 → $200.

Verdict: Beats Gemini 3.1 Pro on Terminal-Bench at $1.50 / $9.00 per 1M — the new daily-driver mid-tier API

The take

The pricing matters for our cost calculator. Gemini 3.5 Flash at $1.50 in / $9.00 out (avg $5.25 per 1M at 50/50 split) sits between Gemini 3.1 Flash-Lite ($0.25/$1.50, avg $0.875) and Gemini 3.1 Pro ($2.00/$12.00, avg $7.00 at ≤200K). For the volume sweet spot most agents and coding workflows hit — mid-quality, long-running, mostly-input — Flash 3.5 is now the cheapest current-generation Gemini that handles 1M context. We've added it to /api/v1/plan and the cost-calculator dropdown.

The Google AI Ultra restructure is the other half. Quote from Google's announcement blog: "We're launching a $100/month AI Ultra plan... We're also reducing the monthly price of our top-tier AI Ultra plan from $250 to $200." The $100 tier sits at the same price point as ChatGPT Pro Mid and Claude Max 5x; the $200 top-Ultra mirrors ChatGPT Pro 200 and Claude Max 20x. Google now has the same three-step subscription ladder ($20 Pro / $100 Ultra / $200 Ultra) as OpenAI and Anthropic — a meaningful consolidation of the cloud-comparison story.

What's bundled: $100/mo Ultra gets you 5× Pro usage limits in the Gemini app, access to Gemini 3.5 Flash, Google Antigravity 2.0 (the new agentic-coding IDE Google launched at I/O 2026), 20 TB Google One storage, and YouTube Premium. The $200 tier scales those limits 20× and adds the top compute pool for Gemini 3.5 Pro when it lands next month. Antigravity 2.0 is a notable inclusion — agentic IDEs are normally a separate paid product.

What this doesn't change: local-AI buyers in the 24-128 GB hardware band still beat cloud subs on heavy-volume usage (>50M tokens/month). What it does change is the break-even math for medium-volume users at the $100-200/mo tier — Google's bundling is now meaningfully more competitive than ChatGPT Pro Mid for users who actually want the storage + YouTube + Antigravity stack rather than pure model access. We've left the existing planner picks unchanged but the cost calculator now surfaces the trade-off cleanly.

Where this fits

Hardware: NVIDIA RTX 5090 · NVIDIA RTX 4090 · Mac Studio M4 Max 64 GB

Sources

Next step

Try this in the planner→