the AI bench
VERIFIED JUNE 2026

MODEL · GOOGLE · 31B DENSE / 26B TOTAL + 3.8B ACTIVE (MOE)

Gemma 4 (31B dense + 26B A4B MoE)

Google's April 2026 refresh: Arena top 5 in its first week, native 256K context, vision + audio multimodal. Big news: Gemma 4 moved from the custom Gemma Terms to Apache 2.0. It is the current "best dense under 70B" pick among Apache-2.0 models.

License: Apache 2.0 (moved off Gemma Terms) · Context: 256K · Released: April 2, 2026

The decision in five lines

The call: Buy (for chat)
Best for: chat · docs
Runs on: 16 hardware picks fit (cheapest: Minisforum UM890 Pro · $463)
Watch out: Tight VRAM budgets under 16 GB. Even the Gemma 4 26B MoE wants 15 GB at Q4.
Evidence: Estimated · last verified April 2026

Parameters: 31B dense
Type: Dense + MoE
Context: 256K
VRAM at Q4: ~18 GB (31B dense) / ~15 GB (26B MoE)

Where we recommend this

Every tier slot in the planner where this model is a top or alternate pick. Pulled live from planner.js — when the planner refreshes, this table stays current.

CHAT · TOP
Gemma 4 31B Dense: Google's April 2, 2026 release; Arena top 5, 256K context, native vision + audio; Apache 2.0.
CHAT · HIGH
Gemma 4 26B MoE (3.8B active): Open Arena top 10 at 3.8B active compute; calm and fast.
DOCS · TOP
Gemma 4 31B (256K context): 256K context with vision + audio; calmer long-context behaviour than the 35B-A3B MoE on dense retrieval prompts.
DOCS · HIGH
Gemma 4 31B (256K context): 31B dense with 256K context; Apache 2.0 (moved off Gemma Terms); Arena top 5.

The call

When not to use: Tight VRAM budgets under 16 GB — even Gemma 4 26B MoE wants 15 GB at Q4. For those budgets, Qwen 3.5 9B fits better.
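
The VRAM figures quoted here line up with a simple back-of-envelope estimate. The sketch below assumes roughly 4.5 bits per weight (the effective rate of llama.cpp's Q4_0 format) and counts weights only; the 1 GB headroom for KV cache and runtime buffers is a placeholder assumption, and the function names are illustrative rather than anything the planner uses. Note that a MoE model needs all 26B parameters resident in memory even though only 3.8B are active per token.

```python
def q4_weight_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate weight memory in GB (10^9 bytes) for a quantized model.

    bits_per_weight=4.5 is an assumption matching Q4_0-style quantization;
    other Q4 variants use slightly more.
    """
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, vram_gb: float, headroom_gb: float = 1.0) -> bool:
    """True if the weights plus an assumed headroom fit in the VRAM budget."""
    return q4_weight_gb(params_billion) + headroom_gb <= vram_gb


# Parameter counts taken from this page; MoE uses total, not active, parameters.
models = {"Gemma 4 31B dense": 31.0, "Gemma 4 26B MoE": 26.0}

for name, params in models.items():
    print(f"{name}: ~{q4_weight_gb(params):.1f} GB of weights at Q4, "
          f"fits in 16 GB: {fits(params, 16.0)}")
```

Under these assumptions the 26B MoE only just clears a 16 GB card, and long contexts will eat into the remaining headroom quickly, which is why smaller budgets are steered to a smaller model.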

Runner notes

Ollama tags `gemma4:31b` and `gemma4:26b`. Ollama may lag on the audio modality path — use llama.cpp head for full multimodal. MoE routing overhead can hurt vLLM concurrency vs dense equivalents under heavy batching.
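
A minimal sketch of calling one of these tags through Ollama's local HTTP API, assuming a default install listening on localhost:11434 and that the tag has already been pulled. The helper names `build_generate_request` and `generate` are illustrative, not part of any library, and only text prompts are shown.

```python
import json
import urllib.request

# Assumed default endpoint of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming /api/generate request for the given model tag."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 300.0) -> str:
    """Send the request and return the generated text.

    Requires a running Ollama server; raises URLError if none is reachable.
    """
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


# Example usage (commented out because it needs a live server):
# print(generate("gemma4:26b", "Summarise the Apache 2.0 license in one sentence."))
```

Swapping the tag between `gemma4:31b` and `gemma4:26b` is the only change needed to compare the dense and MoE variants on the same prompt.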

License: Apache 2.0 (moved off Gemma Terms)
Released: April 2, 2026
Maker: Google

Hardware that fits

Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first — the top row is your best-value fit. Click through for the full buyer’s guide.
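
The filter-then-sort rule described above can be sketched in a few lines. This is an illustration only: the `HardwarePick` type, the `picks_that_fit` helper, and every catalogue entry are hypothetical placeholders, not the site's real data or planner code. The 18 GB requirement is the page's Q4 figure for the 31B dense model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwarePick:
    name: str
    memory_gb: float
    price_usd: float


def picks_that_fit(picks, required_gb):
    """Keep picks with enough memory for the model, sorted cheapest first."""
    fitting = [p for p in picks if p.memory_gb >= required_gb]
    return sorted(fitting, key=lambda p: p.price_usd)


# Hypothetical entries for illustration; not real products or prices.
catalogue = [
    HardwarePick("example-box-a", memory_gb=32, price_usd=900),
    HardwarePick("example-box-b", memory_gb=12, price_usd=300),
    HardwarePick("example-box-c", memory_gb=24, price_usd=600),
]

best_first = picks_that_fit(catalogue, required_gb=18)
for pick in best_first:
    print(f"{pick.name}: {pick.memory_gb} GB, ${pick.price_usd}")
```

The first element of the result corresponds to the "top row" best-value fit; the cheapest box overall is excluded because it lacks the memory.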
