the AI bench
VERIFIED JUNE 2026
All models

MODEL · DEEPSEEK · 1.6T TOTAL / 49B ACTIVE (MOE)

DeepSeek V4-Pro

DeepSeek's frontier-class V4 flagship — 1.6T MoE that matches GPT-5.4 and Sonnet 4.6 on most benchmarks at meaningfully lower hosted price. The 1M-context default uses ~27% of V3.2's single-token FLOPs and ~10% of its KV cache thanks to architecture changes. MIT-licensed, but not realistically a local pick at this size.

License: MIT · Context: 1M tokens (384K max output) · Released: April 24, 2026 (preview)

The decision in five lines

The call
Hosted only
Best for
Hosted reference and benchmarks
Runs on
Hosted or workstation-class only · ~800 GB+ (not consumer-local; hosted only realistic)
Watch out
Use the DeepSeek API or a hosted route — Together, OpenRouter, or DeepSeek's own endpoints — and let the smaller V4-Flash sibling fill multi-GPU local roles.
Evidence
Estimated · last verified April 2026

1.6T total
PARAMETERS
MOE
TYPE
1M
CONTEXT
~800 GB+ (not consumer-local; hosted only realistic)
VRAM AT Q4

Where we recommend this

This model isn’t currently in an active planner slot. See the runner notes below if you’re running it anyway.

The call

DeepSeek's frontier-class V4 flagship — 1.6T MoE that matches GPT-5.4 and Sonnet 4.6 on most benchmarks at meaningfully lower hosted price. The 1M-context default uses ~27% of V3.2's single-token FLOPs and ~10% of its KV cache thanks to architecture changes. MIT-licensed, but not realistically a local pick at this size.

When not to use: Local hardware budgets under 8× H100 (~$200K). At Q4 the weights alone exceed 800 GB. Use the DeepSeek API or a hosted route — Together, OpenRouter, or DeepSeek's own endpoints — and let the smaller V4-Flash sibling fill multi-GPU local roles.

Runner notes

Hosted via DeepSeek API at materially lower prices than GPT-5.4 / Sonnet 4.6. Unsloth dynamic GGUFs exist (`unsloth/DeepSeek-V4-Pro`) but the practical home is API. 1M context is the default across all official services.

License
MIT
Released
April 24, 2026 (preview)
Maker
DeepSeek

Next step

Find-by-model — see what hardware runs this