MODEL · OPENBMB · 1B (SIGLIP2-400M VISION ENCODER + QWEN3.5-0.8B LLM)
MiniCPM-V-4.6 (1B vision-language)
Tiny vision-language model — single-image, multi-image, and video understanding from a 1B-class checkpoint. Mixed 4×/16× visual-token compression cuts visual-encoding FLOPs by more than 50% versus prior MiniCPM-V releases. Tool/function calling is built in. An Artificial Analysis Intelligence Index of 13 beats the raw Qwen3.5-0.8B (10) at roughly 19× lower token cost. The newest entry in the V (vision-only) branch — parallel to the omnimodal MiniCPM-o line.
License: Apache 2.0 · Context: inherits Qwen3.5-0.8B (128K) · Released: May 15, 2026
The decision in five lines
- The call: Consider — runnable locally, family reference
- Best for: local evaluation and family reference
- Runs on: 23 hardware picks fit (cheapest: Intel Arc B580 12 GB · $249)
- Watch out: you need speech I/O — MiniCPM-V is vision-only by design; reach for MiniCPM-o 2.6 if you want voice in/out too.
- Evidence: estimated

Quick specs
- Parameters: 1B (SigLIP2-400M vision encoder + Qwen3.5-0.8B LLM)
- Type: vision-language
- Context: 128K (inherited from Qwen3.5-0.8B)
- VRAM at Q4: ~1.5–2 GB (BNB/AWQ/GPTQ int4) / ~3–4 GB (FP16)
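The Q4 footprint is simple arithmetic over the parameter count. A back-of-envelope sketch; the per-weight bit widths and the fixed overhead constant are assumptions, not measured figures:

```python
# Back-of-envelope VRAM estimate for a ~1.2B-parameter checkpoint
# (0.4B SigLIP2 encoder + 0.8B Qwen3.5 LLM, per the spec card).
# The 0.7 GB overhead for activations, KV cache, and runtime buffers
# is an assumed constant, not a measured figure.

def footprint_gb(params_b: float, bits_per_weight: float,
                 overhead_gb: float = 0.7) -> float:
    """Weights plus fixed overhead, in GB (1 GB = 1e9 bytes)."""
    weights_gb = params_b * bits_per_weight / 8
    return round(weights_gb + overhead_gb, 1)

total_params_b = 0.4 + 0.8  # vision encoder + LLM

print(footprint_gb(total_params_b, 4.5))  # int4 plus quant scales: ~1.4 GB
print(footprint_gb(total_params_b, 16))   # FP16: ~3.1 GB
```

The int4 estimate lands near the low end of the card's ~1.5–2 GB range; real runtimes add image-embedding buffers that push it higher.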
Where we recommend this
This model isn’t currently in an active planner slot. See the runner notes below if you’re running it anyway.
The call
Tiny vision-language model — single-image, multi-image, and video understanding from a 1B-class checkpoint. Mixed 4×/16× visual-token compression cuts visual-encoding FLOPs by more than 50% versus prior MiniCPM-V releases. Tool/function calling is built in. An Artificial Analysis Intelligence Index of 13 beats the raw Qwen3.5-0.8B (10) at roughly 19× lower token cost. The newest entry in the V (vision-only) branch — parallel to the omnimodal MiniCPM-o line.
When not to use: you need speech I/O — MiniCPM-V is vision-only by design; reach for MiniCPM-o 2.6 if you want voice in/out too. Or you need frontier vision quality on long-document workflows — larger multimodal frontier picks still outperform on dense OCR plus reasoning.
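The >50% figure is consistent with simple token counting. A sketch, assuming a 1,024-patch tile, a uniform-4× baseline, and a mix where one tile in four keeps 4× detail; the tile counts and mix ratio are illustrative, not the model's actual slicing policy:

```python
# Token counting under mixed visual compression. All constants here
# (1024 patches per tile, 4 tiles, 1-in-4 detail tiles) are illustrative.

patches_per_tile = 1024
tiles = 4

tokens_4x = patches_per_tile // 4    # 256 tokens per tile
tokens_16x = patches_per_tile // 16  # 64 tokens per tile

baseline = tiles * tokens_4x              # uniform 4x: 1024 tokens
mixed = 1 * tokens_4x + 3 * tokens_16x    # 1 detail + 3 background: 448

saving = 1 - mixed / baseline             # ~56% fewer visual tokens
print(saving)
```

With more background tiles, the saving grows toward the 75% ceiling of pure 16× compression.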
Runner notes
GGUF (`openbmb/MiniCPM-V-4.6-gguf`, ~0.8B) and BNB/AWQ/GPTQ quants were all published day one. A Thinking variant (`MiniCPM-V-4.6-Thinking`) adds chain-of-thought reasoning over images. llama.cpp and Ollama paths work; Flash Attention 2 is recommended for multi-image and video. Sits beside (not replacing) MiniCPM-o 2.6 — pick V for pure vision, o for omni.
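For the llama.cpp path, a minimal multimodal invocation might look like the following. The quant and mmproj filenames are assumptions (check the `openbmb/MiniCPM-V-4.6-gguf` repo for the published artifact names), and `llama-mtmd-cli` assumes a build recent enough to ship the unified multimodal front end:

```shell
# Filenames below are illustrative, not confirmed artifact names.
# -m loads the quantized LLM weights; --mmproj loads the vision projector.
llama-mtmd-cli \
  -m MiniCPM-V-4.6-Q4_K_M.gguf \
  --mmproj mmproj-MiniCPM-V-4.6.gguf \
  --image invoice.png \
  -p "Extract the total amount due."
```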
Hardware that fits
Every hardware pick whose memory fits this model at the quant we recommend, sorted cheapest-first — the top row is your best-value fit. Each row shows the fit rating, memory headroom multiple, memory, and street price. Click through for the full buyer's guide.
- Intel Arc B580 12 GB · Perfect · 4.7× 12 GB · $249–$299
- NVIDIA RTX 3060 12 GB · Perfect · 4.7× 12 GB · $250–$340
- Minisforum UM890 Pro · Perfect · 9.4× 32 GB DDR5 (shared) · $463–$580 all-in
- RTX 5060 Ti 16 GB · Perfect · 6.3× 16 GB · $560–$610
- AMD Radeon RX 9070 XT · Perfect · 6.3× 16 GB · $649–$849
- AMD Radeon RX 7900 XTX · Perfect · 9.4× 24 GB · $770–$1,400
- Mac Mini M4 16 GB · Perfect · 4.2× 16 GB unified · $799 (new floor) / $499–$599 (eBay/residuals)
- NVIDIA RTX 3090 (used, single) · Perfect · 9.4× 24 GB · $800–$1,000
- NVIDIA RTX 5070 Ti · Perfect · 6.3× 16 GB · $980–$1,300
- NVIDIA RTX 5080 · Perfect · 6.3× 16 GB · $999–$1,250
- MacBook Air M5 24 GB · Perfect · 6.3× 24 GB unified · $1,299–$1,699
- Mac Mini M4 Pro 24 GB · Perfect · 6.3× 24 GB unified · $1,399
- Dual RTX 3090 (used) · Perfect · 18.8× 48 GB · $1,800–$2,500 all-in
- Framework Desktop (Ryzen AI Max+ 395) · Perfect · 33.5× 128 GB unified · $1,999–$2,851
- NVIDIA RTX 4090 · Perfect · 9.4× 24 GB · $2,200–$2,800
- M5 Pro MacBook Pro 48 GB · Perfect · 12.6× 48 GB unified · $2,599–$3,099
- Mac Studio M4 Max 64 GB · Perfect · 16.8× 64 GB unified · $3,199
- NVIDIA RTX A6000 (48 GB, used) · Perfect · 18.8× 48 GB ECC · $3,500–$4,500
- NVIDIA RTX 5090 · Perfect · 12.5× 32 GB · $3,800–$4,100
- Mac Studio M3 Ultra 96 GB · Perfect · 25.2× 96 GB unified · $3,999
- M5 Max MacBook Pro 64 GB · Perfect · 16.8× 64 GB unified · $4,499
- NVIDIA DGX Spark · Perfect · 33.5× 128 GB unified · $4,699
- Dual RTX 5090 · Perfect · 25.0× 64 GB (2×32) · $8,500–$10,500
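The headroom multiples in the list are consistent with a ~2.55 GB on-device footprint, with unified-memory machines counted at roughly two-thirds usable (the OS and GPU share the pool). Both constants are inferred from the table, not published figures; a sketch:

```python
# Headroom = usable memory / model footprint. The 2.55 GB footprint and
# the 2/3 usable share for unified memory are inferred from the table
# above, not published figures; a few rows round slightly differently.

FOOTPRINT_GB = 2.55
UNIFIED_USABLE_SHARE = 2 / 3

def headroom(mem_gb: float, unified: bool = False) -> float:
    usable = mem_gb * UNIFIED_USABLE_SHARE if unified else mem_gb
    return round(usable / FOOTPRINT_GB, 1)

print(headroom(12))                # 4.7x: Arc B580, RTX 3060
print(headroom(24))                # 9.4x: RTX 3090, RX 7900 XTX, RTX 4090
print(headroom(16, unified=True))  # 4.2x: Mac Mini M4 16 GB
print(headroom(24, unified=True))  # 6.3x: Mac Mini M4 Pro 24 GB
```

Anything above 1.0× fits; multiples this large mean the model leaves room for long contexts or other workloads on the same device.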
Next step
Find-by-model — see what hardware runs this →