MODEL · OPENBMB · 1B (SIGLIP2-400M VISION ENCODER + QWEN3.5-0.8B LLM)
MiniCPM-V-4.6 (1B vision-language)
Tiny vision-language model — single-image, multi-image, and video understanding from a 1B-class checkpoint. Mixed 4×/16× visual-token compression cuts visual-encoding FLOPs by more than 50% versus prior MiniCPM-V releases. Tool/function calling is built in. An Artificial Analysis Intelligence Index of 13 beats the raw Qwen3.5-0.8B (10) at roughly 19× lower token cost. The newest entry in the V (vision-only) branch — parallel to the omnimodal MiniCPM-o line.
License: Apache 2.0 · Context: inherits Qwen3.5-0.8B (128K) · Released: May 15, 2026
The decision in five lines
- The call: Consider — runnable locally, family reference
- Best for: local evaluation and family reference
- Runs on: 23 hardware picks fit (cheapest: Intel Arc B580 12 GB · $249)
- Watch out: you need speech I/O — MiniCPM-V is vision-only by design; reach for MiniCPM-o 2.6 if you want voice in/out too.
- Evidence: estimated

Quick specs
- Parameters: 1B (SigLIP2-400M vision encoder + Qwen3.5-0.8B LLM)
- Type: vision-language
- Context: 128K (inherited from Qwen3.5-0.8B)
- VRAM at Q4: ~1.5–2 GB (BNB/AWQ/GPTQ int4) / ~3–4 GB (FP16)
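The Q4 footprint is simple arithmetic over the parameter count. A back-of-envelope sketch; the per-weight bit widths and the fixed overhead constant are assumptions, not measured figures:

```python
# Back-of-envelope VRAM estimate for a ~1.2B-parameter checkpoint
# (0.4B SigLIP2 encoder + 0.8B Qwen3.5 LLM, per the spec card).
# The 0.7 GB overhead for activations, KV cache, and runtime buffers
# is an assumed constant, not a measured figure.

def footprint_gb(params_b: float, bits_per_weight: float,
                 overhead_gb: float = 0.7) -> float:
    """Weights plus fixed overhead, in GB (1 GB = 1e9 bytes)."""
    weights_gb = params_b * bits_per_weight / 8
    return round(weights_gb + overhead_gb, 1)

total_params_b = 0.4 + 0.8  # vision encoder + LLM

print(footprint_gb(total_params_b, 4.5))  # int4 plus quant scales: ~1.4 GB
print(footprint_gb(total_params_b, 16))   # FP16: ~3.1 GB
```

The int4 estimate lands near the low end of the card's ~1.5–2 GB range; real runtimes add image-embedding buffers that push it higher.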
Where we recommend this
This model isn’t currently in an active planner slot. See the runner notes below if you’re running it anyway.
The call
Tiny vision-language model — single-image, multi-image, and video understanding from a 1B-class checkpoint. Mixed 4×/16× visual-token compression cuts visual-encoding FLOPs by more than 50% versus prior MiniCPM-V releases. Tool/function calling is built in. An Artificial Analysis Intelligence Index of 13 beats the raw Qwen3.5-0.8B (10) at roughly 19× lower token cost. The newest entry in the V (vision-only) branch — parallel to the omnimodal MiniCPM-o line.
When not to use: you need speech I/O — MiniCPM-V is vision-only by design; reach for MiniCPM-o 2.6 if you want voice in/out too. Or you need frontier vision quality on long-document workflows — larger multimodal frontier picks still outperform on dense OCR plus reasoning.
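The >50% figure is consistent with simple token counting. A sketch, assuming a 1,024-patch tile, a uniform-4× baseline, and a mix where one tile in four keeps 4× detail; the tile counts and mix ratio are illustrative, not the model's actual slicing policy:

```python
# Token counting under mixed visual compression. All constants here
# (1024 patches per tile, 4 tiles, 1-in-4 detail tiles) are illustrative.

patches_per_tile = 1024
tiles = 4

tokens_4x = patches_per_tile // 4    # 256 tokens per tile
tokens_16x = patches_per_tile // 16  # 64 tokens per tile

baseline = tiles * tokens_4x              # uniform 4x: 1024 tokens
mixed = 1 * tokens_4x + 3 * tokens_16x    # 1 detail + 3 background: 448

saving = 1 - mixed / baseline             # ~56% fewer visual tokens
print(saving)
```

With more background tiles, the saving grows toward the 75% ceiling of pure 16× compression.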
Runner notes
GGUF (`openbmb/MiniCPM-V-4.6-gguf`, ~0.8B) and BNB/AWQ/GPTQ quants were all published day one. A Thinking variant (`MiniCPM-V-4.6-Thinking`) adds chain-of-thought reasoning over images. llama.cpp and Ollama paths work; Flash Attention 2 is recommended for multi-image and video. Sits beside (not replacing) MiniCPM-o 2.6 — pick V for pure vision, o for omni.
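For the llama.cpp path, a minimal multimodal invocation might look like the following. The quant and mmproj filenames are assumptions (check the `openbmb/MiniCPM-V-4.6-gguf` repo for the published artifact names), and `llama-mtmd-cli` assumes a build recent enough to ship the unified multimodal front end:

```shell
# Filenames below are illustrative, not confirmed artifact names.
# -m loads the quantized LLM weights; --mmproj loads the vision projector.
llama-mtmd-cli \
  -m MiniCPM-V-4.6-Q4_K_M.gguf \
  --mmproj mmproj-MiniCPM-V-4.6.gguf \
  --image invoice.png \
  -p "Extract the total amount due."
```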
Hardware that fits
Every hardware pick whose memory fits this model at the quant we recommend, sorted cheapest-first — the top row is your best-value fit. Each row shows the fit rating, memory headroom multiple, memory, and street price. Click through for the full buyer's guide.
- Intel Arc B580 12 GB · Perfect · 4.7× 12 GB · $249–$299
- NVIDIA RTX 3060 12 GB · Perfect · 4.7× 12 GB · $250–$340
- Minisforum UM890 Pro · Perfect · 9.4× 32 GB DDR5 (shared) · $463–$580 all-in
- RTX 5060 Ti 16 GB · Perfect · 6.3× 16 GB · $560–$610
- AMD Radeon RX 9070 XT · Perfect · 6.3× 16 GB · $649–$849
- AMD Radeon RX 7900 XTX · Perfect · 9.4× 24 GB · $770–$1,400
- Mac Mini M4 16 GB · Perfect · 4.2× 16 GB unified · $799 (new floor) / $499–$599 (eBay/residuals)
- NVIDIA RTX 3090 (used, single) · Perfect · 9.4× 24 GB · $800–$1,000
- NVIDIA RTX 5070 Ti · Perfect · 6.3× 16 GB · $980–$1,300
- NVIDIA RTX 5080 · Perfect · 6.3× 16 GB · $999–$1,250
- MacBook Air M5 24 GB · Perfect · 6.3× 24 GB unified · $1,299–$1,699
- Mac Mini M4 Pro 24 GB · Perfect · 6.3× 24 GB unified · $1,399
- Dual RTX 3090 (used) · Perfect · 18.8× 48 GB · $1,800–$2,500 all-in
- Framework Desktop (Ryzen AI Max+ 395) · Perfect · 33.5× 128 GB unified · $1,999–$2,851
- NVIDIA RTX 4090 · Perfect · 9.4× 24 GB · $2,200–$2,800
- M5 Pro MacBook Pro 48 GB · Perfect · 12.6× 48 GB unified · $2,599–$3,099
- Mac Studio M4 Max 64 GB · Perfect · 16.8× 64 GB unified · $3,199
- NVIDIA RTX A6000 (48 GB, used) · Perfect · 18.8× 48 GB ECC · $3,500–$4,500
- NVIDIA RTX 5090 · Perfect · 12.5× 32 GB · $3,800–$4,100
- Mac Studio M3 Ultra 96 GB · Perfect · 25.2× 96 GB unified · $3,999
- M5 Max MacBook Pro 64 GB · Perfect · 16.8× 64 GB unified · $4,499
- NVIDIA DGX Spark · Perfect · 33.5× 128 GB unified · $4,699
- Dual RTX 5090 · Perfect · 25.0× 64 GB (2×32) · $8,500–$10,500
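The headroom multiples in the list are consistent with a ~2.55 GB on-device footprint, with unified-memory machines counted at roughly two-thirds usable (the OS and GPU share the pool). Both constants are inferred from the table, not published figures; a sketch:

```python
# Headroom = usable memory / model footprint. The 2.55 GB footprint and
# the 2/3 usable share for unified memory are inferred from the table
# above, not published figures; a few rows round slightly differently.

FOOTPRINT_GB = 2.55
UNIFIED_USABLE_SHARE = 2 / 3

def headroom(mem_gb: float, unified: bool = False) -> float:
    usable = mem_gb * UNIFIED_USABLE_SHARE if unified else mem_gb
    return round(usable / FOOTPRINT_GB, 1)

print(headroom(12))                # 4.7x: Arc B580, RTX 3060
print(headroom(24))                # 9.4x: RTX 3090, RX 7900 XTX, RTX 4090
print(headroom(16, unified=True))  # 4.2x: Mac Mini M4 16 GB
print(headroom(24, unified=True))  # 6.3x: Mac Mini M4 Pro 24 GB
```

Anything above 1.0× fits; multiples this large mean the model leaves room for long contexts or other workloads on the same device.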
Next step
Find-by-model — see what hardware runs this →