MiniCPM-o 4.5

OpenBMB's current omnimodal flagship, superseding MiniCPM-o 2.6 — vision, speech-in, speech-out in one ~9B model now built on the Qwen3-8B backbone. Adds full-duplex live streaming (input and output don't block each other) and proactive interaction, and OpenBMB reports it matching Gemini 2.5 Flash on vision/speech. Apache 2.0.

License: Apache 2.0 · Context: Inherits Qwen3-8B base · Released: February 3, 2026

The decision in five lines

The call: Skip for local — for voice
Best for: voice
Runs on: 23 hardware picks fit (cheapest: Intel Arc B580 12 GB · $249)
Watch out: For pure TTS or pure STT, the full multimodal stack is overkill.
Evidence: Estimated · last verified July 2026

Parameters: ~9B (built on Qwen3-8B + vision …)
Type: Multimodal omni
Context: Inherits Qwen3-8B base
VRAM: ~9–10 GB (int4) / ~18 GB (FP16)

Where we recommend this

Every tier slot in the planner where this model is a top or alternate pick. Pulled live from planner.js — when the planner refreshes, this table stays current.

VOICE · LOW

MiniCPM-o 4.5 (int4): Apache 2.0; ~9B omnimodal on a Qwen3-8B backbone at ~9–10 GB VRAM (int4); full-duplex live streaming voice + vision on laptop GPUs. Supersedes the 2.6 generation.

The call

When not to use: for pure TTS or pure STT, the full multimodal stack is overkill; use Kokoro (TTS) or faster-whisper (STT) for single-purpose pipelines. For vision-only work, the lighter `MiniCPM-V-4.6` (1B) may be enough.
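
A minimal sketch of that routing rule, assuming a hypothetical `pick_model` helper that takes the set of capabilities a pipeline needs. The capability labels (`"tts"`, `"stt"`, `"vision"`) are illustrative only; the model names are the ones mentioned on this page.

```python
def pick_model(needs: set[str]) -> str:
    """Suggest a model for a set of required capabilities.

    The capability labels are hypothetical names used only in this sketch.
    """
    if not needs:
        raise ValueError("at least one capability is required")
    if needs == {"tts"}:
        return "Kokoro"          # pure text-to-speech
    if needs == {"stt"}:
        return "faster-whisper"  # pure speech-to-text
    if needs == {"vision"}:
        return "MiniCPM-V-4.6"   # vision-only; the lighter model may be enough
    return "MiniCPM-o 4.5"       # any mix of modalities


print(pick_model({"stt"}))
print(pick_model({"stt", "tts", "vision"}))
```

Anything that combines modalities falls through to the omni model; only the three single-purpose cases are routed elsewhere.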

Runner notes

Use llama.cpp for CPU inference, vLLM for throughput. The int4 build fits in ~9–10 GB of VRAM. Built on Qwen3-8B (the 2.6 generation used Qwen2.5-7B). The sibling `MiniCPM-V-4.6` (April 2026, vision-only, 1B) is the lighter option when you don't need voice I/O. The prior `MiniCPM-o-2_6` (Jan 2025) is still on Hugging Face if you need the older stack.
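
For a rough sanity check on memory, weights-only size is parameter count times bytes per weight. The sketch below assumes a round 9B parameters and a hypothetical `weight_gb` helper; it ignores KV cache, activations, and runtime overhead. The FP16 result lines up with the ~18 GB figure on this page. The int4 weights-only result is well below the ~9–10 GB quoted here, so that figure presumably covers more than raw 4-bit weights; this page does not break it down.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Weights-only memory in GB (10^9 bytes).

    Ignores KV cache, activations, and runtime overhead.
    """
    return params_billion * bits_per_weight / 8


print(weight_gb(9, 16))  # FP16 -> 18.0
print(weight_gb(9, 4))   # int4, weights only -> 4.5
```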

License: Apache 2.0
Released: February 3, 2026
Maker: OpenBMB
Model card: huggingface.co/openbmb/MiniCPM-o-4_5

Hardware that fits

Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first — the top row is your best-value fit. Click through for the full buyer’s guide.
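
The fit-and-sort rule described above could look roughly like this. It is a sketch, not the planner's actual logic: the `HardwarePick` type, the 10 GB threshold (the upper end of the int4 estimate), and the two placeholder cards are assumptions. Only the Intel Arc B580 entry comes from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwarePick:
    name: str
    memory_gb: int
    price_usd: int


def picks_that_fit(picks, required_gb):
    """Keep picks with enough memory, sorted cheapest-first."""
    fitting = [p for p in picks if p.memory_gb >= required_gb]
    return sorted(fitting, key=lambda p: p.price_usd)


catalog = [
    HardwarePick("Placeholder card B (hypothetical)", 24, 900),
    HardwarePick("Intel Arc B580", 12, 249),  # from this page
    HardwarePick("Placeholder card A (hypothetical)", 8, 199),
]

for pick in picks_that_fit(catalog, required_gb=10):
    print(f"{pick.name}: {pick.memory_gb} GB, ${pick.price_usd}")
```

A real planner would likely add headroom for context length and the OS, so a card that barely clears the threshold here may still be a tight fit in practice.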

Next step

Find-by-model — see what hardware runs this→