Baidu Unlimited-OCR

Baidu's open-weight OCR / document-understanding VLM, positioned as a step beyond DeepSeek-OCR — high-fidelity text + layout extraction from images and documents. MIT-licensed, with vLLM support (community-added), an arXiv paper (2606.23050), and a live Hugging Face Space demo. Strong early traction (400K+ downloads within days).

License: MIT · Context: Document-page scale · Released: June 22, 2026

The decision in five lines

The call: Skip for local
Best for: Local evaluation and family reference
Runs on: No planner hardware fits at default quant — see model card
Watch out: General chat or reasoning — this is a document/OCR specialist, not a generalist VLM.
Evidence: Editorial · last verified July 2026

Compact OCR vision-language model: PARAMETERS
OCR / DOCUMENT VISION-LANGUAGE MODEL: TYPE
Document-page: CONTEXT
Runs on a single consumer GPU (compact VLM): VRAM AT Q4

Where we recommend this

This model isn’t currently in an active planner slot. See the runner notes below if you’re running it anyway.

The call

Baidu's open-weight OCR / document-understanding VLM, positioned as a step beyond DeepSeek-OCR — high-fidelity text + layout extraction from images and documents. MIT-licensed, with vLLM support (community-added), an arXiv paper (2606.23050), and a live Hugging Face Space demo. Strong early traction (400K+ downloads within days).
When not to use: General chat or reasoning — this is a document/OCR specialist, not a generalist VLM. For broad vision-language work use MiniCPM-V-4.6 or MiniCPM-o 4.5.

Runner notes

vLLM inference supported (community PRs); also on ModelScope. GitHub `baidu/Unlimited-OCR` for the pipeline. MIT — clean for commercial document-processing use.

License: MIT
Released: June 22, 2026
Maker: Baidu
Model card: huggingface.co/baidu/Unlimited-OCR →

Next step

Find-by-model — see what hardware runs this→