the AI bench
VERIFIED JUNE 2026

MODEL · NOMIC · ~137M (MATRYOSHKA)

nomic-embed-text-v1.5

Nomic's Matryoshka Representation Learning embedding — truncate to 64/128/256/512/768 dims at query time for a tiny quality drop, giving drop-in cost/speed control BGE-M3 doesn't offer.

License: Apache 2.0 · Context: 8192 tokens · Released: February 2024

The decision in five lines

The call: Skip for local - for docs
Best for: docs
Runs on: 23 hardware picks fit (cheapest: Intel Arc B580 12 GB · $249)
Watch out: Multilingual (v1.5 is English-tuned; use nomic-embed-text-v2-moe or BGE-M3).
Evidence: Estimated · last verified April 2026

Parameters: ~137M (Matryoshka)
Type: Embedding
Context: 8192
VRAM at Q4: <1 GB

Where we recommend this

Every tier slot in the planner where this model is a top or alternate pick. Pulled live from planner.js — when the planner refreshes, this table stays current.

DOCS · LOW
nomic-embed-text-v1.5 (retrieval): Lightweight English embedding; fast on CPU; pairs with any small generator.

The call

Nomic's Matryoshka Representation Learning embedding — truncate to 64/128/256/512/768 dims at query time for a tiny quality drop, giving drop-in cost/speed control BGE-M3 doesn't offer.
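
As a rough illustration of the truncation described above, the sketch below keeps the first N components of a full 768-dim vector and rescales the result to unit length so dot products still behave like cosine similarity. The helper names (`truncate_embedding`, `cosine`) are hypothetical, and this is plain slicing plus L2 normalization only; Nomic's own usage notes may call for an additional normalization step before slicing, so check the model card before relying on it. Stored document vectors and query vectors need to be cut to the same size.

```python
import math

# Truncation points listed for nomic-embed-text-v1.5 on this page.
MATRYOSHKA_DIMS = (64, 128, 256, 512, 768)


def truncate_embedding(vec, dim):
    """Keep the first `dim` components of `vec` and L2-normalize them.

    Hypothetical helper: illustrates the slicing idea only.
    """
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}")
    if len(vec) < dim:
        raise ValueError("vector is shorter than the requested dim")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale for an all-zero vector
    return [x / norm for x in head]


def cosine(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))
```

Smaller sizes shrink the index and speed up similarity search roughly in proportion to the dimension, which is the cost/speed lever the page refers to.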

When not to use: Multilingual (v1.5 is English-tuned — use nomic-embed-text-v2-moe or BGE-M3). Also: skipping the required `search_document:` / `search_query:` prefixes silently tanks retrieval quality.
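
One way to make the prefixes hard to forget is to route every string through a small wrapper before it reaches the embedder. The sketch below is a minimal version of that idea; the function names are hypothetical, and only the two prefix strings come from this page.

```python
# Task prefixes named on this page for nomic-embed-text-v1.5.
DOC_PREFIX = "search_document:"
QUERY_PREFIX = "search_query:"


def _with_prefix(prefix, text):
    """Prepend `prefix` unless the text already starts with it."""
    text = text.strip()
    if text.startswith(prefix):
        return text
    return f"{prefix} {text}"


def as_document(text):
    """Format a passage for indexing."""
    return _with_prefix(DOC_PREFIX, text)


def as_query(text):
    """Format a user question for retrieval."""
    return _with_prefix(QUERY_PREFIX, text)
```

Documents go through `as_document` at index time and user questions through `as_query` at search time; mixing the two up is as harmful as omitting them.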

Runner notes

Ollama tag `nomic-embed-text` (defaults to v1.5). llama.cpp context defaults to 2048 — bump to 8192 manually for long docs. Prefixes are non-negotiable.
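
For the Ollama route, a request might look like the sketch below. It assumes a local Ollama server on its default port, the `/api/embed` endpoint with an `input` list, and that the `num_ctx` option is the knob that raises the context window; treat all three as assumptions to verify against the version you run. `build_embed_request` and `embed` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/embed"


def build_embed_request(texts, model="nomic-embed-text", num_ctx=8192):
    """Build the JSON body: already-prefixed texts plus a raised context window."""
    payload = {
        "model": model,
        "input": list(texts),
        "options": {"num_ctx": num_ctx},  # assumption: overrides the 2048 default
    }
    return json.dumps(payload).encode("utf-8")


def embed(texts, url=OLLAMA_URL):
    """POST the texts to the server and return one vector per input text."""
    req = urllib.request.Request(
        url,
        data=build_embed_request(texts),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["embeddings"]
```

Inputs should already carry their `search_document:` or `search_query:` prefix before they are passed to `embed`.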

License: Apache 2.0 · Released: February 2024 · Maker: Nomic

Hardware that fits

Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first, so the top row is the best-value fit.
