MODEL · IBM RESEARCH · 3B / 8B / 30B DENSE (INSTRUCT + BASE EACH)
IBM Granite 4.1
IBM's refreshed open-weights enterprise family: three dense decoder-only sizes under Apache 2.0, trained on ~15T tokens with progressive annealing toward technical, scientific, and mathematical data plus instruction-following. IBM claims the 8B instruct matches the prior Granite 4.0 32B-A9B MoE flagship on IBM's own benchmarks; cross-vendor comparisons (against Qwen, Gemma, Mistral) were unverified at time of publication.
License: Apache 2.0 · Context: 128K default; 512K via late-training context-extension stage · Released: April 29, 2026
- PARAMETERS: 3B
- TYPE: DENSE
- CONTEXT: 128K
- VRAM AT Q4: ~2.1 GB (3B) / ~5.3 GB (8B) / ~17 GB (30B), Ollama Q4_K_M
Where we recommend this
This model isn’t currently in an active planner slot. See the runner notes below if you’re running it anyway.
The call
When not to use: Frontier reasoning or coding-agent workflows where Qwen3-Coder-30B-A3B or GLM-5.1 are the established daily drivers. Granite 4.1 is positioned as enterprise-deploy-friendly (clean Apache 2.0, IBM support story, traceable training data) rather than benchmark-chasing.
Runner notes
Ollama tags went live on release day: `granite4.1:3b`, `granite4.1:8b`, `granite4.1:30b`. Default tags ship at 128K context; 512K requires the extended-context training-stage variants from `huggingface.co/ibm-granite`. Q4_K_M is the Ollama default; Q8_0 is available via `:8b-q8_0` and the equivalent tags for the other sizes. Hugging Face hosting is at `ibm-granite/granite-4.1-*-instruct`.
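As a rough sketch of how these tags map to a local call, the Python below builds a request for Ollama's `/api/generate` endpoint. It assumes an Ollama server is running on the default `localhost:11434` and that the tag has already been pulled. The helper names `build_request` and `generate` are illustrative, and `num_ctx` is left as an optional override rather than a recommended value.

```python
import json
import urllib.request

# Assumed default endpoint of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Tags as listed in the runner notes.
TAGS = {"3b": "granite4.1:3b", "8b": "granite4.1:8b", "30b": "granite4.1:30b"}


def build_request(size, prompt, num_ctx=None):
    """Build the JSON payload for a non-streaming generate call."""
    if size not in TAGS:
        raise ValueError(f"unknown size {size!r}; expected one of {sorted(TAGS)}")
    payload = {"model": TAGS[size], "prompt": prompt, "stream": False}
    if num_ctx is not None:
        # Optional context-window override, in tokens.
        payload["options"] = {"num_ctx": num_ctx}
    return payload


def generate(size, prompt, num_ctx=None):
    """Send the request to the Ollama server and return the reply text."""
    data = json.dumps(build_request(size, prompt, num_ctx)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Example (requires a running server):
# print(generate("8b", "Summarize the Apache 2.0 license in one sentence."))
```

Larger `num_ctx` values increase memory use beyond the weight sizes quoted for Q4_K_M, so it is worth raising the context window only as far as the workload needs.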
Hardware that fits
Every hardware pick whose memory fits this model at the quant we recommend, sorted cheapest-first, so the top row is your best-value fit.
- Intel Arc B580 12 GB · Perfect · 3.2× 12 GB · $249–$299
- NVIDIA RTX 3060 12 GB · Perfect · 3.2× 12 GB · $250–$340
- Minisforum UM890 Pro · Perfect · 6.4× 32 GB DDR5 (shared) · $463–$580 all-in
- Mac Mini M4 16 GB · Perfect · 2.9× 16 GB unified · $499–$599
- RTX 5060 Ti 16 GB · Perfect · 4.3× 16 GB · $550
- AMD Radeon RX 9070 XT · Perfect · 4.3× 16 GB · $649–$849
- AMD Radeon RX 7900 XTX · Perfect · 6.4× 24 GB · $770–$1,400
- NVIDIA RTX 3090 (used, single) · Perfect · 6.4× 24 GB · $800–$1,000
- NVIDIA RTX 5070 Ti · Perfect · 4.3× 16 GB · $870–$1,200
- NVIDIA RTX 5080 · Perfect · 4.3× 16 GB · $999–$1,250
- NVIDIA RTX 4090 · Perfect · 6.4× 24 GB · $1,000–$2,500
- MacBook Air M5 24 GB · Perfect · 4.3× 24 GB unified · $1,299–$1,699
- Mac Mini M4 Pro 24 GB · Perfect · 4.3× 24 GB unified · $1,399
- Dual RTX 3090 (used) · Perfect · 12.9× 48 GB · $1,800–$2,500 all-in
- Framework Desktop (Ryzen AI Max+ 395) · Perfect · 23.0× 128 GB unified · $1,999–$2,851
- M5 Pro MacBook Pro 48 GB · Perfect · 8.6× 48 GB unified · $2,599–$3,099
- NVIDIA RTX 5090 · Perfect · 8.6× 32 GB · $3,000–$3,900
- Mac Studio M4 Max 64 GB · Perfect · 11.5× 64 GB unified · $3,199
- NVIDIA RTX A6000 (48 GB, used) · Perfect · 12.9× 48 GB ECC · $3,500–$4,500
- Mac Studio M3 Ultra 96 GB · Perfect · 17.3× 96 GB unified · $3,999
- M5 Max MacBook Pro 64 GB · Perfect · 11.5× 64 GB unified · $4,499
- NVIDIA DGX Spark · Perfect · 23.0× 128 GB unified · $4,699
- Dual RTX 5090 · Perfect · 17.2× 64 GB (2×32) · $8,500–$10,500
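A memory-fit check of the kind behind this list can be approximated as below. This is only a sketch: the weight sizes are the Q4_K_M figures quoted on this page, while the `overhead_gb` allowance for KV cache and runtime buffers and the function name `largest_fit` are assumptions, not the method actually used to build the list.

```python
# Approximate Q4_K_M weight sizes in GB, as quoted on this page.
Q4_SIZE_GB = {"3b": 2.1, "8b": 5.3, "30b": 17.0}


def largest_fit(memory_gb, overhead_gb=2.0):
    """Return the largest model size whose weights plus overhead fit, or None.

    overhead_gb is an assumed allowance for KV cache and runtime buffers;
    the real figure grows with context length.
    """
    best = None
    # Walk sizes from smallest to largest, keeping the last one that fits.
    for size, weights_gb in sorted(Q4_SIZE_GB.items(), key=lambda kv: kv[1]):
        if weights_gb + overhead_gb <= memory_gb:
            best = size
    return best
```

Under these assumptions, `largest_fit(12)` selects the 8B and `largest_fit(24)` selects the 30B; a longer context window would call for a larger `overhead_gb`.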