GUIDE · RUNNERS · JUNE 2026
Use Ollama — unless you want a GUI or a fully copyleft stack.
Three runners compete for the Mac + Linux + Windows local-AI desktop in 2026: Ollama, LM Studio, and Jan. The short answer is usually Ollama; the interesting cases are when it isn’t.
All three support llama.cpp under the hood and converge on the same GGUF model library. The differences that matter are UX, licensing, update cadence, and Apple Silicon performance specifically.
The verdict, by case
- CLI-first workflow, scripting, agents, deployment. Ollama. Nothing else comes close for
ollama run qwen3:30bplus an OpenAI-compatible API atlocalhost:11434. - You want a GUI and you’re on a Mac. LM Studio. Model browser is better, chat UI is polished, MLX auto-selection works out of the box.
- You want a GUI and you need fully open-source. Jan. AGPLv3 license, 41k GitHub stars, active — but the model ecosystem is thinner than LM Studio’s.
- Privacy-critical enterprise deployment. Jan. AGPLv3 compliance is legible; LM Studio’s commercial terms are less clear.
- Windows, no GUI preference. Ollama. Winget-installable, fewest surprises.
Ollama — the default
Ollama (ollama.com) is the default recommendation because it does one thing well: run a model, expose an OpenAI-compatible API, move on. Install in one command, pull a tagged model (ollama pull qwen3:30b), serve. The CLI is legible; the model library is current; quantization selection is automatic.
April 2026 update: Ollama’s MLX backend preview shipped in 0.19 (March 30, 2026) and matured through 0.20–0.21 by late April. On M5 Max machines, this is a real speedup — prefill ~1.6× faster, decode nearly 2× — but it requires 32 GB unified memory minimum. Mac mini M4 16 GB and 24 GB Macs don’t get the benefit yet; llama.cpp backend stays the default there.
Watchout: Ollama ROCm support (AMD) is still patchy through 2026. If you’re on AMD, switch to llama.cpp directly or vLLM — see the AMD ROCm guide.
LM Studio — the Mac GUI pick
LM Studio (lmstudio.ai) is a desktop app first, runner second. For a non-technical user on a Mac, it’s the honest recommendation — the model browser surfaces quantizations cleanly, the chat UI handles multi-turn conversation + context pinning better than Ollama + a separate chat client, and MLX backend selection on Apple Silicon just works.
Watchout: LM Studio is source-available but not strictly open-source; commercial licensing has gotten clearer over 2025 but if your company policy requires AGPL/GPL/Apache, Jan is the safer pick.
Jan — the open-source pick
Jan (jan.ai) is AGPLv3, runs 100% offline by default, supports both llama.cpp and MLX backends. Version 0.7.9 (March 23, 2026) brought 5.3M+ downloads and 41k+ GitHub stars — this is a serious project now, not a hobby.
The reason to pick Jan isn’t feature parity with LM Studio (it’s close but still slightly behind on model browser polish); it’s licensing and privacy posture. AGPLv3 is defensible in legal review; fully on-device by default appeases the IT team. If either of those matters, Jan.
What nobody should use
Raw llama.cpp without a wrapper unless you’re specifically tuning quantization or Flash Attention settings — the three runners above all wrap it better. Text Generation WebUI (oobabooga) — still maintained but the modern alternatives have eaten most of its use cases with less setup friction. Hugging Face Transformers for inference — fine for experimentation, not a desktop runner.
Next step
Use the planner to pick a model for your runner→