Use Ollama — unless you want a GUI or a fully copyleft stack.

Three runners compete for the Mac + Linux + Windows local-AI desktop in 2026: Ollama, LM Studio, and Jan. The short answer is usually Ollama; the interesting cases are when it isn’t.

All three support llama.cpp under the hood and converge on the same GGUF model library. The differences that matter are UX, licensing, update cadence, and Apple Silicon performance specifically.

The verdict, by case

CLI-first workflow, scripting, agents, deployment. Ollama. Nothing else comes close for ollama run qwen3:30b plus an OpenAI-compatible API at localhost:11434.
You want a GUI and you’re on a Mac. LM Studio. Model browser is better, chat UI is polished, MLX auto-selection works out of the box.
You want a GUI and you need fully open-source. Jan. Apache-2.0 license, 43k GitHub stars, active — but the model ecosystem is thinner than LM Studio’s.
Privacy-critical enterprise deployment. Jan. Apache-2.0 is about as friction-free as licensing gets; LM Studio’s commercial terms are less clear.
Windows, no GUI preference. Ollama. Winget-installable, fewest surprises.

Ollama — the default

Ollama (ollama.com) is the default recommendation because it does one thing well: run a model, expose an OpenAI-compatible API, move on. Install in one command, pull a tagged model (ollama pull qwen3:30b), serve. The CLI is legible; the model library is current; quantization selection is automatic.

July 2026 update: Ollama’s MLX backend preview shipped in 0.19 (March 30, 2026); the current release is 0.32.1 (July 16, 2026), and 0.30 added llama.cpp alongside the MLX engine to widen hardware support. On M5 Max machines MLX is a real speedup — prefill ~1.6× faster, decode nearly 2× — but Ollama’s own guidance still asks for “more than 32GB of unified memory”. Below that floor it quietly falls back to llama.cpp/Metal — no error, no warning — so Mac mini M4 16 GB and 24 GB Macs don’t get the benefit and won’t be told. Worth knowing before you benchmark and wonder why the numbers look ordinary.

Watchout: Ollama ROCm support (AMD) is still patchy through 2026. If you’re on AMD, switch to llama.cpp directly or vLLM — see the AMD ROCm guide.

LM Studio — the Mac GUI pick

LM Studio (lmstudio.ai) is a desktop app first, runner second. For a non-technical user on a Mac, it’s the honest recommendation — the model browser surfaces quantizations cleanly, the chat UI handles multi-turn conversation + context pinning better than Ollama + a separate chat client, and MLX backend selection on Apple Silicon just works.

Watchout: LM Studio is source-available but not strictly open-source; commercial licensing has gotten clearer over 2025 but if your company policy requires a recognised open-source licence, Jan is the safer pick — it is Apache-2.0.

Jan — the open-source pick

Jan (jan.ai) is Apache-2.0, runs 100% offline by default, supports both llama.cpp and MLX backends. Version 0.8.3 (June 24, 2026), 43k+ GitHub stars, still shipping — this is a serious project now, not a hobby.

Correction (July 2026): earlier versions of this guide called Jan AGPLv3. That was wrong, and wrong in a way that inverted the advice — Jan relicensed from AGPL to Apache-2.0 in May 2025, before this guide was first published. If you skipped Jan because your legal team blocks copyleft, that reason never applied. Apologies; the licence above is now verified against the repository’s LICENSE file.

The reason to pick Jan isn’t feature parity with LM Studio (it’s close but still slightly behind on model browser polish); it’s licensing and privacy posture. Apache-2.0 clears legal review almost anywhere; fully on-device by default appeases the IT team. If either of those matters, Jan.

What nobody should use

Raw llama.cpp without a wrapper unless you’re specifically tuning quantization or Flash Attention settings — the three runners above all wrap it better. Text Generation WebUI (oobabooga) — still maintained but the modern alternatives have eaten most of its use cases with less setup friction. Hugging Face Transformers for inference — fine for experimentation, not a desktop runner.

Next step

Use the planner to pick a model for your runner→