What can Apple Mac mini M4 Pro 24GB run? 277 models fit & the real cost
277 of the 377 models in the Spanvero catalog fit in the 24 GB of unified memory on the Apple Mac mini M4 Pro 24GB (at a sensible quant, 16k context). For each, you can run it locally ($0 compute, plus electricity), rent an equivalent GPU ($0 markup, as of 2026-07-02), or pay per token via your own API key (as of 2026-06-29).
Three honest ways to run each model on Apple Mac mini M4 Pro 24GB
Run it locally: $0 in compute, but you still pay for electricity (~140 W under load on this Mac), so local is never truly free.
Rent an equivalent GPU: at the vendor's own rate with $0 markup (as of 2026-07-02). You rent on your own account and pay the vendor directly; we never resell compute.
Skip the box: run the same model through your own API key, paying per million tokens (prices as of 2026-06-29).
What fits Apple Mac mini M4 Pro 24GB (24 GB unified memory)
277 of the 377 notable models in the Spanvero catalog fit Apple Mac mini M4 Pro 24GB at a sensible quant (context capped at 16k for the estimate). Most capable first:
sarvam 30b (sarvamai, 32.2B) — needs ~22 GB at Q4_K_M: run it locally for $0 compute + ~$0.4648/1M in electricity, rent NVIDIA RTX A6000 48GB from $0.49/hr ($0 markup), or ~$0.36/1M via your own API key (size estimate).
NVIDIA Nemotron 3 Nano 30B A3B BF16 (nvidia, 31.6B) — needs ~22 GB at Q4_K_M: run it locally for $0 compute + ~$0.4578/1M in electricity, rent NVIDIA RTX A6000 48GB from $0.49/hr ($0 markup), or ~$0.35/1M via your own API key (size estimate).
Nemotron Cascade 2 30B A3B (nvidia, 31.6B) — needs ~22 GB at Q4_K_M: run it locally for $0 compute + ~$0.4578/1M in electricity, rent NVIDIA RTX A6000 48GB from $0.49/hr ($0 markup), or ~$0.35/1M via your own API key (size estimate).
Qwen3 30B A3B (Qwen, 30.5B) — needs ~22 GB at Q4_K_M: run it locally for $0 compute + ~$0.445/1M in electricity, rent NVIDIA RTX A6000 48GB from $0.49/hr ($0 markup), or ~$0.31/1M via your own API key.
Qwen3 Coder 30B A3B Instruct (Qwen, 30.5B) — needs ~22 GB at Q4_K_M: run it locally for $0 compute + ~$0.445/1M in electricity, rent NVIDIA RTX A6000 48GB from $0.49/hr ($0 markup), or ~$0.17/1M via your own API key.
Qwen3 30B A3B Instruct 2507 (Qwen, 30.5B) — needs ~22 GB at Q4_K_M: run it locally for $0 compute + ~$0.445/1M in electricity, rent NVIDIA RTX A6000 48GB from $0.49/hr ($0 markup), or ~$0.12/1M via your own API key.
Qwen3 30B A3B Thinking 2507 (Qwen, 30.5B) — needs ~22 GB at Q4_K_M: run it locally for $0 compute + ~$0.445/1M in electricity, rent NVIDIA RTX A6000 48GB from $0.49/hr ($0 markup), or ~$0.24/1M via your own API key.
Tongyi DeepResearch 30B A3B (Alibaba-NLP, 30.5B) — needs ~22 GB at Q4_K_M: run it locally for $0 compute + ~$0.445/1M in electricity, rent NVIDIA RTX A6000 48GB from $0.49/hr ($0 markup), or ~$0.34/1M via your own API key (size estimate).
lynx instruct 30b (bineric, 30.5B) — needs ~22 GB at Q4_K_M: run it locally for $0 compute + ~$0.445/1M in electricity, rent NVIDIA RTX A6000 48GB from $0.49/hr ($0 markup), or ~$0.34/1M via your own API key (size estimate).
North Mini Code 1.0 (CohereLabs, 30.5B) — needs ~22 GB at Q4_K_M: run it locally for $0 compute + ~$0.445/1M in electricity, rent NVIDIA RTX A6000 48GB from $0.49/hr ($0 markup), or ~$0.34/1M via your own API key (size estimate).
Gemma 2 27B Instruct (Google, 27B) — needs ~22 GB at Q4_K_M: run it locally for $0 compute + ~$0.4037/1M in electricity, rent NVIDIA RTX A6000 48GB from $0.49/hr ($0 markup), or ~$0.65/1M via your own API key.
Trinity Mini (arcee-ai, 26.1B) — needs ~19 GB at Q4_K_M: run it locally for $0 compute + ~$0.3929/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.10/1M via your own API key.
LFM2 24B A2B (LiquidAI, 23.8B) — needs ~18 GB at Q4_K_M: run it locally for $0 compute + ~$0.3649/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.29/1M via your own API key (size estimate).
Mistral Small 3 (24B, 2501) (Mistral AI, 23.6B) — needs ~20 GB at Q4_K_M: run it locally for $0 compute + ~$0.3625/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.07/1M via your own API key.
EuroLLM 22B Instruct 2512 (utter-project, 22.6B) — needs ~19 GB at Q4_K_M: run it locally for $0 compute + ~$0.3501/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.28/1M via your own API key (size estimate).
gpt oss safeguard 20b (openai, 21.5B) — needs ~16 GB at Q4_K_M: run it locally for $0 compute + ~$0.3364/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.19/1M via your own API key.
gpt-oss-20b (OpenAI, 21B) — needs ~15 GB at Q4_K_M: run it locally for $0 compute + ~$0.3302/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.08/1M via your own API key.
gpt oss 20b BF16 (unsloth, 20.9B) — needs ~15 GB at Q4_K_M: run it locally for $0 compute + ~$0.3289/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.27/1M via your own API key (size estimate).
gpt neox 20b (EleutherAI, 20.7B) — needs ~17 GB at Q4_K_M: run it locally for $0 compute + ~$0.3264/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.27/1M via your own API key (size estimate).
Qwen3.6 35B A3B NVFP4 (nvidia, 18.7B) — needs ~22 GB at Q4_K_M: run it locally for $0 compute + ~$0.3009/1M in electricity, rent NVIDIA RTX A6000 48GB from $0.49/hr ($0 markup), or ~$0.25/1M via your own API key (size estimate).
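The electricity figures in the list can be read backwards to see what generation speed they imply. The sketch below is illustrative only: it assumes the per-1M-token electricity cost is simply power draw times time at a constant speed, using the ~140 W and $0.1883/kWh figures quoted on this page. The function names are hypothetical, and the three rows are copied from the list above.

```python
WATTS_UNDER_LOAD = 140   # from the page: ~140 W under sustained inference
USD_PER_KWH = 0.1883     # from the page: U.S. residential average rate

# (model, local electricity $/1M tokens, API $/1M tokens), copied from the list
ROWS = [
    ("Qwen3 30B A3B", 0.445, 0.31),
    ("Gemma 2 27B Instruct", 0.4037, 0.65),
    ("gpt-oss-20b", 0.3302, 0.08),
]


def implied_tokens_per_second(elec_usd_per_million: float) -> float:
    """Speed at which a 140 W draw costs the given amount per 1M tokens.

    Assumes constant power draw and constant generation speed.
    """
    usd_per_hour = WATTS_UNDER_LOAD / 1000 * USD_PER_KWH
    hours_per_million = elec_usd_per_million / usd_per_hour
    return 1_000_000 / (hours_per_million * 3600)


def cheaper_per_token(local_usd: float, api_usd: float) -> str:
    """Compare marginal per-token cost only; hardware cost is ignored here."""
    return "local" if local_usd < api_usd else "api"


for name, local_usd, api_usd in ROWS:
    speed = implied_tokens_per_second(local_usd)
    winner = cheaper_per_token(local_usd, api_usd)
    print(f"{name}: ~{speed:.1f} tok/s implied, cheaper per token: {winner}")
```

Comparing only marginal cost per token, as here, leaves out the purchase price of the machine, so it flatters the local option.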
The honest cost of owning Apple Mac mini M4 Pro 24GB
Street price: $1,399.00 (Apple MSRP for the base M4 Pro Mac mini with 24 GB unified memory, retrieved 2026-07-02). Amortized over 3 years, that is ~$1.2776/day whether or not you're generating.
Electricity: ~140 W under sustained inference at $0.1883/kWh (EIA Electric Power Monthly, Table 5.6.A, U.S. residential average for Apr 2026, as of 2026-07-02). The per-1M-token electricity figures above are this cost at each model's generation speed.
Straight talk: for the small models that a box with 24 GB of unified memory can run, hosted APIs are often cheaper per token. Run locally for privacy, offline use, and unlimited runs, not to save money on tokens.
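The ownership arithmetic can be sketched in a few lines. This is a rough model, not the site's own calculator: it uses the $1,399 price and 3-year amortization quoted here, treats a year as 365 days, and introduces a hypothetical break-even helper that asks how many tokens it would take for per-token savings over an API to repay the hardware.

```python
from typing import Optional

HARDWARE_USD = 1399.00    # from the page: Apple MSRP
AMORTIZATION_YEARS = 3    # from the page


def daily_amortization(price_usd: float, years: int) -> float:
    """Straight-line cost per day, assuming 365-day years."""
    return price_usd / (years * 365)


def breakeven_million_tokens(
    price_usd: float,
    local_usd_per_million: float,
    api_usd_per_million: float,
) -> Optional[float]:
    """Millions of tokens needed before local savings repay the hardware.

    Returns None when the API is not more expensive per token than local
    electricity, because the hardware can then never pay for itself.
    """
    saving = api_usd_per_million - local_usd_per_million
    if saving <= 0:
        return None
    return price_usd / saving


print(f"~${daily_amortization(HARDWARE_USD, AMORTIZATION_YEARS):.4f}/day")

# Gemma 2 27B Instruct: $0.4037/1M local electricity vs $0.65/1M via API
print(breakeven_million_tokens(HARDWARE_USD, 0.4037, 0.65))

# Qwen3 30B A3B: $0.445/1M local electricity vs $0.31/1M via API
print(breakeven_million_tokens(HARDWARE_USD, 0.445, 0.31))
```

With the listed prices, Gemma 2 27B Instruct is one of the few cases where local electricity undercuts the API, and even there the break-even point is several billion tokens. For models where the API is cheaper per token, there is no break-even at all.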
Too big for Apple Mac mini M4 Pro 24GB — rent or use an API instead
These need more than the 24 GB unified memory on this Mac. Closest first — you can still run them on a rented GPU ($0 markup) or via your own API key:
llm jp 4 32b a3b thinking (llm-jp, 32.1B) — needs ~23 GB; rent NVIDIA RTX A6000 48GB from $0.49/hr, or ~$0.36/1M via your own API key (size estimate).
Qwen3.6 27B Claude Opus Sonnet Distilled NVFP4 MTP (Brian6145, 19.6B) — needs ~23 GB; rent NVIDIA RTX A6000 48GB from $0.49/hr, or ~$0.26/1M via your own API key (size estimate).
Laguna XS.2 (poolside, 33.4B) — needs ~24 GB; rent NVIDIA RTX A6000 48GB from $0.49/hr, or ~$0.15/1M via your own API key.
granite 4.1 30b (ibm-granite, 28.9B) — needs ~24 GB; rent NVIDIA RTX A6000 48GB from $0.49/hr, or ~$0.33/1M via your own API key (size estimate).
Yi-1.5-34B-Chat (01.AI, 34.4B) — needs ~25 GB; rent NVIDIA RTX A6000 48GB from $0.49/hr, or ~$0.38/1M via your own API key (size estimate).
Qwen3-32B (Alibaba, 32.8B) — needs ~25 GB; rent NVIDIA RTX A6000 48GB from $0.49/hr, or ~$0.18/1M via your own API key.
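The page does not state its exact fit rule. The two lists are consistent with a cutoff somewhere between ~22 GB and ~23 GB, below the full 24 GB. The sketch below assumes a hypothetical 22.5 GB cutoff purely to reproduce that split; the constant and function names are illustrative, and the memory figures are the rounded estimates from the lists above.

```python
TOTAL_UNIFIED_GB = 24
ASSUMED_CUTOFF_GB = 22.5   # hypothetical: the page does not state its rule

# (model, estimated memory needed in GB), copied from the two lists
MODELS = [
    ("Qwen3 30B A3B", 22),
    ("Trinity Mini", 19),
    ("gpt-oss-20b", 15),
    ("Qwen3-32B", 25),
    ("llm jp 4 32b a3b thinking", 23),
]


def partition(models, cutoff_gb):
    """Split models into those that fit and those that do not.

    The too-big list is sorted smallest first, matching "closest first".
    """
    fits = [m for m in models if m[1] <= cutoff_gb]
    too_big = sorted((m for m in models if m[1] > cutoff_gb), key=lambda m: m[1])
    return fits, too_big


fits, too_big = partition(MODELS, ASSUMED_CUTOFF_GB)
print("fits:", [name for name, _ in fits])
print("too big:", [name for name, _ in too_big])
```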