What can Apple MacBook Air M3 16GB (refurb) run? 243 models fit & the real cost
243 of the 377 models in the Spanvero catalog fit in the 16 GB unified memory of the Apple MacBook Air M3 16GB (refurb) (at a sensible quant, 16k context). For each, you can run it locally ($0 compute + electricity), rent an equivalent GPU ($0 markup, as of 2026-07-02), or pay per token via your own API key (as of 2026-06-29).
Three honest ways to run each model on Apple MacBook Air M3 16GB (refurb)
Run it locally: $0 in compute; you pay only for electricity (~35 W under load on this Mac). Local still costs real money, so we never show it as a fake "$0".
Rent an equivalent GPU: from a $0-markup vendor rate (as of 2026-07-02) — you rent on your own account and pay the vendor directly; we never resell compute.
Skip the box: run the same model through your own API key, paying per million tokens (prices as of 2026-06-29).
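To compare an hourly rental rate with the per-token options, it helps to convert it to dollars per 1M tokens. The sketch below shows that arithmetic. The function name and the 40 tokens/s throughput are illustrative assumptions, not catalog figures, and the calculation assumes the GPU generates tokens for the whole billed time.

```python
def rental_cost_per_million(usd_per_hour: float, tokens_per_sec: float) -> float:
    """Dollars to generate 1M tokens on a GPU billed by the hour."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    hours_per_million = 1_000_000 / tokens_per_sec / 3600
    return usd_per_hour * hours_per_million


# $0.26/hr is the RTX 3090 rate quoted on this page.
# 40 tokens/s is a made-up throughput used only to illustrate the arithmetic.
example = rental_cost_per_million(0.26, 40.0)
print(f"${example:.2f} per 1M tokens")
```

Idle time on a rented box is still billed, so the real per-token cost rises as utilization falls.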
What fits Apple MacBook Air M3 16GB (refurb) (16 GB unified memory)
243 of the 377 notable models in the Spanvero catalog fit Apple MacBook Air M3 16GB (refurb) at a sensible quant (context capped at 16k for the estimate). Most capable first:
NVIDIA Nemotron 3 Nano 30B A3B NVFP4 (nvidia, 18.2B) — needs ~13 GB at Q4_K_M: run it locally for $0 compute + ~$0.1178/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.25/1M via your own API key (size estimate).
Qwen3 30B A3B NVFP4 (RedHatAI, 17.5B) — needs ~13 GB at Q4_K_M: run it locally for $0 compute + ~$0.1141/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.24/1M via your own API key (size estimate).
deepseek moe 16b base (deepseek-ai, 16.4B) — needs ~12 GB at Q4_K_M: run it locally for $0 compute + ~$0.1084/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.23/1M via your own API key (size estimate).
deepseek moe 16b chat (deepseek-ai, 16.4B) — needs ~12 GB at Q4_K_M: run it locally for $0 compute + ~$0.1084/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.23/1M via your own API key (size estimate).
LLaDA2.0 mini (inclusionAI, 16.3B) — needs ~12 GB at Q4_K_M: run it locally for $0 compute + ~$0.1078/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.23/1M via your own API key (size estimate).
DeepSeek-Coder-V2-Lite Instruct (DeepSeek, 15.7B) — needs ~11 GB at Q4_K_M: run it locally for $0 compute + ~$0.1047/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.23/1M via your own API key (size estimate).
Qwen3 30B A3B NVFP4 (nvidia, 15.6B) — needs ~12 GB at Q4_K_M: run it locally for $0 compute + ~$0.1041/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.22/1M via your own API key (size estimate).
Qwen2.5 Coder 14B Instruct (Qwen, 14.8B) — needs ~14 GB at Q4_K_M: run it locally for $0 compute + ~$0.0998/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.22/1M via your own API key (size estimate).
Qwen2.5 14B Instruct (Qwen, 14.8B) — needs ~14 GB at Q4_K_M: run it locally for $0 compute + ~$0.0998/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.22/1M via your own API key (size estimate).
Qwen3 14B (Qwen, 14.8B) — needs ~13 GB at Q4_K_M: run it locally for $0 compute + ~$0.0998/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.17/1M via your own API key.
Qwen2.5 14B Instruct (unsloth, 14.8B) — needs ~14 GB at Q4_K_M: run it locally for $0 compute + ~$0.0998/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.22/1M via your own API key (size estimate).
phi 4 quantized.w4a16 (RedHatAI, 14.8B) — needs ~14 GB at Q4_K_M: run it locally for $0 compute + ~$0.0998/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.22/1M via your own API key (size estimate).
Qwen3 14B Base (Qwen, 14.8B) — needs ~13 GB at Q4_K_M: run it locally for $0 compute + ~$0.0998/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.22/1M via your own API key (size estimate).
HyperCLOVAX SEED Think 14B (naver-hyperclovax, 14.7B) — needs ~13 GB at Q4_K_M: run it locally for $0 compute + ~$0.0993/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.22/1M via your own API key (size estimate).
Qwen1.5 MoE A2.7B (Qwen, 14.3B) — needs ~12 GB at Q4_K_M: run it locally for $0 compute + ~$0.0971/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.21/1M via your own API key (size estimate).
Phi-4 (Microsoft, 14B) — needs ~13 GB at Q4_K_M: run it locally for $0 compute + ~$0.0955/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.11/1M via your own API key.
HarmBench Llama 2 13b cls (cais, 13B) — needs ~11 GB at Q4_K_M: run it locally for $0 compute + ~$0.09/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.20/1M via your own API key (size estimate).
Mellum2 12B A2.5B Thinking (JetBrains, 12.1B) — needs ~9 GB at Q4_K_M: run it locally for $0 compute + ~$0.085/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.20/1M via your own API key (size estimate).
Mellum2 12B A2.5B Base (JetBrains, 12.1B) — needs ~9 GB at Q4_K_M: run it locally for $0 compute + ~$0.085/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.20/1M via your own API key (size estimate).
Gemma 3 12B (Google, 12B) — needs ~13 GB at Q4_K_M: run it locally for $0 compute + ~$0.0844/1M in electricity, rent NVIDIA RTX 3090 24GB from $0.26/hr ($0 markup), or ~$0.10/1M via your own API key.
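The electricity figures in the list above can be related to generation speed with a little arithmetic. The page does not publish its exact formula, so this is a sketch assuming cost = power × electricity rate × time to generate 1M tokens, using the ~35 W and $0.1883/kWh figures quoted on this page. The function names are hypothetical.

```python
WATTS_UNDER_LOAD = 35.0   # sustained inference draw quoted on this page
USD_PER_KWH = 0.1883      # U.S. residential average quoted on this page


def electricity_per_million(tokens_per_sec: float) -> float:
    """Electricity cost in dollars to generate 1M tokens at a given speed."""
    usd_per_hour = WATTS_UNDER_LOAD / 1000 * USD_PER_KWH
    hours_per_million = 1_000_000 / tokens_per_sec / 3600
    return usd_per_hour * hours_per_million


def implied_tokens_per_sec(usd_per_million: float) -> float:
    """Invert the formula: the speed that a listed $/1M figure implies."""
    usd_per_hour = WATTS_UNDER_LOAD / 1000 * USD_PER_KWH
    return usd_per_hour * 1_000_000 / 3600 / usd_per_million


# Under the assumed formula, the ~$0.1178/1M figure listed above
# corresponds to roughly this many tokens per second.
print(round(implied_tokens_per_sec(0.1178), 2))
```

Under this assumption, a lower per-1M electricity figure simply means a faster model on the same 35 W budget.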
The honest cost of owning Apple MacBook Air M3 16GB (refurb)
Street price: $1,020.00 as of 2026-07-02, from RefurbMe cross-retailer refurb listings retrieved the same day (15" M3 Air 16GB at $1,018, 13" at $1,049, "Good" grade). Amortized over 3 years, that's ~$0.9315/day whether or not you're generating.
Electricity: ~35 W under sustained inference at $0.1883/kWh (EIA Electric Power Monthly Table 5.6.A — U.S. residential average, Apr 2026, as of 2026-07-02) — the per-1M-token figures above already include this at each model's speed.
Straight talk: for the small models a 16 GB unified memory box runs, hosted APIs are often cheaper per token. Own local for privacy, offline use, and unlimited runs — not to save money on tokens.
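A rough way to check the ownership math is to fold the amortized hardware cost into a per-1M-token figure. The sketch below assumes 365-day years (which reproduces the ~$0.9315/day above) and a hypothetical daily token volume. The function names and the 1M tokens/day workload are illustrative, not catalog data.

```python
STREET_PRICE_USD = 1020.00   # street price quoted on this page
AMORTIZATION_YEARS = 3       # amortization window quoted on this page


def amortized_per_day(price_usd: float, years: int) -> float:
    """Hardware cost per calendar day, assuming 365-day years."""
    return price_usd / (years * 365)


def all_in_per_million(tokens_per_day: float, electricity_per_million: float) -> float:
    """Hardware plus electricity cost per 1M tokens at a given daily volume."""
    hardware_per_day = amortized_per_day(STREET_PRICE_USD, AMORTIZATION_YEARS)
    millions_per_day = tokens_per_day / 1_000_000
    return hardware_per_day / millions_per_day + electricity_per_million


print(round(amortized_per_day(STREET_PRICE_USD, AMORTIZATION_YEARS), 4))

# Hypothetical workload: 1M tokens/day on a model listed at ~$0.0998/1M electricity.
print(round(all_in_per_million(1_000_000, 0.0998), 4))
```

At that hypothetical volume the all-in local figure comes out to roughly $1.03 per 1M tokens, well above the ~$0.22/1M API price listed for the same models, which is the point the straight talk makes. Lower daily volumes push the local figure higher still.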
Too big for Apple MacBook Air M3 16GB (refurb) — rent or use an API instead
These sit just past the fit cutoff for this Mac's 16 GB unified memory (each is estimated at ~15 GB). Closest first; you can still run them on a rented GPU ($0 markup) or via your own API key:
gpt-oss-20b (OpenAI, 21B) — needs ~15 GB; rent NVIDIA RTX 3090 24GB from $0.26/hr, or ~$0.08/1M via your own API key.
gpt oss 20b BF16 (unsloth, 20.9B) — needs ~15 GB; rent NVIDIA RTX 3090 24GB from $0.26/hr, or ~$0.27/1M via your own API key (size estimate).
Qwen3 32B NVFP4 (nvidia, 17.2B) — needs ~15 GB; rent NVIDIA RTX 3090 24GB from $0.26/hr, or ~$0.24/1M via your own API key (size estimate).
DeepSeek V2 Lite Chat (deepseek-ai, 15.7B) — needs ~15 GB; rent NVIDIA RTX 3090 24GB from $0.26/hr, or ~$0.23/1M via your own API key (size estimate).
DeepSeek V2 Lite (deepseek-ai, 15.7B) — needs ~15 GB; rent NVIDIA RTX 3090 24GB from $0.26/hr, or ~$0.23/1M via your own API key (size estimate).
NVIDIA Nemotron Nano 12B v2 (nvidia, 12.3B) — needs ~15 GB; rent NVIDIA RTX 3090 24GB from $0.26/hr, or ~$0.20/1M via your own API key (size estimate).