The cheapest open LLMs to run via your own API key

Open LLMs ranked by their real, dated price per 1M tokens (blended input+output) on your own API key, cheapest first, with $0 markup. This list only includes models we have an actual, verified price for; we never pad it with size-based guesses. The $0-on-your-own-hardware and rent-a-GPU options are shown on each model's page.
How this is ranked: The list is built only from real, dated prices in model-prices.ts. Size-estimated models are excluded from this ranking entirely, so no fabricated figure can drive the order. 'Cheapest' is a measured dollar amount, not a quality claim; open the advisor for your exact workload.
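The ranking rule described above can be sketched roughly as follows. This is a minimal illustration, not the site's actual code: the `ModelPrice` shape, the `source` field, and the `rankCheapest` function are hypothetical names, since the real schema of model-prices.ts is not shown on this page. The three verified prices are taken from the list below; the size-estimated entry is a made-up placeholder included only to show that it is filtered out.

```typescript
// Hypothetical record shape; the real model-prices.ts schema is not shown here.
type PriceSource = "verified" | "size-estimated";

interface ModelPrice {
  name: string;
  blendedUsdPer1M: number; // blended input+output price, USD per 1M tokens
  source: PriceSource;     // assumed flag marking how the price was obtained
}

// Drop size-estimated entries first, then sort ascending by price.
// filter() returns a new array, so the caller's input is not mutated by sort().
function rankCheapest(models: ModelPrice[]): ModelPrice[] {
  return models
    .filter((m) => m.source === "verified")
    .sort((a, b) => a.blendedUsdPer1M - b.blendedUsdPer1M);
}

const sample: ModelPrice[] = [
  { name: "Qwen2.5 7B Instruct", blendedUsdPer1M: 0.07, source: "verified" },
  // Placeholder entry: its low guessed price must not influence the order.
  { name: "Example size-estimated model", blendedUsdPer1M: 0.01, source: "size-estimated" },
  { name: "Llama 3.1 8B Instruct", blendedUsdPer1M: 0.03, source: "verified" },
  { name: "Gemma 2 9B Instruct", blendedUsdPer1M: 0.06, source: "verified" },
];

const ranked = rankCheapest(sample);
console.log(ranked.map((m) => `${m.name}: $${m.blendedUsdPer1M.toFixed(2)}/1M`));
```

Filtering before sorting means an estimated price can never affect the order, even if the estimate happens to be lower than every verified price.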
1. Llama 3.1 8B Instruct: Meta, 8B · ~8.0 GB VRAM · $0.03/1M API · commercial OK
2. Gemma 2 9B Instruct: Google, 9B · ~9.0 GB VRAM · $0.06/1M API · commercial OK
3. Qwen2.5 7B Instruct: Alibaba, 7B · ~7.0 GB VRAM · $0.07/1M API · commercial OK
4. gpt-oss-20b: OpenAI, 21B · ~15 GB VRAM · $0.09/1M API · commercial OK
5. gpt-oss-120b: OpenAI, 117B · ~80 GB VRAM · $0.11/1M API · commercial OK
6. Mistral 7B Instruct v0.3: Mistral AI, 7.2B · ~8.0 GB VRAM · $0.20/1M API · commercial OK
7. Llama 4 Scout (17B-16E): Meta, 109B · ~77 GB VRAM · $0.20/1M API · commercial OK
8. Mixtral 8x7B Instruct: Mistral AI, 46.7B · ~34 GB VRAM · $0.24/1M API · commercial OK
9. DeepSeek V3.2 Exp: deepseek-ai, 685.4B · ~490 GB VRAM · $0.32/1M API · commercial OK
10. Llama 4 Maverick (17B-128E): Meta, 402B · ~274 GB VRAM · $0.38/1M API · commercial OK
11. Llama 3.1 70B Instruct: Meta, 70B · ~53 GB VRAM · $0.40/1M API · commercial OK
12. DeepSeek-V3: DeepSeek, 671B · ~453 GB VRAM · $0.50/1M API · commercial OK
13. Qwen3 Coder 480B A35B Instruct: Qwen, 480.2B · ~325 GB VRAM · $0.61/1M API · commercial OK
14. MiniMax M2: MiniMaxAI, 228.7B · ~156 GB VRAM · $0.63/1M API · non-commercial
15. Gemma 2 27B Instruct: Google, 27B · ~22 GB VRAM · $0.65/1M API · commercial OK
16. DeepSeek V4 Pro: deepseek-ai, 861.6B · ~580 GB VRAM · $0.66/1M API · commercial OK
17. Llama 3.1 405B Instruct: Meta, 405B · ~290 GB VRAM · $0.80/1M API · commercial OK
18. Qwen2.5-Coder 32B Instruct: Alibaba, 32B · ~25 GB VRAM · $0.83/1M API · commercial OK
19. GLM 4.7: zai-org, 358.3B · ~244 GB VRAM · $1.08/1M API · commercial OK
20. GLM 4.6: zai-org, 356.8B · ~243 GB VRAM · $1.09/1M API · commercial OK
21. Qwen3-235B-A22B: Alibaba, 235B · ~160 GB VRAM · $1.14/1M API · commercial OK
22. DeepSeek R1 0528: deepseek-ai, 684.5B · ~489 GB VRAM · $1.33/1M API · commercial OK
23. Kimi K2 Instruct 0905: moonshotai, 1026.5B · ~719 GB VRAM · $1.55/1M API · non-commercial
24. Kimi K2 Thinking: moonshotai, 1058.1B · ~740 GB VRAM · $1.55/1M API · non-commercial
25. DeepSeek-R1: DeepSeek, 671B · ~453 GB VRAM · $1.60/1M API · commercial OK
26. GLM 5.1: zai-org, 753.9B · ~539 GB VRAM · $2.03/1M API · commercial OK
Open the free Spanvero advisor → · Honest, $0-markup. © 2026 Cynosure LLC.