
The cheapest open LLMs to run on your own hardware

Open LLMs ranked by how little VRAM they need to run locally: the smaller the footprint, the cheaper the GPU you need and the closer the running cost gets to a true $0. Sorted by computed VRAM-to-run (lowest first), with the honest local and rent-a-GPU cost for each.

How this is ranked: models are ordered by engine-computed VRAM, used as a proxy for self-hosting cost (lowest VRAM = cheapest hardware to buy or rent). All runs are $0-markup. This is not a quality ranking; you judge which cheap-to-run model is good enough.
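The engine behind these VRAM figures is not described on this page, so the following is only a rough sketch of the general idea: estimate weights-only memory as parameter count times bytes per parameter, add a fixed overhead, and sort ascending. The names `Model`, `estimate_vram_gb`, and `rank_by_vram`, the 2-bytes-per-parameter (FP16) default, and the 0.5 GB overhead are all assumptions made for illustration. They are not Spanvero's method, and the results will not necessarily match the figures in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    params: float  # parameter count, e.g. 0.5e9 for a 500M model


def estimate_vram_gb(params: float, bytes_per_param: float = 2.0,
                     overhead_gb: float = 0.5) -> float:
    """Rough weights-only estimate. Both defaults are illustrative assumptions."""
    weights_gb = params * bytes_per_param / 1e9
    return weights_gb + overhead_gb


def rank_by_vram(models: list[Model]) -> list[Model]:
    """Sort ascending by estimated VRAM. sorted() is stable, so ties keep input order."""
    return sorted(models, key=lambda m: estimate_vram_gb(m.params))


if __name__ == "__main__":
    catalog = [
        Model("Llama 3.2 1B Instruct", 1.2e9),
        Model("Qwen2.5 0.5B Instruct", 0.5e9),
        Model("Nomic Embed Text v1.5", 137e6),
    ]
    for m in rank_by_vram(catalog):
        print(f"{m.name}: ~{estimate_vram_gb(m.params):.1f} GB")
```

A real estimate would also need to account for the KV cache, activations, and the runtime's own allocations, which is why the overhead term here is only a placeholder.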

1. Nomic Embed Text v1.5 — Nomic AI, 137M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
2. gemma 3 270m — google, 300M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
3. Qwen2.5 0.5B Instruct — Qwen, 500M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
4. Qwen2.5 0.5B — Qwen, 500M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
5. TinyLlama 1.1B Chat v1.0 — TinyLlama, 1.1B · ~2.0 GB VRAM · $0.11/1M API est. · commercial OK
6. gpt2 large — openai-community, 800M · ~2.0 GB VRAM · $0.11/1M API est. · commercial OK
7. h2ovl mississippi 800m — h2oai, 800M · ~2.0 GB VRAM · $0.11/1M API est. · commercial OK
8. Qwen2 0.5B — Qwen, 500M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
9. Qwen2 0.5B Instruct — Qwen, 500M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
10. bloom 560m — bigscience, 600M · ~2.0 GB VRAM · $0.10/1M API est. · non-commercial
11. gpt2 medium — openai-community, 400M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
12. SmolLM2 360M Instruct — HuggingFaceTB, 400M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
13. LLaMmlein 1B prerelease — LSX-UniWue, 1.1B · ~2.0 GB VRAM · $0.11/1M API est. · non-commercial
14. t5gemma s s prefixlm — google, 300M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
15. bloomz 560m — bigscience, 600M · ~2.0 GB VRAM · $0.10/1M API est. · non-commercial
16. MiniCPM5 1B — openbmb, 1.1B · ~2.0 GB VRAM · $0.11/1M API est. · commercial OK
17. gemma 3 270m it — google, 300M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
18. pythia 410m — EleutherAI, 500M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
19. functiongemma 270m it — google, 300M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
20. vlt5 base keywords — Voicelab, 300M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
21. Falcon H1 0.5B Base — tiiuae, 500M · ~2.0 GB VRAM · $0.10/1M API est. · non-commercial
22. Qwen2.5 Coder 0.5B Instruct — Qwen, 500M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
23. pythia 1b — EleutherAI, 1.1B · ~2.0 GB VRAM · $0.11/1M API est. · commercial OK
24. MiniCPM4 0.5B — openbmb, 400M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
25. qwen sft countdown defaultproj — asingh15, 500M · ~2.0 GB VRAM · $0.10/1M API est. · non-commercial
26. Qwen3.6 35B A3B DFlash — z-lab, 500M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
27. pythia 410m deduped — EleutherAI, 500M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
28. Qwen3 8B DFlash b16 — z-lab, 1B · ~2.0 GB VRAM · $0.11/1M API est. · commercial OK
29. SmolLM2 360M — HuggingFaceTB, 400M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
30. LFM2.5 350M — LiquidAI, 400M · ~2.0 GB VRAM · $0.10/1M API est. · non-commercial
31. tinyllama oneshot w8w8 test static shape change — nm-testing, 1.1B · ~2.0 GB VRAM · $0.11/1M API est. · non-commercial
32. TinyLlama 1.1B intermediate step 1431k 3T — TinyLlama, 1.1B · ~2.0 GB VRAM · $0.11/1M API est. · commercial OK
33. gpt oss 20b speculator.eagle3 — RedHatAI, 900M · ~2.0 GB VRAM · $0.11/1M API est. · commercial OK
34. LFM2.5 350M Base — LiquidAI, 400M · ~2.0 GB VRAM · $0.10/1M API est. · non-commercial
35. gemma 4 26B A4B it DFlash — z-lab, 400M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
36. Qwen3.5 122B A10B DFlash — z-lab, 800M · ~2.0 GB VRAM · $0.11/1M API est. · commercial OK
37. Qwen3.5 4B DFlash — z-lab, 600M · ~2.0 GB VRAM · $0.10/1M API est. · commercial OK
38. MiniCPM5 1B SFT — openbmb, 1.1B · ~2.0 GB VRAM · $0.11/1M API est. · commercial OK
39. bitnet b1.58 2B 4T — microsoft, 800M · ~2.0 GB VRAM · $0.11/1M API est. · commercial OK
40. Llama 3.2 1B Instruct — Meta, 1.2B · ~3.0 GB VRAM · $0.11/1M API · commercial OK

Showing the top 40 of 551.
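Since the ranking leaves the quality judgment to the reader, it can help to load the entries into a structure and filter them by VRAM budget and license. The sketch below parses lines in the layout used by this list. The `Entry` fields, `parse_entry`, and `within_budget` are hypothetical names, and the regular expression is written against the entries shown here only, so it may not cover other pages.

```python
import re
from typing import Iterable, NamedTuple, Optional


class Entry(NamedTuple):
    rank: int
    name: str
    publisher: str
    params: str          # kept as text, e.g. "500M" or "1.1B"
    vram_gb: float
    price_per_1m: float  # USD per 1M tokens, as listed
    commercial_ok: bool


# Matches the entry layout in this list; "est." after "API" is optional.
LINE_RE = re.compile(
    r"^(?P<rank>\d+)\.\s+(?P<name>.+?)\s+—\s+(?P<publisher>[^,]+),\s+"
    r"(?P<params>[\d.]+[MB])"
    r"\s+·\s+~(?P<vram>[\d.]+) GB VRAM"
    r"\s+·\s+\$(?P<price>[\d.]+)/1M API(?: est\.)?"
    r"\s+·\s+(?P<license>commercial OK|non-commercial)$"
)


def parse_entry(line: str) -> Optional[Entry]:
    """Return an Entry for a list line, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return Entry(
        rank=int(m["rank"]),
        name=m["name"],
        publisher=m["publisher"],
        params=m["params"],
        vram_gb=float(m["vram"]),
        price_per_1m=float(m["price"]),
        commercial_ok=(m["license"] == "commercial OK"),
    )


def within_budget(entries: Iterable[Entry], max_vram_gb: float,
                  commercial_only: bool = True) -> list[Entry]:
    """Keep entries that fit the VRAM budget and, optionally, allow commercial use."""
    return [
        e for e in entries
        if e.vram_gb <= max_vram_gb and (e.commercial_ok or not commercial_only)
    ]


if __name__ == "__main__":
    lines = [
        "10. bloom 560m — bigscience, 600M · ~2.0 GB VRAM · $0.10/1M API est. · non-commercial",
        "40. Llama 3.2 1B Instruct — Meta, 1.2B · ~3.0 GB VRAM · $0.11/1M API · commercial OK",
    ]
    parsed = [e for e in map(parse_entry, lines) if e is not None]
    for e in within_budget(parsed, max_vram_gb=3.0):
        print(e.rank, e.name, e.vram_gb)
```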

