The best open LLMs you can run on 24 GB of VRAM

Open LLMs that fit in 24 GB of VRAM at their default quantization. This is the RTX 3090 / 4090 / 7900 XTX tier, where serious local models such as 32B-class checkpoints become runnable. Ranked with the largest model that fits first, alongside honest $0-local and rent-a-GPU costs. We guarantee the fit; you judge the quality.
How this is ranked: by an objective fit filter only. Here 'best' means 'runs on a 24 GB card.' VRAM figures are engine-computed, and the ordering is by listed parameter count, largest first. It is never a quality verdict.
1. Laguna XS.2 - poolside, 33.4B · ~24 GB VRAM · $0.37/1M API est. · commercial OK
2. sarvam 30b - sarvamai, 32.2B · ~22 GB VRAM · $0.36/1M API est. · commercial OK
3. llm jp 4 32b a3b thinking - llm-jp, 32.1B · ~23 GB VRAM · $0.36/1M API est. · commercial OK
4. NVIDIA Nemotron 3 Nano 30B A3B BF16 - nvidia, 31.6B · ~22 GB VRAM · $0.35/1M API est. · non-commercial
5. Nemotron Cascade 2 30B A3B - nvidia, 31.6B · ~22 GB VRAM · $0.35/1M API est. · non-commercial
6. Qwen3 30B A3B - Qwen, 30.5B · ~22 GB VRAM · $0.34/1M API est. · commercial OK
7. Qwen3 Coder 30B A3B Instruct - Qwen, 30.5B · ~22 GB VRAM · $0.34/1M API est. · commercial OK
8. Qwen3 30B A3B Instruct 2507 - Qwen, 30.5B · ~22 GB VRAM · $0.34/1M API est. · commercial OK
9. Qwen3 30B A3B abliterated - mlabonne, 30.5B · ~22 GB VRAM · $0.34/1M API est. · commercial OK
10. Qwen3 30B A3B Thinking 2507 - Qwen, 30.5B · ~22 GB VRAM · $0.34/1M API est. · commercial OK
11. lynx instruct 30b - bineric, 30.5B · ~22 GB VRAM · $0.34/1M API est. · commercial OK
12. North Mini Code 1.0 - CohereLabs, 30.5B · ~22 GB VRAM · $0.34/1M API est. · commercial OK
13. Gemma 2 27B Instruct - Google, 27B · ~22 GB VRAM · $0.65/1M API · commercial OK
14. Qwen3.6 27B OBLITERATED - OBLITERATUS, 26.9B · ~22 GB VRAM · $0.32/1M API est. · commercial OK
15. Trinity Mini - arcee-ai, 26.1B · ~19 GB VRAM · $0.31/1M API est. · non-commercial
16. LFM2 24B A2B - LiquidAI, 23.8B · ~18 GB VRAM · $0.29/1M API est. · non-commercial
17. Mistral Small 3 (24B, 2501) - Mistral AI, 23.6B · ~20 GB VRAM · $0.29/1M API est. · commercial OK
18. EuroLLM 22B Instruct 2512 - utter-project, 22.6B · ~19 GB VRAM · $0.28/1M API est. · commercial OK
19. gpt oss safeguard 20b - openai, 21.5B · ~16 GB VRAM · $0.27/1M API est. · commercial OK
20. gpt-oss-20b - OpenAI, 21B · ~15 GB VRAM · $0.09/1M API · commercial OK
21. gpt oss 20b BF16 - unsloth, 20.9B · ~15 GB VRAM · $0.27/1M API est. · commercial OK
22. gpt neox 20b - EleutherAI, 20.7B · ~17 GB VRAM · $0.27/1M API est. · commercial OK
23. Qwen3.6 27B Claude Opus Sonnet Distilled NVFP4 MTP - Brian6145, 19.6B · ~23 GB VRAM · $0.26/1M API est. · commercial OK
24. Qwen3.6 27B AEON Ultimate Uncensored NVFP4 - AEON-7, 19.1B · ~23 GB VRAM · $0.25/1M API est. · commercial OK
25. Qwen3.6 35B A3B NVFP4 - nvidia, 18.7B · ~22 GB VRAM · $0.25/1M API est. · commercial OK
26. GLM 4.7 Flash NVFP4 - GadflyII, 18.4B · ~19 GB VRAM · $0.25/1M API est. · commercial OK
27. NVIDIA Nemotron 3 Nano 30B A3B NVFP4 - nvidia, 18.2B · ~13 GB VRAM · $0.25/1M API est. · non-commercial
28. Qwen3 30B A3B NVFP4 - RedHatAI, 17.5B · ~13 GB VRAM · $0.24/1M API est. · commercial OK
29. Qwen3 32B NVFP4 - nvidia, 17.2B · ~15 GB VRAM · $0.24/1M API est. · commercial OK
30. Param2 17B A2.4B Thinking - bharatgenai, 17.2B · ~12 GB VRAM · $0.24/1M API est. · non-commercial
31. Huihui Qwen3.6 27B abliterated NVFP4 MTP - sakamakismile, 17.1B · ~20 GB VRAM · $0.24/1M API est. · commercial OK
32. Qwen3.6 27B AEON Ultimate Uncensored Multimodal NVFP4 MTP XS - AEON-7, 17.1B · ~20 GB VRAM · $0.24/1M API est. · commercial OK
33. Qwen3.6 27B Text NVFP4 MTP - sakamakismile, 16.7B · ~20 GB VRAM · $0.23/1M API est. · commercial OK
34. LLaDA2.0 mini - inclusionAI, 16.3B · ~12 GB VRAM · $0.23/1M API est. · commercial OK
35. starcoder - bigcode, 15.8B · ~19 GB VRAM · $0.23/1M API est. · commercial OK
36. DeepSeek-Coder-V2-Lite Instruct - DeepSeek, 15.7B · ~11 GB VRAM · $0.23/1M API est. · commercial OK
37. DeepSeek V2 Lite Chat - deepseek-ai, 15.7B · ~15 GB VRAM · $0.23/1M API est. · non-commercial
38. DeepSeek V2 Lite - deepseek-ai, 15.7B · ~15 GB VRAM · $0.23/1M API est. · non-commercial
39. Gemma 4 26B A4B it NVFP4 - bg-digitalservices, 15.1B · ~18 GB VRAM · $0.22/1M API est. · commercial OK
40. Qwen2.5 Coder 14B Instruct - Qwen, 14.8B · ~14 GB VRAM · $0.22/1M API est. · commercial OK

Showing the top 40 of 267.
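The ranking rule stated above (keep what fits on a 24 GB card, then order by size) can be sketched in a few lines of Python. This is only an illustration, not the site's actual engine: the `Model` class, the `rank_by_fit` function, and the `<=` comparison against the budget are assumptions, and the VRAM figures are the rounded "~" values from the list rather than engine output. One row is made up to show the filter rejecting a model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    params_b: float  # listed parameter count, in billions
    vram_gb: float   # listed (rounded) VRAM estimate at the default quant, in GB


def rank_by_fit(models, budget_gb=24.0):
    """Keep models whose listed VRAM fits the budget, largest first.

    Assumption: "fits" means vram_gb <= budget_gb. No quality score is used.
    """
    fitting = [m for m in models if m.vram_gb <= budget_gb]
    # sorted() is stable, so models with equal parameter counts keep input order.
    return sorted(fitting, key=lambda m: m.params_b, reverse=True)


# Four rows copied from the list; the last row is hypothetical.
CATALOG = [
    Model("gpt-oss-20b", 21.0, 15.0),
    Model("Laguna XS.2", 33.4, 24.0),
    Model("Qwen2.5 Coder 14B Instruct", 14.8, 14.0),
    Model("Gemma 2 27B Instruct", 27.0, 22.0),
    Model("Too Big 70B (hypothetical)", 70.0, 40.0),
]

ranked = rank_by_fit(CATALOG)
for position, m in enumerate(ranked, start=1):
    print(f"{position}. {m.name}: {m.params_b:g}B, ~{m.vram_gb:g} GB VRAM")
```

Passing a smaller `budget_gb` (for example 16) applies the same rule to a smaller card. Because the listed VRAM values are rounded, a model shown at "~24 GB" sits right at the edge of this check.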
© 2026 Cynosure LLC.