AI models that run on 24 GB of VRAM

267 open models fit in 24 GB of VRAM at their default quantization, enough for a high-end card such as an RTX 3090 or RTX 4090. They are listed most capable first, and you can run any of them for $0 on your own hardware. Each entry gives the model name, parameter count, publisher, and approximate VRAM required.

Laguna XS.2: 33.4B, poolside · ~24 GB VRAM
sarvam 30b: 32.2B, sarvamai · ~22 GB VRAM
llm jp 4 32b a3b thinking: 32.1B, llm-jp · ~23 GB VRAM
NVIDIA Nemotron 3 Nano 30B A3B BF16: 31.6B, nvidia · ~22 GB VRAM
Nemotron Cascade 2 30B A3B: 31.6B, nvidia · ~22 GB VRAM
Qwen3 30B A3B: 30.5B, Qwen · ~22 GB VRAM
Qwen3 Coder 30B A3B Instruct: 30.5B, Qwen · ~22 GB VRAM
Qwen3 30B A3B Instruct 2507: 30.5B, Qwen · ~22 GB VRAM
Qwen3 30B A3B abliterated: 30.5B, mlabonne · ~22 GB VRAM
Qwen3 30B A3B Thinking 2507: 30.5B, Qwen · ~22 GB VRAM
lynx instruct 30b: 30.5B, bineric · ~22 GB VRAM
North Mini Code 1.0: 30.5B, CohereLabs · ~22 GB VRAM
Gemma 2 27B Instruct: 27B, Google · ~22 GB VRAM
Qwen3.6 27B OBLITERATED: 26.9B, OBLITERATUS · ~22 GB VRAM
Trinity Mini: 26.1B, arcee-ai · ~19 GB VRAM
LFM2 24B A2B: 23.8B, LiquidAI · ~18 GB VRAM
Mistral Small 3 (24B, 2501): 23.6B, Mistral AI · ~20 GB VRAM
EuroLLM 22B Instruct 2512: 22.6B, utter-project · ~19 GB VRAM
gpt oss safeguard 20b: 21.5B, openai · ~16 GB VRAM
gpt-oss-20b: 21B, OpenAI · ~15 GB VRAM
gpt oss 20b BF16: 20.9B, unsloth · ~15 GB VRAM
gpt neox 20b: 20.7B, EleutherAI · ~17 GB VRAM
Qwen3.6 27B Claude Opus Sonnet Distilled NVFP4 MTP: 19.6B, Brian6145 · ~23 GB VRAM
Qwen3.6 27B AEON Ultimate Uncensored NVFP4: 19.1B, AEON-7 · ~23 GB VRAM
Qwen3.6 35B A3B NVFP4: 18.7B, nvidia · ~22 GB VRAM
GLM 4.7 Flash NVFP4: 18.4B, GadflyII · ~19 GB VRAM
NVIDIA Nemotron 3 Nano 30B A3B NVFP4: 18.2B, nvidia · ~13 GB VRAM
Qwen3 30B A3B NVFP4: 17.5B, RedHatAI · ~13 GB VRAM
Qwen3 32B NVFP4: 17.2B, nvidia · ~15 GB VRAM
Param2 17B A2.4B Thinking: 17.2B, bharatgenai · ~12 GB VRAM
Huihui Qwen3.6 27B abliterated NVFP4 MTP: 17.1B, sakamakismile · ~20 GB VRAM
Qwen3.6 27B AEON Ultimate Uncensored Multimodal NVFP4 MTP XS: 17.1B, AEON-7 · ~20 GB VRAM
Qwen3.6 27B Text NVFP4 MTP: 16.7B, sakamakismile · ~20 GB VRAM
LLaDA2.0 mini: 16.3B, inclusionAI · ~12 GB VRAM
starcoder: 15.8B, bigcode · ~19 GB VRAM
DeepSeek-Coder-V2-Lite Instruct: 15.7B, DeepSeek · ~11 GB VRAM
DeepSeek V2 Lite Chat: 15.7B, deepseek-ai · ~15 GB VRAM
DeepSeek V2 Lite: 15.7B, deepseek-ai · ~15 GB VRAM
Gemma 4 26B A4B it NVFP4: 15.1B, bg-digitalservices · ~18 GB VRAM
Qwen2.5 Coder 14B Instruct: 14.8B, Qwen · ~14 GB VRAM
Qwen2.5 14B Instruct: 14.8B, Qwen · ~14 GB VRAM
Qwen3 14B: 14.8B, Qwen · ~13 GB VRAM
Qwen3 14B Base: 14.8B, Qwen · ~13 GB VRAM
Qwen3 14B Instruct: 14.8B, OpenPipe · ~13 GB VRAM
Gemma 4 26B A4B NVFP4: 14.4B, nvidia · ~17 GB VRAM
diffusiongemma 26B A4B it NVFP4: 14.4B, nvidia · ~17 GB VRAM
Qwen1.5 MoE A2.7B: 14.3B, Qwen · ~12 GB VRAM
Phi-4: 14B, Microsoft · ~13 GB VRAM
Llama 2 13b chat hf: 13B, meta-llama · ~16 GB VRAM
HarmBench Llama 2 13b cls: 13B, cais · ~11 GB VRAM
NVIDIA Nemotron Nano 12B v2: 12.3B, nvidia · ~15 GB VRAM
MN 12B Mag Mell R1: 12.2B, inflatebot · ~12 GB VRAM
Gemma 3 12B: 12B, Google · ~13 GB VRAM
Gemma 4 12B OBLITERATED: 12B, OBLITERATUS · ~14 GB VRAM
Apertus 70B Instruct 2509 quantized.w4a16: 11.3B, RedHatAI · ~14 GB VRAM
Falcon3-10B Instruct: 10B, TII · ~10 GB VRAM
Darwin 9B NEG: 9.7B, ansulev · ~12 GB VRAM
SeeClick: 9.7B, cckevinn · ~12 GB VRAM
gemma 2 9b: 9.2B, google · ~11 GB VRAM
Gemma 2 9B Instruct: 9B, Google · ~9.0 GB VRAM
NVIDIA Nemotron Nano 9B v2: 8.9B, nvidia · ~10 GB VRAM
NVIDIA Nemotron Nano 9B v2 Japanese: 8.9B, nvidia · ~10 GB VRAM
internlm3 8b instruct: 8.8B, internlm · ~8.0 GB VRAM
Nemotron Labs Diffusion 8B Base: 8.5B, nvidia · ~7.0 GB VRAM
LFM2.5 8B A1B: 8.5B, LiquidAI · ~7.0 GB VRAM
gemma 7b: 8.5B, google · ~11 GB VRAM
Qwen3-8B: 8.2B, Alibaba · ~9.0 GB VRAM
Qwen3 8B Base: 8.2B, Qwen · ~9.0 GB VRAM
DeepSeek R1 0528 Qwen3 8B: 8.2B, deepseek-ai · ~9.0 GB VRAM
Qwen3 14B NVFP4: 8.2B, nvidia · ~9.0 GB VRAM
granite 3.1 8b instruct: 8.2B, ibm-granite · ~9.0 GB VRAM
Qwen3Guard Gen 8B: 8.2B, Qwen · ~9.0 GB VRAM
granite 3.0 8b instruct: 8.2B, ibm-granite · ~7.0 GB VRAM
granite 3.3 8b instruct: 8.2B, ibm-granite · ~9.0 GB VRAM
Apertus 8B Instruct 2509: 8.1B, swiss-ai · ~9.0 GB VRAM
Llama 3.1 8B Instruct: 8B, Meta · ~8.0 GB VRAM
Qwen2-VL 7B Instruct: 8B, Alibaba · ~7.0 GB VRAM
Llama 3.1 8B Instruct (Abliterated): 8B, mlabonne (community) · ~8.0 GB VRAM
Hermes 3 - Llama 3.1 8B: 8B, Nous Research · ~8.0 GB VRAM
Dolphin 3.0 - Llama 3.1 8B: 8B, Cognitive Computations · ~8.0 GB VRAM
Meta Llama 3 8B Instruct: 8B, meta-llama · ~10 GB VRAM
Llama 3.1 8B: 8B, meta-llama · ~10 GB VRAM
Meta Llama 3 8B: 8B, meta-llama · ~10 GB VRAM
LLaDA 8B Instruct: 8B, GSAI-ML · ~8.0 GB VRAM
DeepSeek R1 Distill Llama 8B: 8B, deepseek-ai · ~8.0 GB VRAM
saiga llama3 8b: 8B, IlyaGusev · ~7.0 GB VRAM
Meta Llama 3.1 8B Instruct: 8B, unsloth · ~8.0 GB VRAM
gemma 4 E4B it OBLITERATED: 8B, OBLITERATUS · ~10 GB VRAM
LLaDA 1.5: 8B, GSAI-ML · ~8.0 GB VRAM
Meta Llama 3 8B Instruct: 8B, NousResearch · ~7.0 GB VRAM
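
To show how a list like this can be narrowed to a specific card, here is a minimal sketch that filters entries by a VRAM budget and orders them by parameter count, which appears to be how the page ranks "most capable first". The `Model` type, the `models_that_fit` function, and the `headroom_gb` parameter are hypothetical names introduced for illustration; the sample data is a handful of entries copied from the list above, not the full catalogue, and the page does not say how much VRAM to reserve for context or other processes.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    params_b: float  # parameter count in billions, as listed
    publisher: str
    vram_gb: float   # approximate VRAM at the default quant, as listed


# A few entries copied from the list above (illustrative subset only).
MODELS = [
    Model("Laguna XS.2", 33.4, "poolside", 24),
    Model("Qwen3 30B A3B", 30.5, "Qwen", 22),
    Model("Mistral Small 3 (24B, 2501)", 23.6, "Mistral AI", 20),
    Model("gpt-oss-20b", 21.0, "OpenAI", 15),
    Model("Phi-4", 14.0, "Microsoft", 13),
    Model("Llama 3.1 8B Instruct", 8.0, "Meta", 8),
]


def models_that_fit(models, budget_gb, headroom_gb=0.0):
    """Return models whose listed VRAM fits in the budget, largest first.

    headroom_gb is an assumed safety margin (hypothetical parameter);
    it is subtracted from the budget before comparing.
    """
    usable_gb = budget_gb - headroom_gb
    fitting = [m for m in models if m.vram_gb <= usable_gb]
    return sorted(fitting, key=lambda m: m.params_b, reverse=True)


if __name__ == "__main__":
    for m in models_that_fit(MODELS, budget_gb=16):
        print(f"{m.name}: {m.params_b}B, {m.publisher} · ~{m.vram_gb} GB VRAM")
```

Passing a nonzero `headroom_gb` is one way to leave room for longer contexts; for example, a 24 GB budget with 2 GB of headroom would exclude Laguna XS.2 from this sample while keeping the ~22 GB entries.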

Prices as of 2026-06-17. We're an honest advisor: $0 markup, your own accounts, we never resell compute. © 2026 Cynosure LLC.