AI models that run on 48 GB of VRAM

292 open models fit in 48 GB of VRAM at their default quantization, enough for a workstation card like an RTX A6000 or two 24 GB GPUs. Most capable first; run any of them for $0 on your own hardware.
StableBeluga2: 69B, petals-team · ~48 GB VRAM
NVIDIA Nemotron 3 Super 120B A12B NVFP4: 67.2B, nvidia · ~47 GB VRAM
Kimi Linear 48B A3B Instruct: 49.1B, moonshotai · ~36 GB VRAM
Mixtral 8x7B Instruct: 46.7B, Mistral AI · ~34 GB VRAM
Phi 3.5 MoE instruct: 41.9B, microsoft · ~31 GB VRAM
Karnak 40B v1.0: 40.7B, Applied-Innovation-Center · ~29 GB VRAM
Seed OSS 36B Instruct: 36.2B, ByteDance-Seed · ~27 GB VRAM
Huihui Qwen3.6 35B A3B Claude 4.7 Opus abliterated: 36B, huihui-ai · ~42 GB VRAM
Yi-1.5-34B-Chat: 34.4B, 01.AI · ~25 GB VRAM
dolphin 2.9.1 yi 1.5 34b: 34.4B, dphn · ~26 GB VRAM
Laguna XS.2: 33.4B, poolside · ~24 GB VRAM
HyperCLOVAX SEED Think 32B: 33.3B, naver-hyperclovax · ~39 GB VRAM
Qwen3-32B: 32.8B, Alibaba · ~25 GB VRAM
Qwen2.5 32B Instruct: 32.8B, Qwen · ~27 GB VRAM
cogito v1 preview qwen 32B: 32.8B, deepcogito · ~27 GB VRAM
QwQ 32B: 32.8B, Qwen · ~27 GB VRAM
DeepSeek-R1-Distill-Qwen-32B: 32.5B, DeepSeek · ~27 GB VRAM
gemma 4 31B it NVFP4 turbo: 32.5B, LilaRest · ~38 GB VRAM
granite 4.0 h small: 32.2B, ibm-granite · ~25 GB VRAM
sarvam 30b: 32.2B, sarvamai · ~22 GB VRAM
llm jp 4 32b a3b thinking: 32.1B, llm-jp · ~23 GB VRAM
Qwen2.5-Coder 32B Instruct: 32B, Alibaba · ~25 GB VRAM
NVIDIA Nemotron 3 Nano 30B A3B BF16: 31.6B, nvidia · ~22 GB VRAM
Nemotron 3 Nano 30B A3B: 31.6B, unsloth · ~37 GB VRAM
Nemotron Cascade 2 30B A3B: 31.6B, nvidia · ~22 GB VRAM
GLM 4.7 Flash: 31.2B, zai-org · ~28 GB VRAM
GLM 4.7 Flash: 31.2B, unsloth · ~28 GB VRAM
Qwen3 30B A3B: 30.5B, Qwen · ~22 GB VRAM
Qwen3 Coder 30B A3B Instruct: 30.5B, Qwen · ~22 GB VRAM
Qwen3 30B A3B Instruct 2507: 30.5B, Qwen · ~22 GB VRAM
Qwen3 30B A3B abliterated: 30.5B, mlabonne · ~22 GB VRAM
Qwen3 30B A3B Thinking 2507: 30.5B, Qwen · ~22 GB VRAM
lynx instruct 30b: 30.5B, bineric · ~22 GB VRAM
North Mini Code 1.0: 30.5B, CohereLabs · ~22 GB VRAM
Gemma 2 27B Instruct: 27B, Google · ~22 GB VRAM
Gemma 3 27B: 27B, Google · ~30 GB VRAM
Qwen3.6 27B OBLITERATED: 26.9B, OBLITERATUS · ~22 GB VRAM
Trinity Mini: 26.1B, arcee-ai · ~19 GB VRAM
gemma 4 26B A4B it uncensored: 25.8B, TrevorJS · ~30 GB VRAM
LFM2 24B A2B: 23.8B, LiquidAI · ~18 GB VRAM
Mistral Small 3 (24B, 2501): 23.6B, Mistral AI · ~20 GB VRAM
EuroLLM 22B Instruct 2512: 22.6B, utter-project · ~19 GB VRAM
gpt oss safeguard 20b: 21.5B, openai · ~16 GB VRAM
gpt-oss-20b: 21B, OpenAI · ~15 GB VRAM
Gemma 4 31B IT NVFP4: 20.9B, nvidia · ~25 GB VRAM
gpt oss 20b BF16: 20.9B, unsloth · ~15 GB VRAM
gpt neox 20b: 20.7B, EleutherAI · ~17 GB VRAM
Qwen3.6 27B Claude Opus Sonnet Distilled NVFP4 MTP: 19.6B, Brian6145 · ~23 GB VRAM
Qwen3.6 27B AEON Ultimate Uncensored NVFP4: 19.1B, AEON-7 · ~23 GB VRAM
Qwen3.6 35B A3B NVFP4: 18.7B, nvidia · ~22 GB VRAM
GLM 4.7 Flash NVFP4: 18.4B, GadflyII · ~19 GB VRAM
NVIDIA Nemotron 3 Nano 30B A3B NVFP4: 18.2B, nvidia · ~13 GB VRAM
Qwen3 30B A3B NVFP4: 17.5B, RedHatAI · ~13 GB VRAM
Qwen3 32B NVFP4: 17.2B, nvidia · ~15 GB VRAM
Param2 17B A2.4B Thinking: 17.2B, bharatgenai · ~12 GB VRAM
Huihui Qwen3.6 27B abliterated NVFP4 MTP: 17.1B, sakamakismile · ~20 GB VRAM
Qwen3.6 27B AEON Ultimate Uncensored Multimodal NVFP4 MTP XS: 17.1B, AEON-7 · ~20 GB VRAM
Qwen3.6 27B Text NVFP4 MTP: 16.7B, sakamakismile · ~20 GB VRAM
LLaDA2.0 mini: 16.3B, inclusionAI · ~12 GB VRAM
starcoder: 15.8B, bigcode · ~19 GB VRAM
DeepSeek-Coder-V2-Lite Instruct: 15.7B, DeepSeek · ~11 GB VRAM
DeepSeek V2 Lite Chat: 15.7B, deepseek-ai · ~15 GB VRAM
DeepSeek V2 Lite: 15.7B, deepseek-ai · ~15 GB VRAM
Gemma 4 26B A4B it NVFP4: 15.1B, bg-digitalservices · ~18 GB VRAM
Qwen2.5 Coder 14B Instruct: 14.8B, Qwen · ~14 GB VRAM
Qwen2.5 14B Instruct: 14.8B, Qwen · ~14 GB VRAM
Qwen3 14B: 14.8B, Qwen · ~13 GB VRAM
Qwen3 14B Base: 14.8B, Qwen · ~13 GB VRAM
Qwen3 14B Instruct: 14.8B, OpenPipe · ~13 GB VRAM
Gemma 4 26B A4B NVFP4: 14.4B, nvidia · ~17 GB VRAM
diffusiongemma 26B A4B it NVFP4: 14.4B, nvidia · ~17 GB VRAM
Qwen1.5 MoE A2.7B: 14.3B, Qwen · ~12 GB VRAM
Phi-4: 14B, Microsoft · ~13 GB VRAM
Llama 2 13b chat hf: 13B, meta-llama · ~16 GB VRAM
HarmBench Llama 2 13b cls: 13B, cais · ~11 GB VRAM
NVIDIA Nemotron Nano 12B v2: 12.3B, nvidia · ~15 GB VRAM
MN 12B Mag Mell R1: 12.2B, inflatebot · ~12 GB VRAM
Gemma 3 12B: 12B, Google · ~13 GB VRAM
Gemma 4 12B OBLITERATED: 12B, OBLITERATUS · ~14 GB VRAM
Apertus 70B Instruct 2509 quantized.w4a16: 11.3B, RedHatAI · ~14 GB VRAM
Falcon3-10B Instruct: 10B, TII · ~10 GB VRAM
Darwin 9B NEG: 9.7B, ansulev · ~12 GB VRAM
SeeClick: 9.7B, cckevinn · ~12 GB VRAM
gemma 2 9b: 9.2B, google · ~11 GB VRAM
Gemma 2 9B Instruct: 9B, Google · ~9.0 GB VRAM
NVIDIA Nemotron Nano 9B v2: 8.9B, nvidia · ~10 GB VRAM
NVIDIA Nemotron Nano 9B v2 Japanese: 8.9B, nvidia · ~10 GB VRAM
internlm3 8b instruct: 8.8B, internlm · ~8.0 GB VRAM
Nemotron Labs Diffusion 8B Base: 8.5B, nvidia · ~7.0 GB VRAM
LFM2.5 8B A1B: 8.5B, LiquidAI · ~7.0 GB VRAM
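To check which entries fit a smaller card, the listed figures can be filtered against a VRAM budget. The sketch below is only illustrative: the `Model` tuple and the `fits` helper are hypothetical names, the six rows are copied from the list above, and sorting by parameter count is an assumed stand-in for the page's "most capable first" ordering. The listed VRAM values are approximate, so a model right at the edge of a budget may not fit in practice.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    """One row of the listing (hypothetical structure for this sketch)."""
    name: str
    params_b: float  # parameter count in billions, as listed
    vram_gb: float   # approximate VRAM at the default quant, as listed


# A handful of rows copied from the list above.
MODELS = [
    Model("StableBeluga2", 69.0, 48),
    Model("Mixtral 8x7B Instruct", 46.7, 34),
    Model("Qwen3-32B", 32.8, 25),
    Model("Gemma 3 27B", 27.0, 30),
    Model("gpt-oss-20b", 21.0, 15),
    Model("Phi-4", 14.0, 13),
]


def fits(models: List[Model], budget_gb: float) -> List[Model]:
    """Return models whose listed VRAM is within budget_gb.

    Results are sorted by parameter count, largest first. That ordering
    is an assumption of this sketch, not the page's ranking method.
    """
    candidates = [m for m in models if m.vram_gb <= budget_gb]
    return sorted(candidates, key=lambda m: m.params_b, reverse=True)


if __name__ == "__main__":
    # Which of the sample rows fit on a single 24 GB GPU?
    for m in fits(MODELS, 24):
        print(f"{m.name}: {m.params_b}B, ~{m.vram_gb} GB VRAM")
```

With the sample rows, a 24 GB budget keeps only gpt-oss-20b and Phi-4, while a 48 GB budget keeps all six.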
Prices as of 2026-06-17. We're an honest advisor: $0 markup, your own accounts, we never resell compute. © 2026 Cynosure LLC.