
AI models that run on 48 GB of VRAM

431 open models fit in 48 GB of VRAM at their default quantization. That budget matches a single workstation card such as an RTX A6000, or two 24 GB GPUs. The list is ordered most capable first, and any of these models can be run for $0 on your own hardware.

StableBeluga2 — 69B, petals-team · ~48 GB VRAM
Laguna S 2.1 NVFP4 — 67.9B, poolside · ~48 GB VRAM
NVIDIA Nemotron 3 Super 120B A12B NVFP4 — 67.2B, nvidia · ~47 GB VRAM
Kimi Linear 48B A3B Instruct — 49.1B, moonshotai · ~36 GB VRAM
Mixtral 8x7B Instruct — 46.7B, Mistral AI · ~34 GB VRAM
NVIDIA Nemotron Labs 3 Puzzle 75B A9B NVFP4 — 44.5B, nvidia · ~33 GB VRAM
Phi 3.5 MoE instruct — 41.9B, microsoft · ~31 GB VRAM
Karnak 40B v1.0 — 40.7B, Applied-Innovation-Center · ~29 GB VRAM
IQuest Coder V1 40B Instruct — 39.8B, IQuestLab · ~33 GB VRAM
Seed OSS 36B Instruct — 36.2B, ByteDance-Seed · ~27 GB VRAM
Hermes 4.3 36B — 36.2B, NousResearch · ~27 GB VRAM
Qwen3.6 35B A3B Claude 4.7 Opus Reasoning Distilled — 36B, lordx64 · ~42 GB VRAM
Qwopus3.6 35B A3B Coder — 36B, Jackrong · ~42 GB VRAM
Huihui Qwen3.6 35B A3B Claude 4.7 Opus abliterated — 36B, huihui-ai · ~42 GB VRAM
Apodex 1.0 mini — 36B, apodex · ~42 GB VRAM
Agents A1 — 35.1B, InternScience · ~41 GB VRAM
Nex N2 mini — 35.1B, nex-agi · ~41 GB VRAM
Qwen3.6 35B A3B abliterated v4 — 34.7B, Bahushruth · ~25 GB VRAM
Qwen AgentWorld 35B A3B — 34.7B, Qwen · ~40 GB VRAM
Qwen3.6 35B A3B DSV4Pro Thinking Distill — 34.7B, nerkyor · ~40 GB VRAM
Ornith 1.0 35B MTPLX — 34.7B, wang-yang · ~40 GB VRAM
Yi-1.5-34B-Chat — 34.4B, 01.AI · ~25 GB VRAM
dolphin 2.9.1 yi 1.5 34b — 34.4B, dphn · ~26 GB VRAM
Laguna XS.2 — 33.4B, poolside · ~24 GB VRAM
Laguna XS 2.1 — 33.4B, poolside · ~24 GB VRAM
HyperCLOVAX SEED Think 32B — 33.3B, naver-hyperclovax · ~39 GB VRAM
Qwen3-32B — 32.8B, Alibaba · ~25 GB VRAM
Qwen2.5 32B Instruct — 32.8B, Qwen · ~27 GB VRAM
cogito v1 preview qwen 32B — 32.8B, deepcogito · ~27 GB VRAM
QwQ 32B — 32.8B, Qwen · ~27 GB VRAM
DeepSeek-R1-Distill-Qwen-32B — 32.5B, DeepSeek · ~27 GB VRAM
gemma 4 31B it NVFP4 turbo — 32.5B, LilaRest · ~38 GB VRAM
granite 4.0 h small — 32.2B, ibm-granite · ~25 GB VRAM
Olmo 3 1125 32B — 32.2B, allenai · ~26 GB VRAM
sarvam 30b — 32.2B, sarvamai · ~22 GB VRAM
OTel 2.0 LLM 31B IT — 32.1B, farbodtavakkoli · ~33 GB VRAM
llm jp 4 32b a3b thinking — 32.1B, llm-jp · ~23 GB VRAM
Qwen2.5-Coder 32B Instruct — 32B, Alibaba · ~25 GB VRAM
NVIDIA Nemotron 3 Nano 30B A3B BF16 — 31.6B, nvidia · ~22 GB VRAM
Nemotron 3 Nano 30B A3B — 31.6B, unsloth · ~37 GB VRAM
NVIDIA Nemotron 3 Nano 30B A3B Base BF16 — 31.6B, nvidia · ~37 GB VRAM
Nemotron Cascade 2 30B A3B — 31.6B, nvidia · ~22 GB VRAM
Gemma 4 Garnet V2 31B it ultra uncensored heretic — 31.3B, llmfan46 · ~37 GB VRAM
GLM 4.7 Flash — 31.2B, zai-org · ~28 GB VRAM
GLM 4.7 Flash — 31.2B, unsloth · ~28 GB VRAM
Qwen3 30B A3B — 30.5B, Qwen · ~22 GB VRAM
Qwen3 Coder 30B A3B Instruct — 30.5B, Qwen · ~22 GB VRAM
Qwen3 30B A3B Instruct 2507 — 30.5B, Qwen · ~22 GB VRAM
Qwen3 30B A3B abliterated — 30.5B, mlabonne · ~22 GB VRAM
Qwen3 30B A3B Thinking 2507 — 30.5B, Qwen · ~22 GB VRAM
Tongyi DeepResearch 30B A3B — 30.5B, Alibaba-NLP · ~22 GB VRAM
Qwen3 30B A3B Base — 30.5B, Qwen · ~22 GB VRAM
lynx instruct 30b — 30.5B, bineric · ~22 GB VRAM
North Mini Code 1.0 — 30.5B, CohereLabs · ~22 GB VRAM
granite 4.1 30b — 28.9B, ibm-granite · ~24 GB VRAM
droplychee 1.0 27b — 27.8B, droplychee · ~33 GB VRAM
Qwopus3.6 27B Coder — 27.8B, Jackrong · ~33 GB VRAM
Qwen3.6 27B AEON Ultimate Uncensored BF16 — 27.4B, AEON-7 · ~32 GB VRAM
Gemma 2 27B Instruct — 27B, Google · ~22 GB VRAM
Gemma 3 27B — 27B, Google · ~30 GB VRAM
medgemma 27b text it — 27B, google · ~32 GB VRAM
Qwen3.6 27B OBLITERATED — 26.9B, OBLITERATUS · ~22 GB VRAM
Qwen3.6 27B DSV4Pro Thinking Distill — 26.9B, nerkyor · ~31 GB VRAM
qwen27B Agent R2 abliterated preview — 26.9B, hotdogs · ~22 GB VRAM
Trinity Mini — 26.1B, arcee-ai · ~19 GB VRAM
gemma 4 26B A4B it uncensored — 25.8B, TrevorJS · ~30 GB VRAM
Dolphin Mistral 24B Venice Edition — 24B, dphn · ~28 GB VRAM
LFM2 24B A2B — 23.8B, LiquidAI · ~18 GB VRAM
Mistral Small 3 (24B, 2501) — 23.6B, Mistral AI · ~20 GB VRAM
EuroLLM 22B Instruct 2512 — 22.6B, utter-project · ~19 GB VRAM
solar pro preview instruct — 22.1B, upstage · ~17 GB VRAM
Ornith 1.0 35B PrismaAURA 4.75bit vllm MTP — 21.9B, rdtand · ~26 GB VRAM
gpt oss safeguard 20b — 21.5B, openai · ~16 GB VRAM
gpt-oss-20b — 21B, OpenAI · ~15 GB VRAM
Ornith 1.0 35B AEON Ultimate Uncensored NVFP4 — 21B, AEON-7 · ~25 GB VRAM
Gemma 4 31B IT NVFP4 — 20.9B, nvidia · ~25 GB VRAM
gpt oss 20b BF16 — 20.9B, unsloth · ~15 GB VRAM
gpt neox 20b — 20.7B, EleutherAI · ~17 GB VRAM
Qwen3.6 27B Claude Opus Sonnet Distilled NVFP4 MTP — 19.6B, Brian6145 · ~23 GB VRAM
Qwen3.6 27B AEON Ultimate Uncensored Multimodal NVFP4 MTP — 19.6B, AEON-7 · ~23 GB VRAM
Qwen3.6 27B AEON Ultimate Uncensored NVFP4 — 19.1B, AEON-7 · ~23 GB VRAM
Agents A1 NVFP4 — 18.9B, r0b0tlab · ~22 GB VRAM
Qwen3.6 35B A3B NVFP4 — 18.7B, nvidia · ~22 GB VRAM
GLM 4.7 Flash NVFP4 — 18.4B, GadflyII · ~19 GB VRAM
NVIDIA Nemotron 3 Nano 30B A3B NVFP4 — 18.2B, nvidia · ~13 GB VRAM
Qwen3.6 27B NVFP4 — 18.2B, nvidia · ~21 GB VRAM
Qwen3 30B A3B NVFP4 — 17.5B, RedHatAI · ~13 GB VRAM
Qwen3 32B NVFP4 — 17.2B, nvidia · ~15 GB VRAM
Param2 17B A2.4B Thinking — 17.2B, bharatgenai · ~12 GB VRAM
Qwen3.6 27B NVFP4 — 17.1B, ocicek · ~20 GB VRAM
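For a model that is not on the list, a back-of-the-envelope check against the 48 GB budget can be sketched as below. This is only a rough rule of thumb, not the method behind the figures above (the page does not state how its estimates are computed, so the numbers will not reproduce the list exactly). The function names, the `bits_per_weight` parameter, and the 20% `overhead_fraction` for KV cache and activations are all assumptions made for illustration.

```python
def estimate_vram_gb(params_billion: float,
                     bits_per_weight: float,
                     overhead_fraction: float = 0.2) -> float:
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes).

    Weights take params * bits / 8 bytes. The overhead fraction is an
    assumed allowance for KV cache, activations, and runtime buffers;
    real overhead depends on context length and the inference engine.
    """
    if params_billion <= 0 or bits_per_weight <= 0 or overhead_fraction < 0:
        raise ValueError("inputs must be positive (overhead may be zero)")
    weight_gb = params_billion * bits_per_weight / 8
    return weight_gb * (1 + overhead_fraction)


def fits_budget(params_billion: float,
                bits_per_weight: float,
                budget_gb: float = 48.0,
                overhead_fraction: float = 0.2) -> bool:
    """True if the rough estimate is within the VRAM budget."""
    estimate = estimate_vram_gb(params_billion, bits_per_weight, overhead_fraction)
    return estimate <= budget_gb


if __name__ == "__main__":
    # Hypothetical inputs: a 32.8B-parameter model at three precisions.
    for bits in (4, 8, 16):
        gb = estimate_vram_gb(32.8, bits)
        print(f"32.8B at {bits}-bit: ~{gb:.1f} GB, fits 48 GB: {fits_budget(32.8, bits)}")
```

Under these assumptions, the same model can fall inside or outside the budget depending only on the quantization chosen, which is why each entry is listed at a specific default quantization.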

Other GPU budgets

8 GB of VRAM · 16 GB of VRAM · 24 GB of VRAM · All models
