The best open LLMs you can run on 12 GB of VRAM

Open LLMs that fit in 12 GB of VRAM at their default quantization: the sweet spot for an RTX 3060 12 GB, RTX 4070, or RX 6700 XT. Listed from the largest model that still fits downward, with three honest costs for each: $0 running locally, renting a GPU, or using your own API key. We guarantee the fit; you judge which one you like best.

How this is ranked: by an objective fit filter only. This list fills the gap between the 8 GB and 16 GB tiers. 'Best' means 'runs on a 12 GB card.' VRAM is engine-computed, and the ordering is by size, not by a quality ranking we'd have to invent.
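
A fit filter of the kind described above can be sketched in a few lines. This is only an illustration, not the engine behind this list: the formula (weight bytes plus a flat overhead), the 1.5 GiB overhead, the 4.5 bits-per-weight figure, and the model names are all assumptions made up for the example.

```python
from dataclasses import dataclass

GIB = 1024 ** 3  # bytes per GiB


@dataclass(frozen=True)
class Model:
    name: str               # hypothetical label, not a real catalog entry
    params_b: float         # parameter count, in billions
    bits_per_weight: float  # effective bits per weight at the chosen quant


def estimated_vram_gib(model: Model, overhead_gib: float = 1.5) -> float:
    """Rough estimate: quantized weights plus a flat allowance.

    The flat overhead (KV cache, activations, runtime buffers) is an
    assumption; real usage depends on context length and the runtime.
    """
    weight_bytes = model.params_b * 1e9 * model.bits_per_weight / 8
    return weight_bytes / GIB + overhead_gib


def fits(models: list[Model], budget_gib: float = 12.0) -> list[Model]:
    """Keep models within the budget, largest estimated footprint first."""
    fitting = [m for m in models if estimated_vram_gib(m) <= budget_gib]
    return sorted(fitting, key=estimated_vram_gib, reverse=True)


# Made-up candidates at an assumed ~4.5 bits per weight.
CANDIDATES = [
    Model("example-7b", 7, 4.5),
    Model("example-34b", 34, 4.5),
    Model("example-14b", 14, 4.5),
]

ranked = fits(CANDIDATES)
for m in ranked:
    print(f"{m.name}: ~{estimated_vram_gib(m):.1f} GiB")
```

The filter is a yes/no test against the budget, and the sort uses size alone, so nothing in it expresses an opinion about model quality.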

Showing the top 40 of 211. See all →


Open the free Spanvero advisor → · Honest, $0-markup. © 2026 Cynosure LLC.