
All AI models

Browse 527 open models by size, by publisher, or by the GPU you have. Every one shows the genuinely cheapest way to run it — free on your machine, a rented GPU at the vendor's price, or your own API key.

By size: Flagship (80B+) · Large (34–80B) · Medium (14–34B) · Small (4–14B) · Tiny (under 4B)

By your GPU: 8 GB of VRAM · 16 GB of VRAM · 24 GB of VRAM · 48 GB of VRAM

By publisher: Qwen (Alibaba) · Meta Llama · DeepSeek · Google (Gemma) · Microsoft (Phi) · Mistral AI · NVIDIA · OpenAI (gpt-oss) · IBM Granite · Z.ai (GLM) · Moonshot AI (Kimi) · MiniMax · Cohere · Ai2 (OLMo) · Nous Research · Liquid AI · TII (Falcon) · EleutherAI · InternLM · Xiaomi (MiMo)
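
The size bands used on this page (Flagship, Large, Medium, Small, Tiny, as named in the section headings) share their boundary values, so a model at exactly 14B or 34B could read as belonging to two bands. The listings suggest the lower bound is inclusive: a 14B model appears under Medium and a 34.4B model under Large. A minimal sketch of that reading follows; the function name `size_tier` and the inclusive-lower-bound rule are assumptions for illustration, not the site's actual logic.

```python
from typing import List, Tuple

# Lower bounds in billions of parameters, copied from the section headings.
# Treating each lower bound as inclusive is an assumption inferred from the
# listings, not a documented rule.
TIERS: List[Tuple[float, str]] = [
    (80.0, "Flagship"),
    (34.0, "Large"),
    (14.0, "Medium"),
    (4.0, "Small"),
]


def size_tier(params_b: float) -> str:
    """Return the size band for a parameter count given in billions."""
    if params_b <= 0:
        raise ValueError("parameter count must be positive")
    # Walk from the largest band down; the first lower bound met wins.
    for lower_bound, name in TIERS:
        if params_b >= lower_bound:
            return name
    return "Tiny"


if __name__ == "__main__":
    for count in (671.0, 79.7, 34.4, 14.0, 4.0, 3.8):
        print(f"{count}B -> {size_tier(count)}")
```

The counts here are the listed parameter counts, which for quantized repackagings can differ from the number in the model name, so a model named "120B" may still land in a lower band.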

Flagship (80B+)

The biggest open models: multi-GPU or API territory. See all 75 →

DeepSeek-R1 — 671B, DeepSeek · $1.60/1M API
DeepSeek-V3 — 671B, DeepSeek · $0.50/1M API
Llama 3.1 405B Instruct — 405B, Meta · $0.80/1M API last-known
Kimi K2 Instruct — 1043B, Moonshot AI · $8.44/1M API est.
gpt-oss-120b — 117B, OpenAI · $0.10/1M API
Qwen3-235B-A22B — 235B, Alibaba · $1.14/1M API
DeepSeek R1 0528 — 684.5B, deepseek-ai · $1.33/1M API
Llama 4 Maverick (17B-128E) — 402B, Meta · $0.38/1M API last-known
Llama 4 Scout (17B-16E) — 109B, Meta · $0.20/1M API last-known
DeepSeek V4 Pro — 861.6B, deepseek-ai · $0.65/1M API
DeepSeek V3.2 — 685.4B, deepseek-ai · $0.33/1M API
Mistral Large 2 (2407) — 123B, Mistral AI · $1.08/1M API est.
DeepSeek V4 Flash — 158.1B, deepseek-ai · $0.21/1M API
MiniMax M2.7 — 228.7B, MiniMaxAI · $0.63/1M API
Kimi K2 Instruct 0905 — 1026.5B, moonshotai · $1.55/1M API
DeepSeek R1 0528 NVFP4 v2 — 393.6B, nvidia · $3.25/1M API est.
NVIDIA Nemotron 3 Super 120B A12B BF16 — 123.6B, nvidia · $1.09/1M API est.
DeepSeek V3 0324 — 684.5B, deepseek-ai · $5.58/1M API est.
Command R+ (08-2024) — 104B, Cohere · $0.93/1M API est.
MiniMax M2.5 — 228.7B, MiniMaxAI · $0.53/1M API
MiniMax M2.7 NVFP4 — 116.3B, nvidia · $1.03/1M API est.
Step 3.5 Flash — 199.4B, stepfun-ai · $1.70/1M API est.
NVIDIA Nemotron 3 Ultra 550B A55B NVFP4 — 335B, nvidia · $2.78/1M API est.
GLM 4.5 Air — 110.5B, zai-org · $0.49/1M API
Qwen3 Next 80B A3B Instruct — 81.3B, Qwen · $0.60/1M API
MiMo V2.5 — 310.8B, XiaomiMiMo · $2.59/1M API est.
Llama 3.1 405B — 405.9B, meta-llama · $3.35/1M API est.
DeepSeek V3.1 — 684.5B, deepseek-ai · $5.58/1M API est.
DeepSeek V3.2 Exp — 685.4B, deepseek-ai · $0.34/1M API
Qwen3 235B A22B NVFP4 — 132.8B, nvidia · $1.16/1M API est.
LLaDA2.1 flash — 102.9B, inclusionAI · $0.92/1M API est.
DeepSeek V4 Flash NVFP4 — 166.7B, nvidia · $1.43/1M API est.
Kimi K2 Thinking — 1058.1B, moonshotai · $1.55/1M API
GLM 4.5 — 358.3B, zai-org · $1.40/1M API
GLM 4.5 Air — 110.5B, unsloth · $0.98/1M API est.
Qwen3 235B A22B Instruct 2507 — 235.1B, Qwen · $1.98/1M API est.
GLM 5.1 — 753.9B, zai-org · $2.00/1M API
MiniMax M2 — 228.7B, MiniMaxAI · $0.64/1M API
GLM 5.1 NVFP4 — 381.5B, nvidia · $3.15/1M API est.
gpt oss 120b — 120.4B, unsloth · $1.06/1M API est.
MiniMax M2.7 REAP 172B A10B NVFP4 GB10 — 97.6B, scottgl · $0.88/1M API est.
NVIDIA Nemotron 3 Ultra 550B A55B BF16 — 560.5B, nvidia · $4.58/1M API est.
DeepSeek V4 Pro NVFP4 — 910B, nvidia · $7.38/1M API est.
MiniMax M2.5 NVFP4 — 116.3B, nvidia · $1.03/1M API est.
GLM 5.2 NVFP4 — 381B, nvidia · $3.15/1M API est.
GLM 5 NVFP4 — 435.2B, nvidia · $3.58/1M API est.
DeepSeek V4 Flash DSpark — 165.3B, deepseek-ai · $1.42/1M API est.
Ornith 1.0 397B — 396.8B, deepreinforce-ai · $3.27/1M API est.

Large (34–80B)

Top quality that still fits a single big rented GPU.

Llama 3.1 70B Instruct — 70B, Meta · $0.40/1M API
Llama 3.3 70B Instruct — 70B, Meta · $0.27/1M API
Mixtral 8x7B Instruct — 46.7B, Mistral AI · $0.24/1M API last-known
Qwen2.5 72B Instruct — 72B, Alibaba · $0.68/1M API est.
dolphin 2.9.1 yi 1.5 34b — 34.4B, dphn · $0.38/1M API est.
NVIDIA Nemotron 3 Super 120B A12B NVFP4 — 67.2B, nvidia · $0.64/1M API est.
Qwen3 Coder Next — 79.7B, Qwen · $0.46/1M API
Llama 3.3 Nemotron Super 49B v1.5 — 49.9B, nvidia · $0.50/1M API est.
Qwen 72B — 72.3B, Qwen · $0.68/1M API est.
Qwen3.6 35B A3B abliterated v4 — 34.7B, Bahushruth · $0.38/1M API est.
Qwen2.5 72B Instruct abliterated — 72.7B, huihui-ai · $0.68/1M API est.
Yi-1.5-34B-Chat — 34.4B, 01.AI · $0.38/1M API est.
Meta Llama 3 70B — 70.6B, meta-llama · $0.66/1M API est.
Karnak 40B v1.0 — 40.7B, Applied-Innovation-Center · $0.43/1M API est.
Qwen3.5 122B A10B NVFP4 — 64.4B, txn545 · $0.62/1M API est.
Laguna S 2.1 NVFP4 — 67.9B, poolside · $0.64/1M API est.
Llama 3.3 Nemotron Super 49B v1 — 49.9B, nvidia · $0.50/1M API est.
Phi 3.5 MoE instruct — 41.9B, microsoft · $0.44/1M API est.
Meta Llama 3.1 70B Instruct quantized.w4a16 — 70.6B, RedHatAI · $0.66/1M API est.
dolphin 2.9.1 llama 3 70b — 70.6B, dphn · $0.66/1M API est.
StableBeluga2 — 69B, petals-team · $0.65/1M API est.
Kimi Linear 48B A3B Instruct — 49.1B, moonshotai · $0.49/1M API est.
DeepSeek R1 Distill Llama 70B — 70.6B, deepseek-ai · $0.80/1M API
Llama 2 70b chat hf — 69B, meta-llama · $0.65/1M API est.
Llama 3.1 70B — 70.6B, meta-llama · $0.66/1M API est.
Meta Llama 3 70B Instruct — 70.6B, meta-llama · $0.66/1M API est.
NVIDIA Nemotron Labs 3 Puzzle 75B A9B NVFP4 — 44.5B, nvidia · $0.46/1M API est.
Hermes 3 Llama 3.1 70B — 70.6B, NousResearch · $0.70/1M API
IQuest Coder V1 40B Instruct — 39.8B, IQuestLab · $0.42/1M API est.
Agents A1 — 35.1B, InternScience · $0.38/1M API est.
Seed OSS 36B Instruct — 36.2B, ByteDance-Seed · $0.39/1M API est.
Hermes 4.3 36B — 36.2B, NousResearch · $0.39/1M API est.
Qwen AgentWorld 35B A3B — 34.7B, Qwen · $0.38/1M API est.
Qwen3.6 35B A3B DSV4Pro Thinking Distill — 34.7B, nerkyor · $0.38/1M API est.
Qwen3.5 122B A10B heretic MTP NVFP4 — 73.7B, OptimizeLLM · $0.69/1M API est.
Apertus 70B MeditronFO — 70.6B, EPFLiGHT · $0.66/1M API est.
Ornith 1.0 35B MTPLX — 34.7B, wang-yang · $0.38/1M API est.
Nex N2 mini — 35.1B, nex-agi · $0.06/1M API
Qwen3.6 35B A3B Claude 4.7 Opus Reasoning Distilled — 36B, lordx64 · $0.39/1M API est.
Qwopus3.6 35B A3B Coder — 36B, Jackrong · $0.39/1M API est.
Huihui Qwen3.6 35B A3B Claude 4.7 Opus abliterated — 36B, huihui-ai · $0.39/1M API est.
Apodex 1.0 mini — 36B, apodex · $0.39/1M API est.
Qwen3.5 122B A10B NVFP4 — 64.6B, nvidia · $0.62/1M API est.
Nemotron Labs TwoTower 30B A3B Base BF16 — 63.2B, nvidia · $0.61/1M API est.

Medium (14–34B)

The single-GPU sweet spot: strong and self-hostable. See all 98 →

Qwen2.5-Coder 32B Instruct — 32B, Alibaba · $0.83/1M API
gpt-oss-20b — 21B, OpenAI · $0.09/1M API
DeepSeek-R1-Distill-Qwen-32B — 32.5B, DeepSeek · $0.36/1M API est.
Qwen3-32B — 32.8B, Alibaba · $0.18/1M API
Gemma 3 27B — 27B, Google · $0.27/1M API
Phi-4 — 14B, Microsoft · $0.11/1M API
Gemma 2 27B Instruct — 27B, Google · $0.65/1M API
Mistral Small 3 (24B, 2501) — 23.6B, Mistral AI · $0.07/1M API
Qwen2.5 Coder 14B Instruct — 14.8B, Qwen · $0.22/1M API est.
Qwen3 30B A3B — 30.5B, Qwen · $0.31/1M API
Qwen2.5 14B Instruct — 14.8B, Qwen · $0.22/1M API est.
Qwen3.6 35B A3B NVFP4 — 18.7B, nvidia · $0.25/1M API est.
Qwen3 14B — 14.8B, Qwen · $0.57/1M API
Gemma 4 31B IT NVFP4 — 20.9B, nvidia · $0.27/1M API est.
Qwen3 Coder 30B A3B Instruct — 30.5B, Qwen · $0.17/1M API
GLM 4.7 Flash — 31.2B, zai-org · $0.23/1M API
NVIDIA Nemotron 3 Nano 30B A3B BF16 — 31.6B, nvidia · $0.35/1M API est.
DeepSeek-Coder-V2-Lite Instruct — 15.7B, DeepSeek · $0.23/1M API est.
Gemma 4 26B A4B NVFP4 — 14.4B, nvidia · $0.22/1M API est.
OTel 2.0 LLM 31B IT — 32.1B, farbodtavakkoli · $0.36/1M API est.
DeepSeek V2 Lite Chat — 15.7B, deepseek-ai · $0.23/1M API est.
Qwen2.5 32B Instruct — 32.8B, Qwen · $0.36/1M API est.
Qwen3 30B A3B Instruct 2507 — 30.5B, Qwen · $0.12/1M API
NVIDIA Nemotron 3 Nano 30B A3B NVFP4 — 18.2B, nvidia · $0.25/1M API est.
gpt neox 20b — 20.7B, EleutherAI · $0.27/1M API est.
Qwen3.6 27B NVFP4 — 18.2B, nvidia · $0.25/1M API est.
granite 4.0 h small — 32.2B, ibm-granite · $0.36/1M API est.
DeepSeek R1 Distill Qwen 14B — 14.8B, deepseek-ai · $0.22/1M API est.
Qwen3.6 27B Text NVFP4 MTP — 16.7B, sakamakismile · $0.23/1M API est.
Qwen3 30B A3B abliterated — 30.5B, mlabonne · $0.34/1M API est.
GLM 4.7 Flash — 31.2B, unsloth · $0.35/1M API est.
DeepSeek V2 Lite — 15.7B, deepseek-ai · $0.23/1M API est.
droplychee 1.0 27b — 27.8B, droplychee · $0.32/1M API est.
Laguna XS.2 — 33.4B, poolside · $0.15/1M API last-known
diffusiongemma 26B A4B it NVFP4 — 14.4B, nvidia · $0.22/1M API est.
gemma 4 26B A4B it uncensored — 25.8B, TrevorJS · $0.31/1M API est.
GLM 4.7 Flash NVFP4 — 18.4B, GadflyII · $0.25/1M API est.
Qwen3 30B A3B Thinking 2507 — 30.5B, Qwen · $0.85/1M API
LLaDA2.0 mini — 16.3B, inclusionAI · $0.23/1M API est.
gemma 4 31B it NVFP4 turbo — 32.5B, LilaRest · $0.36/1M API est.
Nemotron 3 Nano 30B A3B — 31.6B, unsloth · $0.35/1M API est.
Qwen1.5 MoE A2.7B — 14.3B, Qwen · $0.21/1M API est.
Gemma 4 26B A4B it NVFP4 — 15.1B, bg-digitalservices · $0.22/1M API est.
Agents A1 NVFP4 — 18.9B, r0b0tlab · $0.25/1M API est.
Tongyi DeepResearch 30B A3B — 30.5B, Alibaba-NLP · $0.34/1M API est.
Qwen2.5 14B Instruct — 14.8B, unsloth · $0.22/1M API est.
phi 4 quantized.w4a16 — 14.8B, RedHatAI · $0.22/1M API est.
Gemma 4 Garnet V2 31B it ultra uncensored heretic — 31.3B, llmfan46 · $0.35/1M API est.

Small (4–14B)

Runs on a good laptop or consumer GPU. See all 169 →

Llama 3.1 8B Instruct — 8B, Meta · $0.07/1M API
Qwen2.5 7B Instruct — 7B, Alibaba · $0.07/1M API
Mistral 7B Instruct v0.3 — 7.2B, Mistral AI · $0.20/1M API last-known
Qwen2.5-Coder 7B Instruct — 7B, Alibaba · $0.16/1M API est.
Qwen3 4B — 4B, Qwen · $0.13/1M API est.
Gemma 2 9B Instruct — 9B, Google · $0.06/1M API last-known
Qwen3-8B — 8.2B, Alibaba · $0.29/1M API
Gemma 3 12B — 12B, Google · $0.10/1M API
Qwen3 4B Instruct 2507 — 4B, Qwen · $0.13/1M API est.
Qwen2-VL 7B Instruct — 8B, Alibaba · $0.16/1M API est.
Rio 3.0 Open Mini — 4B, prefeitura-rio · $0.13/1M API est.
Hermes 3 — Llama 3.1 8B — 8B, Nous Research · $0.16/1M API est.
Meta Llama 3 8B Instruct — 8B, meta-llama · $0.16/1M API est.
Mistral 7B Instruct v0.2 — 7.2B, mistralai · $0.16/1M API est.
Llama 3.1 8B — 8B, meta-llama · $0.16/1M API est.
Meta Llama 3 8B — 8B, meta-llama · $0.16/1M API est.
NVIDIA Nemotron 3 Nano 4B BF16 — 4B, nvidia · $0.13/1M API est.
Llama 3.1 8B Instruct (Abliterated) — 8B, mlabonne (community) · $0.16/1M API est.
LLaDA 8B Instruct — 8B, GSAI-ML · $0.16/1M API est.
granite 4.1 8b — 8.8B, ibm-granite · $0.08/1M API
Mistral 7B v0.1 — 7.2B, mistralai · $0.16/1M API est.
Qwen3 4B Base — 4B, Qwen · $0.13/1M API est.
Dolphin 3.0 — Llama 3.1 8B — 8B, Cognitive Computations · $0.16/1M API est.
Llama 2 7b hf — 6.7B, meta-llama · $0.15/1M API est.
Qwen2 7B Instruct — 7.6B, Qwen · $0.16/1M API est.
Qwen3 4B Thinking 2507 — 4B, Qwen · $0.13/1M API est.
Qwen2.5 7B — 7.6B, Qwen · $0.16/1M API est.
NVIDIA Nemotron Nano 9B v2 — 8.9B, nvidia · $0.17/1M API est.
Bielik 11B v3.0 Instruct — 11.2B, speakleash · $0.19/1M API est.
Nemotron Labs Diffusion 8B Base — 8.5B, nvidia · $0.17/1M API est.
deepseek coder 7b instruct v1.5 — 6.9B, deepseek-ai · $0.16/1M API est.
DeepSeek R1 Distill Llama 8B — 8B, deepseek-ai · $0.16/1M API est.
Qwen3 8B Base — 8.2B, Qwen · $0.17/1M API est.
saiga llama3 8b — 8B, IlyaGusev · $0.16/1M API est.
DeepSeek R1 Distill Qwen 7B — 7.6B, deepseek-ai · $0.16/1M API est.
Dream v0 Instruct 7B — 7.6B, Dream-org · $0.16/1M API est.
Qwen2.5 Coder 7B — 7.6B, Qwen · $0.16/1M API est.
VLM2Vec Full — 4.1B, TIGER-Lab · $0.13/1M API est.
Meta Llama 3.1 8B Instruct — 8B, unsloth · $0.16/1M API est.
falcon 7b — 7.2B, tiiuae · $0.16/1M API est.
DeepSeek R1 0528 Qwen3 8B — 8.2B, deepseek-ai · $0.17/1M API est.
gemma 4 E4B it OBLITERATED — 8B, OBLITERATUS · $0.16/1M API est.
Darwin 9B NEG — 9.7B, ansulev · $0.18/1M API est.
Vikhr Nemo 12B Instruct R 21 09 24 — 12.2B, Vikhrmodels · $0.20/1M API est.
Qwen3Guard Gen 4B — 4.4B, Qwen · $0.14/1M API est.
Mistral 7B Instruct v0.1 — 7.2B, mistralai · $0.16/1M API est.
Llama 2 7b chat hf — 6.7B, meta-llama · $0.15/1M API est.
Phi 3 vision 128k instruct — 4.1B, microsoft · $0.13/1M API est.

Tiny (under 4B)

Edge / on-device: runs almost anywhere. See all 141 →

Qwen3 0.6B — 800M, Qwen · $0.11/1M API est.
Qwen2.5 3B Instruct — 3.1B, Qwen · $0.12/1M API est.
Llama 3.2 3B Instruct — 3B, Meta · $0.19/1M API
Qwen2.5 1.5B Instruct — 1.5B, Qwen · $0.11/1M API est.
gemma 3 270m — 300M, google · $0.10/1M API est.
Qwen3 1.7B — 2B, Qwen · $0.12/1M API est.
BGE-M3 — 567M, BAAI · $0.10/1M API est.
Qwen2.5 0.5B Instruct — 500M, Qwen · $0.10/1M API est.
Qwen2 1.5B Instruct — 1.5B, Qwen · $0.11/1M API est.
Llama 3.2 1B Instruct — 1.2B, Meta · $0.11/1M API
Llama 3.2 1B — 1.2B, meta-llama · $0.11/1M API est.
Qwen2.5 0.5B — 500M, Qwen · $0.10/1M API est.
Phi-3.5-mini Instruct — 3.8B, Microsoft · $0.13/1M API est.
gemma 3 1b it — 1B, google · $0.11/1M API est.
TinyLlama 1.1B Chat v1.0 — 1.1B, TinyLlama · $0.11/1M API est.
gpt2 large — 800M, openai-community · $0.11/1M API est.
OpenELM 1.1B Instruct — 1.1B, apple · $0.11/1M API est.
PowerMoE 3b — 3.4B, ibm-research · $0.13/1M API est.
Phi 4 mini instruct — 3.8B, microsoft · $0.22/1M API last-known
Qwen2.5 1.5B — 1.5B, Qwen · $0.11/1M API est.
h2ovl mississippi 800m — 800M, h2oai · $0.11/1M API est.
h2ovl mississippi 2b — 2.2B, h2oai · $0.12/1M API est.
Nomic Embed Text v1.5 — 137M, Nomic AI · $0.10/1M API est.
Qwen2 0.5B — 500M, Qwen · $0.10/1M API est.
SmolLM 1.7B Instruct quantized.w4a16 — 1.8B, nm-testing · $0.11/1M API est.
Qwen2.5 1.5B quantized.w8a8 — 1.8B, RedHatAI · $0.11/1M API est.
Qwen2.5 Math 1.5B — 1.5B, Qwen · $0.11/1M API est.
gpt neo 2.7B — 2.7B, EleutherAI · $0.12/1M API est.
Llama 3.2 3B — 3.2B, meta-llama · $0.13/1M API est.
Qwen2 0.5B Instruct — 500M, Qwen · $0.10/1M API est.
Phi tiny MoE instruct — 3.8B, microsoft · $0.13/1M API est.
Qwen2.5 Coder 1.5B Instruct — 1.5B, Qwen · $0.11/1M API est.
Qwen2.5 Coder 3B — 3.1B, Qwen · $0.12/1M API est.
Qwen3 1.7B Base — 1.7B, Qwen · $0.11/1M API est.
DeepSeek R1 Distill Qwen 1.5B — 1.8B, deepseek-ai · $0.11/1M API est.
Phi 3 mini 4k instruct — 3.8B, microsoft · $0.13/1M API est.
Qwen3 0.6B Base — 600M, Qwen · $0.10/1M API est.
SmolLM3 3B — 3.1B, HuggingFaceTB · $0.12/1M API est.
Qwen2.5 3B — 3.1B, Qwen · $0.12/1M API est.
bloom 560m — 600M, bigscience · $0.10/1M API est.
Qwen3Guard Gen 0.6B — 800M, Qwen · $0.11/1M API est.
phi 2 — 2.8B, microsoft · $0.12/1M API est.
gpt2 medium — 400M, openai-community · $0.10/1M API est.
Llama 3.2 1B Instruct — 1.2B, unsloth · $0.11/1M API est.
Zamba2 1.2B instruct — 1.2B, Zyphra · $0.11/1M API est.
OLMo 2 0425 1B — 1.5B, allenai · $0.11/1M API est.
SmolLM2 360M Instruct — 400M, HuggingFaceTB · $0.10/1M API est.
gemma 2 2b it — 2.6B, google · $0.12/1M API est.

