All AI models
Browse 354 open models by size, by publisher, or by the GPU you have. Every one shows the genuinely cheapest way to run it: free on your machine, a rented GPU at the vendor's price, or your own API key.
By your GPU: 8 GB of VRAM · 16 GB of VRAM · 24 GB of VRAM · 48 GB of VRAM
By publisher: Qwen (Alibaba) · Meta Llama · DeepSeek · Google (Gemma) · Microsoft (Phi) · Mistral AI · NVIDIA · OpenAI (gpt-oss) · IBM Granite · Z.ai (GLM) · Moonshot AI (Kimi) · MiniMax · Cohere · Ai2 (OLMo) · Nous Research · Liquid AI · TII (Falcon) · EleutherAI · InternLM · Xiaomi (MiMo)

Flagship (80B+)
The biggest open models: multi-GPU or API territory.
DeepSeek-R1 — 671B, DeepSeek · $1.60/1M API
DeepSeek-V3 — 671B, DeepSeek · $0.50/1M API
Llama 3.1 405B Instruct — 405B, Meta · $0.80/1M API
Kimi K2 Instruct — 1043B, Moonshot AI · $8.44/1M API
gpt-oss-120b — 117B, OpenAI · $0.11/1M API
Qwen3-235B-A22B — 235B, Alibaba · $1.14/1M API
DeepSeek R1 0528 — 684.5B, deepseek-ai · $1.33/1M API
Llama 4 Maverick (17B-128E) — 402B, Meta · $0.38/1M API
Llama 4 Scout (17B-16E) — 109B, Meta · $0.20/1M API
DeepSeek V4 Pro — 861.6B, deepseek-ai · $0.66/1M API
DeepSeek V3.2 — 685.4B, deepseek-ai · $5.58/1M API
Mistral Large 2 (2407) — 123B, Mistral AI · $1.08/1M API
DeepSeek V4 Flash — 158.1B, deepseek-ai · $1.36/1M API
MiniMax M2.7 — 228.7B, MiniMaxAI · $1.93/1M API
Kimi K2 Instruct 0905 — 1026.5B, moonshotai · $1.55/1M API
DeepSeek R1 0528 NVFP4 v2 — 393.6B, nvidia · $3.25/1M API
NVIDIA Nemotron 3 Super 120B A12B BF16 — 123.6B, nvidia · $1.09/1M API
DeepSeek V3 0324 — 684.5B, deepseek-ai · $5.58/1M API
Command R+ (08-2024) — 104B, Cohere · $0.93/1M API
MiniMax M2.5 — 228.7B, MiniMaxAI · $1.93/1M API
MiniMax M2.7 NVFP4 — 116.3B, nvidia · $1.03/1M API
Step 3.5 Flash — 199.4B, stepfun-ai · $1.70/1M API
NVIDIA Nemotron 3 Ultra 550B A55B NVFP4 — 335B, nvidia · $2.78/1M API
GLM 4.5 Air — 110.5B, zai-org · $0.98/1M API
Qwen3 Next 80B A3B Instruct — 81.3B, Qwen · $0.75/1M API
Llama 3.1 405B — 405.9B, meta-llama · $3.35/1M API
DeepSeek V3.1 — 684.5B, deepseek-ai · $5.58/1M API
DeepSeek V3.2 Exp — 685.4B, deepseek-ai · $0.32/1M API
LLaDA2.1 flash — 102.9B, inclusionAI · $0.92/1M API
DeepSeek V4 Flash NVFP4 — 166.7B, nvidia · $1.43/1M API
Kimi K2 Thinking — 1058.1B, moonshotai · $1.55/1M API
GLM 4.5 — 358.3B, zai-org · $2.97/1M API
Qwen3 235B A22B Instruct 2507 — 235.1B, Qwen · $1.98/1M API
GLM 5.1 — 753.9B, zai-org · $2.03/1M API
MiniMax M2 — 228.7B, MiniMaxAI · $0.63/1M API
NVIDIA Nemotron 3 Ultra 550B A55B BF16 — 560.5B, nvidia · $4.58/1M API
DeepSeek V4 Pro NVFP4 — 910B, nvidia · $7.38/1M API
MiniMax M2.5 NVFP4 — 116.3B, nvidia · $1.03/1M API
MiMo V2.5 Pro — 1023.2B, XiaomiMiMo · $8.29/1M API
LongCat Flash Chat — 561.9B, meituan-longcat · $4.60/1M API
Hy3 preview — 298.8B, tencent · $2.49/1M API
GLM 5 — 753.9B, zai-org · $6.13/1M API
MiMo V2 Flash — 309.8B, XiaomiMiMo · $2.58/1M API
GLM 4.7 — 358.3B, zai-org · $1.08/1M API
sarvam 105b — 106B, sarvamai · $0.95/1M API
Qwen3 Coder 480B A35B Instruct — 480.2B, Qwen · $0.61/1M API
GLM 4.6 — 356.8B, zai-org · $1.09/1M API

Large (34–80B)
Top quality that still fits a single big rented GPU.
Llama 3.1 70B Instruct — 70B, Meta · $0.40/1M API
Llama 3.3 70B Instruct — 70B, Meta · $0.66/1M API
Mixtral 8x7B Instruct — 46.7B, Mistral AI · $0.24/1M API
Qwen2.5 72B Instruct — 72B, Alibaba · $0.68/1M API
dolphin 2.9.1 yi 1.5 34b — 34.4B, dphn · $0.38/1M API
NVIDIA Nemotron 3 Super 120B A12B NVFP4 — 67.2B, nvidia · $0.64/1M API
Qwen3 Coder Next — 79.7B, Qwen · $0.74/1M API
Llama 3 3 Nemotron Super 49B v1 5 — 49.9B, nvidia · $0.50/1M API
Qwen2.5 72B Instruct abliterated — 72.7B, huihui-ai · $0.68/1M API
Yi-1.5-34B-Chat — 34.4B, 01.AI · $0.38/1M API
Meta Llama 3 70B — 70.6B, meta-llama · $0.66/1M API
Karnak 40B v1.0 — 40.7B, Applied-Innovation-Center · $0.43/1M API
Llama 3 3 Nemotron Super 49B v1 — 49.9B, nvidia · $0.50/1M API
Phi 3.5 MoE instruct — 41.9B, microsoft · $0.44/1M API
Meta Llama 3.1 70B Instruct quantized.w4a16 — 70.6B, RedHatAI · $0.66/1M API
StableBeluga2 — 69B, petals-team · $0.65/1M API
Kimi Linear 48B A3B Instruct — 49.1B, moonshotai · $0.49/1M API
DeepSeek R1 Distill Llama 70B — 70.6B, deepseek-ai · $0.66/1M API
Llama 2 70b chat hf — 69B, meta-llama · $0.65/1M API
Llama 3.1 70B — 70.6B, meta-llama · $0.66/1M API
Hermes 3 Llama 3.1 70B — 70.6B, NousResearch · $0.66/1M API
Seed OSS 36B Instruct — 36.2B, ByteDance-Seed · $0.39/1M API
Qwen3.5 122B A10B heretic MTP NVFP4 — 73.7B, OptimizeLLM · $0.69/1M API
Huihui Qwen3.6 35B A3B Claude 4.7 Opus abliterated — 36B, huihui-ai · $0.39/1M API
Qwen3.5 122B A10B NVFP4 — 64.6B, nvidia · $0.62/1M API

Medium (14–34B)
The single-GPU sweet spot: strong and self-hostable.
Qwen2.5-Coder 32B Instruct — 32B, Alibaba · $0.83/1M API
gpt-oss-20b — 21B, OpenAI · $0.09/1M API
DeepSeek-R1-Distill-Qwen-32B — 32.5B, DeepSeek · $0.36/1M API
Qwen3-32B — 32.8B, Alibaba · $0.36/1M API
Gemma 3 27B — 27B, Google · $0.32/1M API
Phi-4 — 14B, Microsoft · $0.21/1M API
Gemma 2 27B Instruct — 27B, Google · $0.65/1M API
Mistral Small 3 (24B, 2501) — 23.6B, Mistral AI · $0.29/1M API
Qwen2.5 Coder 14B Instruct — 14.8B, Qwen · $0.22/1M API
Qwen3 30B A3B — 30.5B, Qwen · $0.34/1M API
Qwen2.5 14B Instruct — 14.8B, Qwen · $0.22/1M API
Qwen3.6 35B A3B NVFP4 — 18.7B, nvidia · $0.25/1M API
Qwen3 14B — 14.8B, Qwen · $0.22/1M API
Gemma 4 31B IT NVFP4 — 20.9B, nvidia · $0.27/1M API
Qwen3 Coder 30B A3B Instruct — 30.5B, Qwen · $0.34/1M API
GLM 4.7 Flash — 31.2B, zai-org · $0.35/1M API
NVIDIA Nemotron 3 Nano 30B A3B BF16 — 31.6B, nvidia · $0.35/1M API
DeepSeek-Coder-V2-Lite Instruct — 15.7B, DeepSeek · $0.23/1M API
Gemma 4 26B A4B NVFP4 — 14.4B, nvidia · $0.22/1M API
DeepSeek V2 Lite Chat — 15.7B, deepseek-ai · $0.23/1M API
Qwen2.5 32B Instruct — 32.8B, Qwen · $0.36/1M API
Qwen3 30B A3B Instruct 2507 — 30.5B, Qwen · $0.34/1M API
NVIDIA Nemotron 3 Nano 30B A3B NVFP4 — 18.2B, nvidia · $0.25/1M API
gpt neox 20b — 20.7B, EleutherAI · $0.27/1M API
granite 4.0 h small — 32.2B, ibm-granite · $0.36/1M API
Qwen3.6 27B Text NVFP4 MTP — 16.7B, sakamakismile · $0.23/1M API
Qwen3 30B A3B abliterated — 30.5B, mlabonne · $0.34/1M API
GLM 4.7 Flash — 31.2B, unsloth · $0.35/1M API
DeepSeek V2 Lite — 15.7B, deepseek-ai · $0.23/1M API
Laguna XS.2 — 33.4B, poolside · $0.37/1M API
diffusiongemma 26B A4B it NVFP4 — 14.4B, nvidia · $0.22/1M API
gemma 4 26B A4B it uncensored — 25.8B, TrevorJS · $0.31/1M API
GLM 4.7 Flash NVFP4 — 18.4B, GadflyII · $0.25/1M API
Qwen3 30B A3B Thinking 2507 — 30.5B, Qwen · $0.34/1M API
LLaDA2.0 mini — 16.3B, inclusionAI · $0.23/1M API
gemma 4 31B it NVFP4 turbo — 32.5B, LilaRest · $0.36/1M API
Nemotron 3 Nano 30B A3B — 31.6B, unsloth · $0.35/1M API
Qwen1.5 MoE A2.7B — 14.3B, Qwen · $0.21/1M API
Gemma 4 26B A4B it NVFP4 — 15.1B, bg-digitalservices · $0.22/1M API
HyperCLOVAX SEED Think 32B — 33.3B, naver-hyperclovax · $0.37/1M API
EuroLLM 22B Instruct 2512 — 22.6B, utter-project · $0.28/1M API
lynx instruct 30b — 30.5B, bineric · $0.34/1M API
cogito v1 preview qwen 32B — 32.8B, deepcogito · $0.36/1M API
Qwen3 30B A3B NVFP4 — 17.5B, RedHatAI · $0.24/1M API
Huihui Qwen3.6 27B abliterated NVFP4 MTP — 17.1B, sakamakismile · $0.24/1M API
Qwen3 32B NVFP4 — 17.2B, nvidia · $0.24/1M API
Qwen3.6 27B AEON Ultimate Uncensored NVFP4 — 19.1B, AEON-7 · $0.25/1M API
Qwen3 14B Base — 14.8B, Qwen · $0.22/1M API

Small (4–14B)
Runs on a good laptop or consumer GPU.
Llama 3.1 8B Instruct — 8B, Meta · $0.03/1M API
Qwen2.5 7B Instruct — 7B, Alibaba · $0.07/1M API
Mistral 7B Instruct v0.3 — 7.2B, Mistral AI · $0.20/1M API
Qwen2.5-Coder 7B Instruct — 7B, Alibaba · $0.16/1M API
Qwen3 4B — 4B, Qwen · $0.13/1M API
Gemma 2 9B Instruct — 9B, Google · $0.06/1M API
Qwen3-8B — 8.2B, Alibaba · $0.17/1M API
Gemma 3 12B — 12B, Google · $0.20/1M API
Qwen3 4B Instruct 2507 — 4B, Qwen · $0.13/1M API
Qwen2-VL 7B Instruct — 8B, Alibaba · $0.16/1M API
Rio 3.0 Open Mini — 4B, prefeitura-rio · $0.13/1M API
Hermes 3 — Llama 3.1 8B — 8B, Nous Research · $0.16/1M API
Meta Llama 3 8B Instruct — 8B, meta-llama · $0.16/1M API
Mistral 7B Instruct v0.2 — 7.2B, mistralai · $0.16/1M API
Llama 3.1 8B — 8B, meta-llama · $0.16/1M API
Meta Llama 3 8B — 8B, meta-llama · $0.16/1M API
NVIDIA Nemotron 3 Nano 4B BF16 — 4B, nvidia · $0.13/1M API
Llama 3.1 8B Instruct (Abliterated) — 8B, mlabonne (community) · $0.16/1M API
LLaDA 8B Instruct — 8B, GSAI-ML · $0.16/1M API
Mistral 7B v0.1 — 7.2B, mistralai · $0.16/1M API
Qwen3 4B Base — 4B, Qwen · $0.13/1M API
Dolphin 3.0 — Llama 3.1 8B — 8B, Cognitive Computations · $0.16/1M API
Llama 2 7b hf — 6.7B, meta-llama · $0.15/1M API
Qwen2 7B Instruct — 7.6B, Qwen · $0.16/1M API
Qwen3 4B Thinking 2507 — 4B, Qwen · $0.13/1M API
Qwen2.5 7B — 7.6B, Qwen · $0.16/1M API
NVIDIA Nemotron Nano 9B v2 — 8.9B, nvidia · $0.17/1M API
Nemotron Labs Diffusion 8B Base — 8.5B, nvidia · $0.17/1M API
deepseek coder 7b instruct v1.5 — 6.9B, deepseek-ai · $0.16/1M API
DeepSeek R1 Distill Llama 8B — 8B, deepseek-ai · $0.16/1M API
Qwen3 8B Base — 8.2B, Qwen · $0.17/1M API
saiga llama3 8b — 8B, IlyaGusev · $0.16/1M API
DeepSeek R1 Distill Qwen 7B — 7.6B, deepseek-ai · $0.16/1M API
Dream v0 Instruct 7B — 7.6B, Dream-org · $0.16/1M API
Qwen2.5 Coder 7B — 7.6B, Qwen · $0.16/1M API
VLM2Vec Full — 4.1B, TIGER-Lab · $0.13/1M API
Meta Llama 3.1 8B Instruct — 8B, unsloth · $0.16/1M API
falcon 7b — 7.2B, tiiuae · $0.16/1M API
DeepSeek R1 0528 Qwen3 8B — 8.2B, deepseek-ai · $0.17/1M API
gemma 4 E4B it OBLITERATED — 8B, OBLITERATUS · $0.16/1M API
Darwin 9B NEG — 9.7B, ansulev · $0.18/1M API
Mistral 7B Instruct v0.1 — 7.2B, mistralai · $0.16/1M API
Llama 2 7b chat hf — 6.7B, meta-llama · $0.15/1M API
Phi 3 vision 128k instruct — 4.1B, microsoft · $0.13/1M API
CodeLlama 7b hf — 6.7B, codellama · $0.15/1M API
NVIDIA Nemotron Nano 9B v2 Japanese — 8.9B, nvidia · $0.17/1M API
Qwen2.5 7B Instruct (Abliterated) — 7B, huihui-ai (community) · $0.16/1M API
llama 7b — 6.7B, huggyllama · $0.15/1M API

Tiny (under 4B)
Edge / on-device: runs almost anywhere.
Qwen3 0.6B — 800M, Qwen · $0.11/1M API
Qwen2.5 3B Instruct — 3.1B, Qwen · $0.12/1M API
Llama 3.2 3B Instruct — 3B, Meta · $0.12/1M API
Qwen2.5 1.5B Instruct — 1.5B, Qwen · $0.11/1M API
gemma 3 270m — 300M, google · $0.10/1M API
Qwen3 1.7B — 2B, Qwen · $0.12/1M API
BGE-M3 — 567M, BAAI · $0.10/1M API
Qwen2.5 0.5B Instruct — 500M, Qwen · $0.10/1M API
Qwen2 1.5B Instruct — 1.5B, Qwen · $0.11/1M API
Llama 3.2 1B Instruct — 1.2B, Meta · $0.11/1M API
Llama 3.2 1B — 1.2B, meta-llama · $0.11/1M API
Qwen2.5 0.5B — 500M, Qwen · $0.10/1M API
Phi-3.5-mini Instruct — 3.8B, Microsoft · $0.13/1M API
gemma 3 1b it — 1B, google · $0.11/1M API
TinyLlama 1.1B Chat v1.0 — 1.1B, TinyLlama · $0.11/1M API
gpt2 large — 800M, openai-community · $0.11/1M API
OpenELM 1 1B Instruct — 1.1B, apple · $0.11/1M API
PowerMoE 3b — 3.4B, ibm-research · $0.13/1M API
Phi 4 mini instruct — 3.8B, microsoft · $0.13/1M API
Qwen2.5 1.5B — 1.5B, Qwen · $0.11/1M API
h2ovl mississippi 800m — 800M, h2oai · $0.11/1M API
h2ovl mississippi 2b — 2.2B, h2oai · $0.12/1M API
Nomic Embed Text v1.5 — 137M, Nomic AI · $0.10/1M API
Qwen2 0.5B — 500M, Qwen · $0.10/1M API
SmolLM 1.7B Instruct quantized.w4a16 — 1.8B, nm-testing · $0.11/1M API
Qwen2.5 1.5B quantized.w8a8 — 1.8B, RedHatAI · $0.11/1M API
Qwen2.5 Math 1.5B — 1.5B, Qwen · $0.11/1M API
gpt neo 2.7B — 2.7B, EleutherAI · $0.12/1M API
Llama 3.2 3B — 3.2B, meta-llama · $0.13/1M API
Qwen2 0.5B Instruct — 500M, Qwen · $0.10/1M API
Phi tiny MoE instruct — 3.8B, microsoft · $0.13/1M API
Qwen2.5 Coder 1.5B Instruct — 1.5B, Qwen · $0.11/1M API
Qwen2.5 Coder 3B — 3.1B, Qwen · $0.12/1M API
Qwen3 1.7B Base — 1.7B, Qwen · $0.11/1M API
DeepSeek R1 Distill Qwen 1.5B — 1.8B, deepseek-ai · $0.11/1M API
Phi 3 mini 4k instruct — 3.8B, microsoft · $0.13/1M API
Qwen3 0.6B Base — 600M, Qwen · $0.10/1M API
SmolLM3 3B — 3.1B, HuggingFaceTB · $0.12/1M API
Qwen2.5 3B — 3.1B, Qwen · $0.12/1M API
bloom 560m — 600M, bigscience · $0.10/1M API
Qwen3Guard Gen 0.6B — 800M, Qwen · $0.11/1M API
phi 2 — 2.8B, microsoft · $0.12/1M API
gpt2 medium — 400M, openai-community · $0.10/1M API
Llama 3.2 1B Instruct — 1.2B, unsloth · $0.11/1M API
Zamba2 1.2B instruct — 1.2B, Zyphra · $0.11/1M API
OLMo 2 0425 1B — 1.5B, allenai · $0.11/1M API
SmolLM2 360M Instruct — 400M, HuggingFaceTB · $0.10/1M API
gemma 2 2b it — 2.6B, google · $0.12/1M API

Compare models head-to-head →
Open the free advisor → · Prices as of 2026-06-17. We're an honest advisor — $0 markup, your own accounts, we never resell compute. © 2026 Cynosure LLC.