Small (4–14B) AI models
All 106 open small models, each able to run on a good laptop or consumer GPU. Every entry lists the $0-markup cost to run it: free on your own machine, on a rented GPU, or through your own API key.
Llama 3.1 8B Instruct — 8B, Meta · $0.03/1M API
Qwen2.5 7B Instruct — 7B, Alibaba · $0.07/1M API
Mistral 7B Instruct v0.3 — 7.2B, Mistral AI · $0.20/1M API last-known
Qwen2.5-Coder 7B Instruct — 7B, Alibaba · $0.16/1M API est.
Qwen3 4B — 4B, Qwen · $0.13/1M API est.
Gemma 2 9B Instruct — 9B, Google · $0.06/1M API last-known
Qwen3-8B — 8.2B, Alibaba · $0.17/1M API est.
Gemma 3 12B — 12B, Google · $0.20/1M API est.
Qwen3 4B Instruct 2507 — 4B, Qwen · $0.13/1M API est.
Qwen2-VL 7B Instruct — 8B, Alibaba · $0.16/1M API est.
Rio 3.0 Open Mini — 4B, prefeitura-rio · $0.13/1M API est.
Hermes 3 — Llama 3.1 8B — 8B, Nous Research · $0.16/1M API est.
Meta Llama 3 8B Instruct — 8B, meta-llama · $0.16/1M API est.
Mistral 7B Instruct v0.2 — 7.2B, mistralai · $0.16/1M API est.
Llama 3.1 8B — 8B, meta-llama · $0.16/1M API est.
Meta Llama 3 8B — 8B, meta-llama · $0.16/1M API est.
NVIDIA Nemotron 3 Nano 4B BF16 — 4B, nvidia · $0.13/1M API est.
Llama 3.1 8B Instruct (Abliterated) — 8B, mlabonne (community) · $0.16/1M API est.
LLaDA 8B Instruct — 8B, GSAI-ML · $0.16/1M API est.
Mistral 7B v0.1 — 7.2B, mistralai · $0.16/1M API est.
Qwen3 4B Base — 4B, Qwen · $0.13/1M API est.
Dolphin 3.0 — Llama 3.1 8B — 8B, Cognitive Computations · $0.16/1M API est.
Llama 2 7b hf — 6.7B, meta-llama · $0.15/1M API est.
Qwen2 7B Instruct — 7.6B, Qwen · $0.16/1M API est.
Qwen3 4B Thinking 2507 — 4B, Qwen · $0.13/1M API est.
Qwen2.5 7B — 7.6B, Qwen · $0.16/1M API est.
NVIDIA Nemotron Nano 9B v2 — 8.9B, nvidia · $0.17/1M API est.
Nemotron Labs Diffusion 8B Base — 8.5B, nvidia · $0.17/1M API est.
deepseek coder 7b instruct v1.5 — 6.9B, deepseek-ai · $0.16/1M API est.
DeepSeek R1 Distill Llama 8B — 8B, deepseek-ai · $0.16/1M API est.
Qwen3 8B Base — 8.2B, Qwen · $0.17/1M API est.
saiga llama3 8b — 8B, IlyaGusev · $0.16/1M API est.
DeepSeek R1 Distill Qwen 7B — 7.6B, deepseek-ai · $0.16/1M API est.
Dream v0 Instruct 7B — 7.6B, Dream-org · $0.16/1M API est.
Qwen2.5 Coder 7B — 7.6B, Qwen · $0.16/1M API est.
VLM2Vec Full — 4.1B, TIGER-Lab · $0.13/1M API est.
Meta Llama 3.1 8B Instruct — 8B, unsloth · $0.16/1M API est.
falcon 7b — 7.2B, tiiuae · $0.16/1M API est.
DeepSeek R1 0528 Qwen3 8B — 8.2B, deepseek-ai · $0.17/1M API est.
gemma 4 E4B it OBLITERATED — 8B, OBLITERATUS · $0.16/1M API est.
Darwin 9B NEG — 9.7B, ansulev · $0.18/1M API est.
Mistral 7B Instruct v0.1 — 7.2B, mistralai · $0.16/1M API est.
Llama 2 7b chat hf — 6.7B, meta-llama · $0.15/1M API est.
Phi 3 vision 128k instruct — 4.1B, microsoft · $0.13/1M API est.
CodeLlama 7b hf — 6.7B, codellama · $0.15/1M API est.
NVIDIA Nemotron Nano 9B v2 Japanese — 8.9B, nvidia · $0.17/1M API est.
Qwen2.5 7B Instruct (Abliterated) — 7B, huihui-ai (community) · $0.16/1M API est.
llama 7b — 6.7B, huggyllama · $0.15/1M API est.
Phi mini MoE instruct — 7.6B, microsoft · $0.16/1M API est.
Olmo 3 7B Instruct SFT — 7.3B, allenai · $0.16/1M API est.
Apertus 70B Instruct 2509 quantized.w4a16 — 11.3B, RedHatAI · $0.19/1M API est.
deepseek coder 6.7b instruct — 6.7B, deepseek-ai · $0.15/1M API est.
wildguard — 7.2B, allenai · $0.16/1M API est.
zephyr 7b beta — 7.2B, HuggingFaceH4 · $0.16/1M API est.
MiMo 7B Base — 7.8B, XiaomiMiMo · $0.16/1M API est.
LLaDA 1.5 — 8B, GSAI-ML · $0.16/1M API est.
Qwen2.5 Math 7B Instruct — 7.6B, Qwen · $0.16/1M API est.
NVIDIA Nemotron Nano 12B v2 — 12.3B, nvidia · $0.20/1M API est.
Meta Llama 3 8B Instruct — 8B, NousResearch · $0.16/1M API est.
Meta Llama 3.1 8B Instruct — 8B, NousResearch · $0.16/1M API est.
EXAONE 3.5 7.8B Instruct — 7.8B, LGAI-EXAONE · $0.16/1M API est.
Tarsier 7b — 7.1B, omni-research · $0.16/1M API est.
Olmo 3 1025 7B — 7.3B, allenai · $0.16/1M API est.
falcon 7b instruct — 7.2B, tiiuae · $0.16/1M API est.
Qwen1.5 7B — 7.7B, Qwen · $0.16/1M API est.
falcon mamba 7b — 7.3B, tiiuae · $0.16/1M API est.
granite 4.0 h tiny — 6.9B, ibm-granite · $0.16/1M API est.
granite 4.0 tiny preview — 6.7B, ibm-granite · $0.15/1M API est.
Qwen3 14B NVFP4 — 8.2B, nvidia · $0.17/1M API est.
Llama 3.1 8B Instruct — 8B, unsloth · $0.16/1M API est.
internlm2 5 7b chat — 7.7B, internlm · $0.16/1M API est.
Apertus 8B Instruct 2509 — 8.1B, swiss-ai · $0.16/1M API est.
hf moshiko — 7.8B, kmhf · $0.16/1M API est.
granite 3.1 8b instruct — 8.2B, ibm-granite · $0.17/1M API est.
Josiefied Qwen3 VL 4B Instruct abliterated beta v1 — 4.4B, Goekdeniz-Guelmez · $0.14/1M API est.
Llama 2 7b hf — 6.7B, NousResearch · $0.15/1M API est.
Llama 2 13b chat hf — 13B, meta-llama · $0.20/1M API est.
OLMoE 1B 7B 0125 Instruct — 6.9B, allenai · $0.16/1M API est.
OLMoE 1B 7B 0924 — 6.9B, allenai · $0.16/1M API est.
LLaDA 8B Base — 8B, GSAI-ML · $0.16/1M API est.
Humanish Roleplay Llama 3.1 8B — 8B, vicgalle · $0.16/1M API est.
Qwen 7B Chat — 7.7B, Qwen · $0.16/1M API est.
Gemma 4 12B OBLITERATED — 12B, OBLITERATUS · $0.20/1M API est.
Gemma 4 E4B it NVFP4 — 6B, bg-digitalservices · $0.15/1M API est.
Falcon3-10B Instruct — 10B, TII · $0.18/1M API est.
MiMo 7B RL — 7.8B, XiaomiMiMo · $0.16/1M API est.
SeeClick — 9.7B, cckevinn · $0.18/1M API est.
LFM2.5 8B A1B — 8.5B, LiquidAI · $0.17/1M API est.
Qwen3Guard Gen 8B — 8.2B, Qwen · $0.17/1M API est.
gemma 2 9b — 9.2B, google · $0.17/1M API est.
HarmBench Llama 2 13b cls — 13B, cais · $0.20/1M API est.
Qwen2.5 7B Instruct — 7.6B, unsloth · $0.16/1M API est.
Llama Guard 3 8B — 8B, meta-llama · $0.16/1M API est.
internlm3 8b instruct — 8.8B, internlm · $0.17/1M API est.
granite 3.0 8b instruct — 8.2B, ibm-granite · $0.17/1M API est.
granite 3.3 8b instruct — 8.2B, ibm-granite · $0.17/1M API est.
internlm2 chat 7b — 7.7B, internlm · $0.16/1M API est.
OLMo 2 1124 7B Instruct — 7.3B, allenai · $0.16/1M API est.
llava onevision qwen2 7b ov — 8B, lmms-lab · $0.16/1M API est.
Qwen3.6 27B MTPLX Optimized Speed — 4.7B, Youssofal · $0.14/1M API est.
gemma 7b — 8.5B, google · $0.17/1M API est.
MN 12B Mag Mell R1 — 12.2B, inflatebot · $0.20/1M API est.
Llama 2 7b chat hf — 6.7B, NousResearch · $0.15/1M API est.
Llama 3.1 Nemotron Safety Guard 8B v3 — 8B, nvidia · $0.16/1M API est.
NeuralDaredevil 8B abliterated — 8B, mlabonne · $0.16/1M API est.
L3 8B Stheno v3.2 — 8B, Sao10K · $0.16/1M API est.
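To make the per-1M-token prices concrete, here is a minimal sketch that turns a listed price into a monthly bill. It assumes each listed figure is a single flat rate applied to all tokens (the page does not say whether input and output tokens are priced differently), and the 50M-token monthly workload is a hypothetical figure chosen only for illustration. The function name `api_cost_usd` is likewise made up for this example.

```python
def api_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost in USD for `tokens` tokens at a flat per-1M-token price."""
    return tokens / 1_000_000 * price_per_million_usd


# Prices copied from the listing above (USD per 1M tokens).
LISTED_PRICES = {
    "Llama 3.1 8B Instruct": 0.03,
    "Qwen2.5 7B Instruct": 0.07,
    "Mistral 7B Instruct v0.3": 0.20,  # marked "last-known" in the listing
}

# Hypothetical workload, not taken from the page.
MONTHLY_TOKENS = 50_000_000

# Round to cents so the figures read like a bill.
monthly_costs = {
    name: round(api_cost_usd(MONTHLY_TOKENS, price), 2)
    for name, price in LISTED_PRICES.items()
}

for name, cost in monthly_costs.items():
    print(f"{name}: ${cost:.2f}/month")
```

Under these assumptions the same workload costs between $1.50 and $10.00 a month across the three models, so the spread between listed prices matters more than the absolute numbers at this scale.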
Prices as of 2026-06-17. We're an honest advisor: $0 markup, your own accounts, and we never resell compute. © 2026 Cynosure LLC.