The chunks of text (roughly word-pieces) that a model reads and writes; pricing, speed, and context limits are all measured in tokens, not words.
Language models don't process raw characters or whole words. They break text into tokens, which are commonly occurring character sequences. A rough rule of thumb for English is 1 token ≈ 0.75 words, or about 4 characters per token, though the ratio varies by language and by the specific tokenizer.
Tokens are the unit everything is counted in. API prices are quoted per million tokens (often split into cheaper input and pricier output tokens), generation speed is reported in tokens per second, and a model's context window is a token limit. Both your prompt and the model's reply consume tokens.
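As a rough illustration of how per-token billing and tokens-per-second figures turn into concrete numbers, here is a minimal Python sketch. The function names, the prices ($1.00 and $4.00 per million tokens), and the 50 tokens-per-second speed are hypothetical placeholders chosen for the arithmetic, not any provider's actual rates.

```python
def request_cost_usd(input_tokens, output_tokens,
                     input_price_per_m, output_price_per_m):
    """Cost of one request when prices are quoted per million tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


def generation_seconds(output_tokens, tokens_per_second):
    """Approximate time to generate a reply at a steady tokens-per-second rate."""
    return output_tokens / tokens_per_second


# Hypothetical request: a 2,000-token prompt and a 500-token reply.
# Prices and speed below are made-up values for illustration only.
cost = request_cost_usd(2_000, 500, input_price_per_m=1.00, output_price_per_m=4.00)
seconds = generation_seconds(500, tokens_per_second=50)

print(f"cost: ${cost:.4f}, time: {seconds:.1f} s")
```

With these made-up numbers, the 500 output tokens cost as much as the 2,000 input tokens, which is why long replies can dominate the bill even when the prompt is larger.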
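Because everything is counted in tokens, it helps to convert word counts before estimating cost or checking whether something fits in the context window. At about 0.75 words per token, a 1,000-word English document comes to roughly 1,300 tokens (1,000 ÷ 0.75 ≈ 1,333).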
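The word-to-token conversion can be sketched in a few lines of Python. This is only an estimate based on the 0.75 words-per-token rule of thumb for English; a real tokenizer will give different counts. The helper names, the 8,192-token context window, and the 1,000 tokens reserved for the reply are assumptions made for the example.

```python
import math

WORDS_PER_TOKEN = 0.75  # rough English rule of thumb; varies by tokenizer


def estimate_tokens(word_count, words_per_token=WORDS_PER_TOKEN):
    """Estimate the token count from a word count, rounding up."""
    return math.ceil(word_count / words_per_token)


def fits_in_context(prompt_tokens, reserved_output_tokens, context_window):
    """Check that the prompt plus the space reserved for the reply fits."""
    return prompt_tokens + reserved_output_tokens <= context_window


doc_tokens = estimate_tokens(1_000)  # a 1,000-word document

# Hypothetical model with an 8,192-token context window,
# leaving 1,000 tokens of room for the reply.
ok = fits_in_context(doc_tokens, reserved_output_tokens=1_000, context_window=8_192)

print(doc_tokens, ok)
```

Rounding up and reserving room for the reply keeps the estimate on the safe side, since both the prompt and the model's output count against the same window.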
© 2026 Cynosure LLC.