Question 1

What is a token in AI, and why does it matter?

Accepted Answer

A token is the basic unit of text that an AI model processes. Rather than reading raw characters or complete words, models operate on token sequences that a tokenizer produces by splitting input text into standardised units. Token counts matter for two reasons. First, they determine prompt length limits, how much conversation history a model can hold, and API usage costs. Second, a model's ability to attend to content tends to weaken for material far from the current generation point in a very long token sequence, which can reduce generation quality for long or complex prompts.
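To make the idea of splitting text into tokens concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary (VOCAB) and the function name tokenize are invented for illustration; production tokenizers learn vocabularies of tens of thousands of entries from data and use more sophisticated algorithms, so treat this only as a toy model of the process.

```python
# Toy vocabulary, made up for this example. Real tokenizers learn theirs from data.
VOCAB = {"token", "ization", "matter", "s", " "}


def tokenize(text: str) -> list[str]:
    """Split text into tokens using greedy longest-match against VOCAB."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible substring starting at position i first.
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Nothing in the vocabulary matches: fall back to one character.
            tokens.append(text[i])
            i += 1
    return tokens


print(tokenize("tokenization matters"))
# ['token', 'ization', ' ', 'matter', 's']
```

Two words become five tokens here because the toy vocabulary splits each word into pieces and counts the space separately.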

Question 2

How many words is a token, roughly?

Accepted Answer

A useful rule of thumb is that one hundred tokens correspond to approximately seventy-five words in English, meaning one word averages about one and a third tokens. Common short words like "the" or "and" are typically single tokens, while longer or rarer words may split into two or more tokens. Punctuation, spaces, and special characters also consume tokens, so actual word-to-token ratios vary with writing style, vocabulary complexity, and the specific tokenization scheme a model uses.
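The rule of thumb above can be turned into a quick estimator. This is a rough sketch that assumes whitespace-separated English words and the 100-tokens-per-75-words ratio; the function names are hypothetical, and an exact count requires the tokenizer of the specific model being used.

```python
def estimate_tokens(text: str) -> int:
    """Estimate token count from word count (100 tokens per 75 words), rounded up."""
    words = len(text.split())
    return (words * 100 + 74) // 75  # integer ceiling of words * 100 / 75


def estimate_words(tokens: int) -> int:
    """Estimate how many English words fit in a given number of tokens, rounded down."""
    return tokens * 75 // 100


print(estimate_tokens("A slow pan across a foggy harbour at dawn"))  # 9 words -> 12
print(estimate_words(100))  # 75
```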

Question 3

What is a context window, and how does it relate to tokens?

Accepted Answer

A context window is the maximum number of tokens an AI model can process at one time; it acts as the model's working memory. All input tokens (the prompt, including any earlier conversation turns sent with it) and output tokens (the response) count toward this limit. When a conversation or prompt exceeds the context window, earlier content is typically truncated or dropped, meaning the model loses access to information it was given earlier. Context window sizes vary significantly between models, from a few thousand tokens in smaller systems to hundreds of thousands in frontier models.
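One common way applications deal with this limit is to drop the oldest messages until the conversation fits. The sketch below assumes each message's token count is already known and that some tokens are reserved for the response; the function name, parameters, and numbers are illustrative and do not describe any particular product's behaviour.

```python
def trim_history(message_tokens: list[int], context_window: int,
                 reserved_for_output: int) -> list[int]:
    """Return the indices of the messages to keep, dropping the oldest first.

    message_tokens lists token counts from oldest message to newest.
    """
    budget = context_window - reserved_for_output
    kept = []
    total = 0
    # Walk backwards from the newest message and stop once the budget is hit.
    for idx in range(len(message_tokens) - 1, -1, -1):
        if total + message_tokens[idx] > budget:
            break
        total += message_tokens[idx]
        kept.append(idx)
    return sorted(kept)


# Four messages, a 1,000-token window, 200 tokens reserved for the reply.
print(trim_history([500, 300, 200, 400], context_window=1000, reserved_for_output=200))
# [2, 3]
```

In this example the two oldest messages no longer fit, so the model would never see them.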

Question 4

Do visual inputs like images also consume tokens?

Accepted Answer

Yes. In multimodal models that accept image inputs, images are divided into spatial patches and each patch is converted into a visual token. A typical image might generate several hundred visual tokens depending on its resolution and the model's patch size. Higher-resolution images consume more tokens, so using high-resolution reference images in a multimodal prompt can significantly reduce the token budget left for text instructions. Being mindful of image resolution when using visual inputs helps manage context window usage in image-conditioned generation workflows.
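The relationship between resolution, patch size, and token count can be sketched as simple arithmetic. This assumes the simplest scheme, one token per square patch with no resizing, tiling, or pooling; the 16-pixel patch size is an assumed default, and real models often preprocess images differently, so actual counts will vary.

```python
def visual_token_count(width: int, height: int, patch_size: int = 16) -> int:
    """Count patches (one token each) needed to cover a width x height image."""
    cols = -(-width // patch_size)   # ceiling division
    rows = -(-height // patch_size)
    return rows * cols


print(visual_token_count(224, 224))    # 14 x 14 patches -> 196
print(visual_token_count(448, 448))    # 28 x 28 patches -> 784
print(visual_token_count(1024, 1024))  # 64 x 64 patches -> 4096
```

Doubling the resolution in both dimensions quadruples the token count under this scheme, which is why image size has such a large effect on the remaining budget.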

Question 5

Why do AI models sometimes ignore instructions buried deep in a long prompt?

Accepted Answer

Models generate output one token at a time while attending to the full sequence that came before, but this attention is not perfectly uniform. Content near the beginning of a prompt and content immediately before the generation point tend to receive the most consistent attention. Instructions buried deep in a long prompt (many hundreds of tokens from the start) are at greater risk of being under-weighted, particularly if the prompt is approaching the model's context window limit. Placing the most critical creative instructions early in the prompt and keeping prompts concise reduces this effect.

Question 6

What is the difference between input tokens and output tokens?

Accepted Answer

Input tokens are the tokens that make up the prompt submitted to the model: all the text, image patches, or other content provided by the user. Output tokens are the tokens the model generates as its response. In commercial AI APIs, the two are typically priced differently, with output tokens usually costing more. Output is generated one token at a time, and each new token requires its own forward pass through the model, whereas input tokens can be processed together in parallel. For generation tasks with long outputs, such as a full script or a lengthy creative treatment, output token costs can significantly exceed input token costs.
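A short calculation shows how the two prices combine. The prices below are hypothetical placeholders chosen only to illustrate the arithmetic, not the rates of any real provider, and the function name is likewise an assumption.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_million: float,
                 output_price_per_million: float) -> float:
    """Return the cost of one request given per-million-token prices."""
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000


# Hypothetical prices: 1.00 per million input tokens, 4.00 per million output tokens.
cost = request_cost(2_000, 6_000,
                    input_price_per_million=1.00,
                    output_price_per_million=4.00)
print(f"{cost:.3f}")  # 0.026
```

With these assumed prices, the 6,000 output tokens account for 0.024 of the 0.026 total, even though the prompt was a third of that length.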

Question 7

How should I think about tokens when writing video generation prompts?

Accepted Answer

For video and image generation prompts, token awareness means leading with the most important creative and compositional decisions (subject framing, camera movement, visual style, lighting) before adding secondary details. Models attend most consistently to early tokens, so burying the key instruction in the middle or end of a dense paragraph risks inconsistent execution. Aim for concise, precise prompts that front-load creative specifics and avoid redundant phrasing that consumes tokens without adding new information. Shorter, well-structured prompts often outperform longer, more exhaustive ones for this reason.
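One way to apply this advice is to list prompt elements in priority order and keep only as many as fit a token budget. The sketch below is illustrative only: the function names, the comma-joined prompt format, the example prompt, and the budget are all assumptions, and the token estimate uses the rough 100-tokens-per-75-words ratio rather than a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: 100 tokens per 75 whitespace-separated words, rounded up."""
    words = len(text.split())
    return (words * 100 + 74) // 75


def build_prompt(parts: list[str], token_budget: int) -> str:
    """Join parts (most important first), stopping before the budget is exceeded."""
    kept = []
    for part in parts:
        candidate = ", ".join(kept + [part])
        if estimate_tokens(candidate) > token_budget:
            break  # lower-priority details are dropped
        kept.append(part)
    return ", ".join(kept)


parts = [
    "close-up of a lighthouse keeper",         # subject framing
    "slow dolly in",                           # camera movement
    "warm lantern light",                      # lighting
    "seagulls circling in the far background", # secondary detail
]
print(build_prompt(parts, token_budget=16))
# close-up of a lighthouse keeper, slow dolly in, warm lantern light
```

Because the list is ordered by importance, the element that gets cut is the secondary background detail rather than the framing or camera instruction.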

Question 8

Are tokens the same as model parameters?

Accepted Answer

No. Tokens and parameters describe entirely different aspects of an AI model. Tokens are the units of text or visual input that a model processes at inference time; they describe what goes into and comes out of the model during use. Parameters are the learned numerical weights stored within the model's neural network that encode its knowledge and capabilities; they describe what the model knows and how it processes information. A model with more parameters has more learned capacity, while a model with a larger context window can process more information at once. These are independent properties that vary separately across different models.
