Guidance Scale
What is Guidance Scale?
Guidance scale is a setting that controls how closely the AI follows your text prompt: turn it up and the model sticks more rigidly to your description; turn it down and the model takes more creative liberties.
At a glance
- Also known as: CFG scale, classifier-free guidance scale, prompt strength (in some interfaces)
- Used for: controlling prompt adherence in diffusion model generation; balancing literal accuracy with aesthetic quality; tuning model behaviour for different creative goals
- Common tools: Stable Diffusion, Midjourney, AUTOMATIC1111 WebUI, ComfyUI, Runway, any diffusion-based generation platform
- Related terms: diffusion model, prompt engineering, noise / denoising, sampling steps, latent space
How it compares
Guidance scale controls how strongly the prompt influences each step of the denoising process, which affects how closely the output adheres to the content described in the text. Sampling steps control how many denoising iterations the model performs in total, which affects the detail and coherence of the final output. The two parameters interact, since more steps give guidance scale more opportunities to refine the output, but they control fundamentally different aspects of the generation process.
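To make the distinction concrete, here is a toy sketch in Python. It is not a real sampler: the function `toy_denoise`, the `target` vector standing in for prompted content, and the two toy predictions are all hypothetical. It only shows where each parameter acts: sampling steps set how many times the loop runs, while guidance scale sets how strongly the prompt pulls within each iteration.

```python
import numpy as np

def toy_denoise(guidance_scale, num_steps, seed=0):
    """Toy loop, not a real sampler: shows where each parameter acts."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=4)              # start from random noise
    target = np.ones(4)                 # hypothetical stand-in for prompted content
    guided_updates = 0
    for _ in range(num_steps):          # sampling steps: number of iterations
        uncond = -x                     # toy prediction without the prompt
        cond = target - x               # toy prediction with the prompt
        # guidance scale: strength of the prompt's pull within this one step
        guided = uncond + guidance_scale * (cond - uncond)
        x = x + guided / num_steps
        guided_updates += 1
    return x, guided_updates

_, updates_20 = toy_denoise(guidance_scale=7.5, num_steps=20)
_, updates_50 = toy_denoise(guidance_scale=7.5, num_steps=50)
print(updates_20, updates_50)  # guidance is applied once per step
```

Changing `num_steps` changes how many guided updates happen; changing `guidance_scale` changes what each update does.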
Pro tip
When you cannot get a specific element from your prompt to appear in the output (a particular object, background detail, or compositional element), try increasing the guidance scale by two or three units before making other changes. If the output then looks harsh or oversaturated and the element is still missing, you have found the upper limit for that prompt and model combination, and the issue is more likely with prompt phrasing or model capability than with the guidance setting.
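One way to apply this tip systematically is to list the values to try ahead of time. The helper below is a hypothetical sketch: `guidance_sweep` is not part of any tool, and the default `ceiling` of 15 is an assumed cut-off, so adjust it to the model you are using.

```python
def guidance_sweep(current, step=2.0, ceiling=15.0):
    """Return guidance values to try, raising `current` by `step` up to `ceiling`."""
    values = []
    value = current + step
    while value <= ceiling:
        values.append(value)
        value += step
    return values

# Starting from 7, try each value in turn and stop once the output looks harsh.
print(guidance_sweep(7.0))  # [9.0, 11.0, 13.0, 15.0]
```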
Types and variations
- Different diffusion models have different effective guidance scale ranges.
- Models like Stable Diffusion 1.5 typically perform well in the 7–12 range, while newer architectures such as SDXL and Flux may perform better at lower values.
- Some models use classifier-free guidance in modified forms (for example, applying it differently to image tokens versus text tokens), which can change the effective behaviour of the scale parameter even when its numerical range appears similar.
- Some platforms replace the numerical scale with descriptive presets, making guidance scale adjustment more accessible without exposing the underlying technical parameter.
Common use cases
- Creators adjust guidance scale when their generated outputs are failing to include specific elements described in the prompt: raising the scale often makes these elements appear more consistently.
- Conversely, when generated images look harsh, over-saturated, or unnaturally rigid, lowering the scale often restores a more natural aesthetic quality.
- Fine-tuned or LoRA-adapted models may require lower guidance scales than base models because the fine-tuning has already specialised the model's prior toward the desired output domain, reducing the need for strong prompt steering.
FAQs
Guidance scale is a parameter that controls how closely a diffusion model's output adheres to the text prompt. Higher values cause the model to follow the prompt more strictly; lower values give the model more creative freedom to draw on its own learned aesthetic sense, which can produce more visually natural but less literally accurate results.
CFG stands for classifier-free guidance, the technical mechanism underlying guidance scale in diffusion models. It works by amplifying the difference between the model's conditioned output (following the prompt) and its unconditioned output (generating without direction), steering the generation toward the prompted content without requiring a separate classifier model.
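The amplification described here can be sketched in a few lines of Python. This follows the formulation commonly used in Stable Diffusion-style implementations; the function name `apply_cfg` and the three-element arrays are illustrative assumptions, and real predictions are latent-shaped tensors produced by the model.

```python
import numpy as np

def apply_cfg(noise_uncond, noise_cond, guidance_scale):
    """Push the prediction away from the unconditioned output, toward the conditioned one."""
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)

# Toy stand-ins for the two model outputs at a single denoising step.
noise_uncond = np.array([0.0, 1.0, 2.0])   # generated without the prompt
noise_cond = np.array([1.0, 1.0, 0.0])     # generated with the prompt

for scale in (0.0, 1.0, 7.5):
    print(scale, apply_cfg(noise_uncond, noise_cond, scale))
```

Under this formulation, a scale of 0 returns the unconditioned prediction unchanged, a scale of 1 returns the conditioned prediction, and larger values extrapolate past it, exaggerating whatever the prompt changed.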
At very high guidance scale values, outputs tend to become over-saturated, visually harsh, and artificially sharp, with a quality sometimes described as burnt. The model overcommits to each element of the prompt independently without balancing them naturally, often producing images that feel hyperreal or plasticky rather than cohesive.
At very low values, the model largely ignores the prompt and generates images based on its own learned prior, which may be aesthetically pleasing but will not match the described content. Specific subjects, objects, or compositional elements called for in the prompt may be absent or ambiguous in the output.
A value between 7 and 12 is a reasonable starting point for most Stable Diffusion-based models, while newer architectures like Flux often perform better at lower values in the 2–5 range. The optimal value depends on the specific model, prompt complexity, and desired aesthetic, so experimentation within the effective range of the model being used is the most reliable approach.
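As a rough illustration, the starting ranges above could be stored in a small lookup. The dictionary keys, the helper `starting_guidance`, and the choice of the midpoint as a first value are all hypothetical conventions for this sketch, not settings taken from any tool.

```python
# Starting ranges quoted in the text; keys and helper are hypothetical.
STARTING_RANGES = {
    "stable-diffusion": (7.0, 12.0),
    "flux": (2.0, 5.0),
}

def starting_guidance(model_family):
    """Return the midpoint of the quoted range as a first value to experiment from."""
    low, high = STARTING_RANGES[model_family]
    return (low + high) / 2

print(starting_guidance("stable-diffusion"))  # 9.5
print(starting_guidance("flux"))              # 3.5
```

Treat the result only as a place to begin experimenting within the model's effective range.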
Guidance scale applies to video diffusion models in the same way it does to image models, controlling how closely the generated video follows the text prompt at each denoising step. The interaction between guidance scale and temporal coherence in video generation can be more complex than in still image work, and different video models may have narrower effective guidance ranges.
The underlying concept is consistent across diffusion-based models, but the effective numerical range, the default value, and how the parameter is labelled vary between tools and model architectures. What reads as a high guidance scale in one model may behave differently in another, so understanding the specific behaviour of the model being used is more useful than applying a universal rule.
Guidance scale modulates how strongly the model follows the prompt but cannot compensate for a prompt that is unclear, contradictory, or outside the model's capability. If the concept described is not well represented in the model's training data, increasing guidance scale will only force a more committed but still incorrect interpretation. In those cases, improving the prompt itself is usually more effective than adjusting guidance scale alone.