Reference Image
What is a Reference Image?
A Reference Image is an image you provide to an AI model to guide what it generates. It shows the model the style, character, composition, or visual quality you want, rather than only describing it in words.
At a glance
- Also known as: Image reference, Visual reference, Image prompt, Style reference
- Used for:
  - Communicating visual style, colour, and aesthetic qualities that are difficult to describe in text
  - Anchoring character or object appearance for consistency across multiple generations
  - Guiding composition, structure, or spatial arrangement in generated images
  - Encoding a defined visual identity or aesthetic language for a production
- Common tools:
  - IP-Adapter
  - ControlNet
  - Image-to-image generation
  - Midjourney style reference (--sref parameter)
  - Morphic reference image features
How it compares
Compared with related concepts
Reference images and text prompts are complementary rather than competing forms of generation guidance. Text prompts excel at specifying subject content, actions, narrative context, and concepts that can be described in words. Reference images excel at communicating visual qualities that are difficult to articulate: specific colour harmonies, texture qualities, gestural styles, spatial arrangements, and character or object appearances. The most powerful generation workflows combine both: text prompts provide content and context guidance while reference images provide visual quality and consistency anchoring. Neither alone achieves what both together can.
Think of it like…
Providing a reference image to an AI generation model is like handing a brief to a designer alongside a mood board: the text describes what you want in words, but the images show what you mean visually, conveying nuances of tone, style, and aesthetic sensibility that no amount of written description could fully capture.
Pro tip
Invest time in curating clear, high-quality, well-chosen reference images rather than using whatever is readily available. A reference image that clearly shows the specific quality you want to extract (a clean, well-lit character portrait for character consistency, or a single strong image representing the colour palette for style guidance) produces better conditioning than a cluttered or ambiguous reference. The model can only extract what is clearly present in the reference, so the clarity and specificity of the reference directly determine the precision of the conditioning it provides.
Types and variations
- Style reference images guide the overall aesthetic, colour palette, and visual character of the generation without constraining subject or composition.
- Character reference images anchor a specific person or character's appearance for consistency across multiple generations.
- Composition reference images guide the spatial arrangement, framing, and compositional structure of the output.
- Pose reference images (used with ControlNet pose conditioning) provide a specific body position for a character to adopt.
- Colour reference images guide the colour palette and tonal relationships of the generation without constraining style.
- Mood board references provide a collection of images that collectively define an aesthetic direction for a production.
Common use cases
Reference images are used wherever a generation needs more visual specificity than text prompting can reliably achieve, including:
- Commercial production, to maintain brand and product visual consistency across AI-generated imagery.
- Character-driven AI video, to preserve character appearance across shots and scenes.
- Art direction workflows, where a defined visual identity must be communicated to the generation model.
- Style transfer applications, where the aesthetic of a specific artwork or photograph must be replicated in new content.
- Fashion and product visualisation, where specific garment or product appearances must be accurately reproduced.
FAQs
A reference image is a visual input provided to an AI generation model to guide aspects of the generated output: style, character appearance, composition, colour palette, or other visual qualities. It communicates visual information that text prompts cannot fully specify, providing a direct visual anchor for the model to extract and apply to its generation.
IP-Adapter encodes the overall visual features of a reference image (aesthetic qualities, colour relationships, visual style) and uses them to influence the generation without requiring spatial alignment between reference and output. ControlNet extracts specific structural information (pose, edges, depth) from a reference and uses it to constrain the spatial arrangement of the generated output while allowing visual re-styling. IP-Adapter guides aesthetic; ControlNet guides structure.
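The aesthetic-versus-structure distinction can be illustrated with a toy analogy in plain Python. This is not how either tool is implemented: IP-Adapter relies on a learned image encoder and ControlNet on dedicated preprocessors for signals such as pose, edges, or depth. The two helper functions and the tiny 4x4 image below are hypothetical stand-ins, chosen only to show that a global descriptor ignores where things are while a structural map does not.

```python
def global_colour_summary(image):
    """Average RGB over all pixels: a spatially unaligned 'look' descriptor (toy stand-in)."""
    pixels = [px for row in image for px in row]
    n = len(pixels)
    return tuple(sum(px[c] for px in pixels) / n for c in range(3))


def edge_map(image, threshold=100):
    """Mark pixels whose brightness differs sharply from the right or lower neighbour (toy stand-in)."""
    h, w = len(image), len(image[0])
    grey = [[sum(px) / 3 for px in row] for row in image]
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            right = abs(grey[y][x] - grey[y][x + 1]) if x + 1 < w else 0
            down = abs(grey[y][x] - grey[y + 1][x]) if y + 1 < h else 0
            edges[y][x] = 1 if max(right, down) > threshold else 0
    return edges


RED, WHITE = (255, 0, 0), (255, 255, 255)

# Hypothetical reference: a red stripe on the left edge of a white field.
image = [[RED, WHITE, WHITE, WHITE] for _ in range(4)]
# The same pixels mirrored horizontally: identical colours, different layout.
mirrored = [row[::-1] for row in image]

# The global summary cannot tell the two apart; the edge map can.
print(global_colour_summary(image) == global_colour_summary(mirrored))
print(edge_map(image) == edge_map(mirrored))
```

The global summary plays the role of an aesthetic signal that survives rearrangement, while the edge map plays the role of a structural signal tied to pixel positions.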
Any image can in principle serve as a reference, but the quality and clarity of the reference directly affect the quality and precision of the conditioning. Clear images that prominently feature the specific quality you want to extract (the character's face for character consistency, the distinctive colour palette for style guidance, the specific pose for pose conditioning) produce better conditioning results than cluttered, ambiguous, or low-quality references. Choose references that clearly and unambiguously show what you want the model to pick up.
Character reference images provide the model with a specific visual specification of a character's appearance (their face, proportions, hair, and distinctive features) that text description alone cannot precisely anchor. By conditioning each generation on the same character reference through IP-Adapter or platform-specific consistency features, the model produces outputs that reflect the reference character's appearance rather than generating a new variation of the described type for each output.
A style reference image guides the overall aesthetic, colour palette, tone, and visual character of the generation, communicating a desired look and feel rather than specific subject content. It tells the model how to render the scene, not what to render. Style references are particularly effective for establishing a consistent visual identity across a body of generated work and for communicating aesthetic directions that are difficult to fully specify in text.
A mood board is a curated collection of reference images that collectively define the visual direction, aesthetic sensibility, and tonal character for a project or production. In AI generation, mood board images serve as style references that guide the overall visual identity of generated content. Some platforms support multiple reference images simultaneously; others require selecting the single most representative reference. A well-curated mood board distils complex aesthetic vision into concrete visual examples the model can respond to.
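Where a platform accepts only one reference, one possible heuristic for picking "the single most representative" mood board image is to choose the medoid: the image whose features are closest, in total, to all the others. The sketch below assumes each image has already been reduced to a small feature vector; the file names, the three made-up features, and the `most_representative` helper are all hypothetical, and in practice the vectors would more likely come from an image embedding model.

```python
import math


def most_representative(features):
    """Return the name whose vector has the smallest total distance to all others (the medoid)."""
    def total_distance(name):
        return sum(math.dist(features[name], other) for other in features.values())

    return min(features, key=total_distance)


# Hypothetical hand-made features per image: (warmth, contrast, saturation).
mood_board = {
    "dusk_street.jpg": [0.8, 0.6, 0.7],
    "neon_alley.jpg": [0.7, 0.7, 0.8],
    "foggy_pier.jpg": [0.2, 0.3, 0.2],
    "lamp_lit_cafe.jpg": [0.75, 0.65, 0.7],
}

print(most_representative(mood_board))
```

A medoid always returns an actual image from the board rather than an average that matches none of them, which is why it suits the single-reference case. An outlier such as the foggy image above has little chance of being selected.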
The balance between reference image conditioning and text prompt influence depends on the technical approach used and its strength settings. Strong reference conditioning (high IP-Adapter weight, strong ControlNet guidance) can dominate the generation, with text prompt guidance playing a secondary role. Lighter conditioning allows more text prompt influence. In practice, the most effective approach is to set conditioning strength so that both reference and text contribute meaningfully, with the reference anchoring the visual quality or structure while the text prompt guides content and context.
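To make the effect of a conditioning weight concrete, here is a simplified sketch assuming the decoupled cross-attention form described for IP-Adapter, in which an image-attention term is added to the text-attention term and multiplied by a scale. The function names, the 2-dimensional toy features, and the single-query setup are illustrative assumptions, not any library's actual API.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-query scaled dot-product attention over lists of key and value vectors."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]


def conditioned_output(query, text_kv, image_kv, image_scale):
    """Text attention plus an image-attention term multiplied by image_scale."""
    text_out = attend(query, *text_kv)
    image_out = attend(query, *image_kv)
    return [t + image_scale * i for t, i in zip(text_out, image_out)]


# Made-up toy features: (keys, values) for the text prompt and for the reference image.
query = [1.0, 0.0]
text_kv = ([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
image_kv = ([[0.5, 0.5]], [[0.0, 2.0]])

# A scale of 0 leaves the text-only result; larger scales add more of the image term.
for scale in (0.0, 0.5, 1.0):
    print(scale, conditioned_output(query, text_kv, image_kv, scale))
```

In this toy setup, a scale of zero reproduces the text-only result exactly, and raising the scale adds a proportionally larger share of the image-derived term. That mirrors the trade-off described above between reference conditioning and text prompt influence.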
The legal status of using copyrighted images as references in AI generation is an area of active legal development and genuine uncertainty. Providing a reference image to condition generation is technically distinct from reproducing the image, but the outputs may reflect the style or visual character of the reference in ways that could be considered legally relevant, depending on jurisdiction and specific circumstances. When in doubt about commercial use of reference-conditioned generations, consult relevant legal guidance and consider using original, owned, or licence-cleared images as references.