Question 1

What is image-to-image AI generation?

Accepted Answer

Image-to-image is a generation workflow in which an existing image serves as the input alongside a text prompt, with the model transforming the source while preserving aspects of its composition or structure. It differs from text-to-image generation, which builds entirely from a written description without a visual starting point.

Question 2

What is denoising strength in image-to-image?

Accepted Answer

Denoising strength controls how much the model transforms the source image. At low values (near 0), the output closely resembles the source with minimal changes. At high values (near 1), the source provides only a rough structural suggestion and the model applies a substantial transformation. The optimal value depends on how much of the original's composition should be preserved versus reimagined.
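
One way to picture this is to treat denoising strength as the fraction of the sampling schedule that actually runs on the source. The sketch below assumes a simple proportional mapping, which mirrors how some open-source pipelines behave but is not guaranteed for every tool; the function name `steps_to_run` and its parameters are hypothetical.

```python
def steps_to_run(num_inference_steps: int, strength: float) -> tuple[int, int]:
    """Return (start_index, steps_executed) for a given denoising strength.

    Assumption: strength scales the number of denoising steps linearly.
    Real tools may round differently or use another mapping.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    # Number of denoising steps applied to the (partially noised) source.
    steps_executed = min(int(num_inference_steps * strength), num_inference_steps)
    # Steps skipped at the start of the schedule; the source stands in for them.
    start_index = max(num_inference_steps - steps_executed, 0)
    return start_index, steps_executed


if __name__ == "__main__":
    for s in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(s, steps_to_run(40, s))
```

Under this assumption, a strength of 0 runs no denoising steps, so the source is returned essentially unchanged, while a strength of 1 runs the full schedule, which behaves much like text-to-image.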

Question 3

How is image-to-image different from text-to-image?

Accepted Answer

Text-to-image generates an image entirely from a written description, starting from random noise with no visual starting point. Image-to-image uses an existing image as a partial initialisation, so the denoising process starts with a visual structure already in place, and the text prompt guides how that structure is transformed rather than describing the full composition from scratch.
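
The difference in starting points can be sketched with plain Python lists standing in for image data. This is an illustration under stated assumptions, not any particular tool's implementation: the function names are hypothetical, `noise_level` is an illustrative parameter (how a tool derives it from denoising strength depends on its noise schedule), and the variance-preserving blend is the form commonly used in diffusion models.

```python
import math
import random


def text_to_image_start(size, rng):
    """Text-to-image: the starting point is pure Gaussian noise."""
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def image_to_image_start(source, noise_level, rng):
    """Image-to-image: the starting point is the source blended with noise.

    noise_level = 0 returns the source unchanged; noise_level = 1 returns
    pure noise, which matches the text-to-image starting point.
    """
    if not 0.0 <= noise_level <= 1.0:
        raise ValueError("noise_level must be between 0 and 1")
    keep = math.sqrt(1.0 - noise_level)  # weight on the source values
    add = math.sqrt(noise_level)         # weight on the random noise
    return [keep * x + add * rng.gauss(0.0, 1.0) for x in source]


if __name__ == "__main__":
    source = [0.5, -0.2, 0.9]  # toy stand-in for image values
    print(image_to_image_start(source, 0.3, random.Random(0)))
```

In both cases the model then denoises from that starting point under the guidance of the prompt; only the initial state differs.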

Question 4

What is img2img?

Accepted Answer

Img2img is the common abbreviation for image-to-image, widely used within the Stable Diffusion community and in tool interfaces. The terms are used interchangeably and refer to the same generation approach in which an existing image is used as input alongside a text prompt to guide transformation.

Question 5

Can I use image-to-image to change the style of a photograph?

Accepted Answer

Yes. Applying an artistic style to a photograph while preserving its composition is one of the most common uses of image-to-image generation. With a moderate denoising strength and a prompt that describes the target style, the model can transform the photograph's visual treatment while retaining its subjects, framing, and spatial relationships.

Question 6

What is ControlNet and how does it relate to image-to-image?

Accepted Answer

ControlNet is a conditional control system for diffusion models that uses extracted structural information from a source image (such as edge maps, depth maps, or pose skeletons) as precise conditioning rather than direct pixel initialisation. It is a more advanced form of image-based conditioning that allows specific structural qualities to be preserved much more reliably than standard img2img, and is widely used for character pose control, architectural layout matching, and other cases where precise structural adherence is critical.
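
To make "extracted structural information" concrete, the sketch below derives a crude binary edge map from a grayscale image held as nested lists. It is a simplified stand-in for the edge detectors real workflows use, not ControlNet's own preprocessing; the function name `edge_map` and the `threshold` parameter are hypothetical.

```python
def edge_map(gray, threshold):
    """Return a binary edge map (1 = edge) from a 2D grayscale image.

    Uses forward differences to the right and downward neighbours,
    clamped at the image border. A deliberately minimal illustration.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            right = gray[y][min(x + 1, width - 1)]
            down = gray[min(y + 1, height - 1)][x]
            gx = right - gray[y][x]  # horizontal intensity change
            gy = down - gray[y][x]   # vertical intensity change
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 1
    return edges


if __name__ == "__main__":
    # Dark left half, bright right half: one vertical edge is expected.
    image = [[0, 0, 10, 10] for _ in range(3)]
    for row in edge_map(image, threshold=5):
        print(row)
```

The resulting map keeps only where the structure is, not what the pixels look like, which is why this kind of conditioning can hold a layout steady while the appearance changes completely.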

Question 7

What is the difference between image-to-image and inpainting?

Accepted Answer

Image-to-image applies a transformation to the whole image or a substantial portion of it, guided by the visual structure of the source. Inpainting applies generation specifically to a masked region, leaving unmasked areas unchanged. For correcting or replacing specific small areas of an otherwise acceptable image, inpainting is more precise; for applying a wholesale stylistic transformation to the full composition, image-to-image is the more appropriate approach.
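
The "unmasked areas unchanged" behaviour can be sketched as a per-pixel composite. This is a minimal illustration with nested lists, assuming a hard binary mask; real tools often feather the mask edge and apply the mask during sampling, and the function name `composite` is hypothetical.

```python
def composite(source, generated, mask):
    """Keep source pixels where mask is 0, take generated pixels where mask is 1."""
    return [
        [g if m else s for s, g, m in zip(src_row, gen_row, mask_row)]
        for src_row, gen_row, mask_row in zip(source, generated, mask)
    ]


if __name__ == "__main__":
    source = [[1, 1, 1], [1, 1, 1]]
    generated = [[9, 9, 9], [9, 9, 9]]

    # Inpainting: only the masked pixel is replaced.
    inpaint_mask = [[0, 1, 0], [0, 0, 0]]
    print(composite(source, generated, inpaint_mask))

    # Whole-image transformation: equivalent to a mask that covers everything.
    full_mask = [[1, 1, 1], [1, 1, 1]]
    print(composite(source, generated, full_mask))
```

Seen this way, a whole-image transformation behaves like the limiting case in which the mask covers the entire frame.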

Question 8

What inputs does image-to-image require?

Accepted Answer

Standard image-to-image requires three inputs: the source image, a text prompt describing the desired output, and a denoising strength value. Some workflows add optional inputs such as negative prompts to exclude unwanted elements, seed values for reproducibility, and model-specific parameters. More advanced workflows using ControlNet also require specifying which type of structural conditioning to extract from the source image.
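
These inputs can be gathered into a single request object. The sketch below is hypothetical throughout: the class name, field names, and the list of control types are illustrative and do not correspond to any specific tool's API.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative set of structural conditioning types, taken from the
# examples mentioned in this FAQ (edge maps, depth maps, pose skeletons).
CONTROL_TYPES = {"edge", "depth", "pose"}


@dataclass
class Img2ImgRequest:
    # Required inputs
    source_image_path: str
    prompt: str
    denoising_strength: float
    # Optional inputs
    negative_prompt: str = ""
    seed: Optional[int] = None
    control_type: Optional[str] = None

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the request is usable."""
        errors = []
        if not self.source_image_path:
            errors.append("source image is required")
        if not self.prompt:
            errors.append("prompt is required")
        if not 0.0 <= self.denoising_strength <= 1.0:
            errors.append("denoising_strength must be between 0 and 1")
        if self.control_type is not None and self.control_type not in CONTROL_TYPES:
            errors.append("unknown control_type: " + self.control_type)
        return errors


if __name__ == "__main__":
    request = Img2ImgRequest("photo.png", "watercolour painting", 0.5, seed=42)
    print(request.validate())
```

Fixing the seed while varying only the denoising strength is a convenient way to compare how much of the source survives at each setting.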
