DALL-E 2
What is DALL-E 2?
DALL-E 2 is OpenAI's second AI image model, producing sharper, higher-resolution images than its predecessor and adding the ability to edit, extend, and create variations of existing images.
At a glance
- Type of model: Text-to-image diffusion model with inpainting and outpainting capabilities
- Developed by: OpenAI
- Key capability: Generating 1024x1024 images from text prompts with improved quality, plus inpainting, outpainting, and image variation generation
- How it fits in AI workflow: Used for text-to-image generation, image editing, content extension, and variation exploration in creative and production workflows; succeeded by DALL-E 3 for most current professional applications
How it compares
Compared with related concepts
DALL-E 2 vs Stable Diffusion 1.x: Both were released in 2022 and represent roughly contemporary capabilities in text-to-image generation. DALL-E 2 is proprietary, is available only through OpenAI's hosted service and API, and includes built-in safety filters with no local deployment option. Stable Diffusion has openly released weights, can be run locally, and supports extensive community customization through fine-tuning and extensions, but requires more technical setup. DALL-E 2 prioritizes safety and accessibility; Stable Diffusion prioritizes openness and flexibility.
Pro tip
DALL-E 2's inpainting and outpainting capabilities remain useful for specific editing tasks even as newer generation models surpass it in raw image quality. When you need to extend an existing image or replace a specific region with AI-generated content that matches the surrounding style, these editing modes can be more controllable than attempting the same task through prompt engineering alone in a generation-only workflow.
Types and variations
- Text-to-image generation produces new images from written prompts.
- Inpainting takes a masked region of an existing image and generates new content to fill it, guided by a text description.
- Outpainting extends the image beyond its original edges, generating coherent new content that matches the surrounding style and context.
- Image variations generate alternative versions of an uploaded image in the style of the original without a text prompt.
- Each mode uses the same underlying model but with different conditioning inputs and generation objectives.
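One way to picture the "different conditioning inputs" is as a small table of what each mode needs before it can run. The sketch below is illustrative only: the mode names and the `missing_inputs` helper are hypothetical and not part of any OpenAI library. It also simplifies outpainting by treating it as an edit on a padded image whose mask covers the new area, which is an assumption about how the mode is typically expressed rather than a description of the model's internals.

```python
# Hypothetical mapping of each DALL-E 2 mode to the inputs it is conditioned on.
REQUIRED_INPUTS = {
    "generation": {"prompt"},
    "inpainting": {"image", "mask", "prompt"},
    # Assumption: outpainting is an edit on a padded image; the mask covers the new area.
    "outpainting": {"image", "mask", "prompt"},
    "variation": {"image"},
}


def missing_inputs(mode, provided):
    """Return the sorted list of inputs still needed before `mode` can run."""
    if mode not in REQUIRED_INPUTS:
        raise ValueError(f"unknown mode: {mode!r}")
    return sorted(REQUIRED_INPUTS[mode] - set(provided))
```

For example, `missing_inputs("inpainting", ["image"])` reports that a mask and a prompt are still required, while a variation needs only the source image.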
Common use cases
- Generating concept images for design projects, marketing campaigns, and content creation workflows.
- Using inpainting to remove unwanted elements from photographs or replace them with AI-generated alternatives.
- Extending illustrations or photographs beyond their original borders using outpainting to create wider compositions.
- Generating style-consistent variations of existing imagery for A/B testing or creative exploration.
- Integrating with development workflows via OpenAI's API to embed image generation capability in custom applications.
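As a rough illustration of the API integration mentioned above, the following sketch wraps a single text-to-image request. It assumes a `client` object that behaves like the official `openai` Python SDK client, whose `images.generate` method returns a response with a `data` list of image entries. The helper name `generate_image_url` is hypothetical, and details such as authentication, retries, and error handling are left out.

```python
# Output sizes DALL-E 2 accepts for generation requests.
DALLE2_SIZES = ("256x256", "512x512", "1024x1024")


def generate_image_url(client, prompt, size="1024x1024"):
    """Request one DALL-E 2 image and return the URL of the result.

    `client` is assumed to behave like the official SDK's `OpenAI()` client.
    """
    if size not in DALLE2_SIZES:
        raise ValueError(f"unsupported size for DALL-E 2: {size}")
    response = client.images.generate(
        model="dall-e-2",
        prompt=prompt,
        n=1,
        size=size,
    )
    # Assumes the default URL response format rather than base64 data.
    return response.data[0].url
```

Passing the client in as an argument keeps the helper easy to exercise with a stand-in object during testing, so application code can be checked without making a billable request.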
FAQs
DALL-E 2 is OpenAI's second-generation text-to-image model, released in April 2022. It produces higher-resolution images than the original DALL-E using a diffusion-based architecture and adds inpainting, outpainting, and image variation capabilities.
DALL-E 2 switched from an autoregressive transformer architecture to a diffusion model, producing sharper images at higher resolution. It also added image editing capabilities, including inpainting and outpainting, that the original did not offer.
Inpainting allows users to select a region within an existing image, then describe what should replace that region in text. The model generates new content to fill the selected area while matching the surrounding style and context of the image.
Outpainting extends an existing image beyond its original canvas boundaries, generating new content that continues the style, lighting, and visual context of the original image into the expanded area.
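The geometry of outpainting can be sketched without any image library. The example below centers an original image on a larger square canvas and builds a mask of alpha values marking which pixels are kept and which are left for the model to fill. The function name `outpaint_layout` is hypothetical, and the convention that fully transparent pixels (alpha 0) mark the area to generate is an assumption based on how mask-driven image editing tools commonly work.

```python
def outpaint_layout(width, height, canvas_size):
    """Center a width x height image on a square canvas.

    Returns the (left, top) paste offset for the original and a mask of
    alpha values: 255 = keep the original pixel, 0 = area for the model to fill.
    """
    if width > canvas_size or height > canvas_size:
        raise ValueError("original image must fit inside the canvas")
    left = (canvas_size - width) // 2
    top = (canvas_size - height) // 2
    mask = [
        [
            255 if left <= x < left + width and top <= y < top + height else 0
            for x in range(canvas_size)
        ]
        for y in range(canvas_size)
    ]
    return (left, top), mask
```

In a real workflow the offset would be used to paste the original onto a transparent canvas, and the mask would be saved as the alpha channel of an image sent along with the prompt describing the extended scene.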
DALL-E 2 generates images at up to 1024x1024 pixels, a significant improvement over the original DALL-E, which produced 256x256 outputs.
DALL-E 2 has been largely superseded by DALL-E 3 for most generation tasks, as DALL-E 3 offers significantly better prompt adherence and image quality. However, DALL-E 2's inpainting and outpainting capabilities can still be useful for specific editing workflows where the model remains available.
DALL-E 2 relies on CLIP, a model trained to align text and images in a shared embedding space, to connect the meaning of a prompt to the image it generates. It handles a wide range of prompt types but shows less precise prompt adherence than DALL-E 3, particularly for complex compositional instructions.
DALL-E 2 includes content filters that prevent generation of harmful, explicit, or infringing content. It restricts generation of real people's faces in certain contexts and applies filters designed to prevent misuse, with these safeguards enforced at the API level.