Multi-modal
Available now

Grok Imagine

by xAI

xAI's cross-modal model. Text-to-image, image edits, text-to-video, image-to-video, and video-to-video in one consistent visual system.

Text-to-image, Image-to-image, Text-to-video, Image-to-video, Video-to-video

Key features

What makes Grok Imagine stand out from other AI models

Technical specifications

Key specs and capabilities at a glance

1080p: Full HD output

6–10 s: video clip length

5 modes: text-to-image (TTI), image-to-image (ITI), text-to-video (TTV), image-to-video (ITV), video-to-video (VTV)

24 fps: standard frame rate
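
As a rough illustration of what the specs above imply, the sketch below works out how many frames a clip contains at the stated frame rate and duration, and how many pixels each frame holds. The 1920×1080 frame size assumes a 16:9 Full HD frame, which the page does not state explicitly, and the function and variable names are illustrative only.

```python
# Values taken from the spec list; names are illustrative, not part of any API.
FPS = 24                          # stated standard frame rate
MIN_SECONDS, MAX_SECONDS = 6, 10  # stated video length range
WIDTH, HEIGHT = 1920, 1080        # assumption: 1080p means a 16:9 Full HD frame


def frame_count(seconds: int, fps: int = FPS) -> int:
    """Number of frames in a clip of the given length."""
    return seconds * fps


min_frames = frame_count(MIN_SECONDS)
max_frames = frame_count(MAX_SECONDS)
pixels_per_frame = WIDTH * HEIGHT

print(f"Frames per clip: {min_frames} to {max_frames}")
print(f"Pixels per frame: {pixels_per_frame}")
```

Under these assumptions, a clip runs from 144 to 240 frames, each with about 2.07 million pixels.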

Use cases

How creators and businesses use Grok Imagine on Morphic

Multi-format campaigns

Generate matching images and videos from one concept. Design a hero still, then animate it into video with a consistent visual style.

Video transformation

Restyle existing footage with AI. Transform the look and feel of video clips while preserving the original motion, timing, and structure.

Image-to-video pipeline

Design a still, refine it with image editing, then animate it to video. The cross-modal flow keeps creative control precise.

Creative exploration

Experiment across image and video without switching models. Explore ideas in stills and motion from one creative brief.

Style transfer

Apply new styles to existing images and videos. Turn photos into paintings, footage into animation, or restyle for new audiences.

Content repurposing

Transform existing assets across formats. Turn product photos into promo videos, animate illustrations, or restyle existing clips.

Prompt examples

Open any of these to tweak and generate

Image generation

A cyberpunk city at night with holographic advertisements floating between buildings, reflections on wet streets, detailed neon colors, cinematic wide angle

Video from image

Upload a landscape photo and describe: Bring this scene to life with gentle wind through trees, moving clouds, and soft light changes

Video transformation

Upload a video clip and describe: Transform to watercolor painting style, maintain all original motion, soft pastel colors

Simple pricing

Get started for free today, with the option to upgrade or cancel anytime.

Basic

$0/ month
billed as $0 per year

500 monthly credits

1 user only

All models

Workflows

Standard

$0/ month
billed as $0 per year

2800 monthly credits

1 user only

All models

Workflows

Pro

$0/ month
billed as $0 per year

6000 shared monthly credits

1 user

+ up to 4 more at extra cost

All models

Workflows

Pro Max

$0/ month
billed as $0 per year

24000 shared monthly credits

1 user

+ up to 9 more at extra cost

All models

Workflows

Enterprise

For higher limits

Custom pricing and billing terms

Unlimited credits
Custom seat limits
All models
Workflows

Free

For playing around

$0

forever free

Up to 20 credits
1 user only
Limited models
Workflows

FAQs

What is Grok Imagine?
Grok Imagine is xAI's multimodal AI model that generates both images and videos. It supports five modes: text-to-image, image-to-image editing, text-to-video, image-to-video animation, and video-to-video transformation.
Can Grok Imagine create both images and videos?
Yes. Grok Imagine is one of the few models that covers both image and video generation, which makes it versatile for creators who work across formats.
What is video-to-video?
Video-to-video lets you transform existing video clips: change visual styles, edit aesthetics, or reimagine footage while preserving the original motion, structure, and timing.
How long are Grok Imagine videos?
Grok Imagine generates videos between 6 and 10 seconds at 1080p, suitable for social media clips, creative shorts, and promotional content.
What makes Grok Imagine unique?
Its cross-modal versatility. While most models specialize in either image or video, Grok Imagine handles both and adds video-to-video transformation, for a total of five modes from a single model.
How do I use Grok Imagine on Morphic?
Open Copilot, describe what you want, and pick Grok Imagine. You can generate stills, animate an image, or transform an existing video clip, all from one model, which keeps cross-format work in a single conversation.