Gemini Omni
by Google DeepMind
Google's first any-to-any AI model. Text, images, audio, and video in, a single video out.

Key features
Technical specifications
Omni Flash
First model in Google's Gemini Omni family
Video output
Image and audio output are planned on the Gemini Omni roadmap
Up to 10s
Flash clips capped at 10 seconds at launch to widen access
Any mix
Text, image, audio, and video in one prompt
Voice references
Voice samples supported first; full audio inputs coming later
SynthID
Imperceptible AI-provenance watermark on every clip
May 19, 2026
Announced at Google I/O 2026
Google DeepMind
Positioned as the successor to Veo for any-to-any video creation
Use cases
Multi-input storyboarding
A character image, location photo, music cue, and beat go in; the model builds the shot and iterates.
Conversational video editing
Edit any clip in plain language: swap wardrobe, change a background, or retime a beat. The rest stays steady.
Marketing video
Ad cuts that respect brand colors, product shape, and on-screen text. One photo, one brief, one finished spot.
Educational explainers
Visualize science, history, and engineering with built-in physics. The science stays honest, the footage clean.
Spokesperson video
A portrait plus a voice reference gives the same on-camera presenter across shorts, courses, and walkthroughs.
Social shorts
10-second clips fit YouTube Shorts, Reels, and TikTok. Generate variations, then publish the one that lands.
Prompt examples


Product launch
Avant-garde sneaker mid-air over a titanium plinth, hard key light, launch mood
Nature explainer
Droplet frozen as a crystalline crown on a dewy leaf, backlit sunrise macro
Avatar spokesperson
Poised studio host addressing the lens, warm three-point light, 85mm bokeh
Architectural walkthrough
Golden-hour light through a brutalist concrete villa, long shadows, drifting dust
Simple pricing
Get started for free today, with the option to upgrade or cancel anytime.
Basic
500 monthly credits
1 user only
All models
Workflows
Standard
2800 monthly credits
1 user only
All models
Workflows
Pro
6000 shared monthly credits
1 user
All models
Workflows
Pro Max
24000 shared monthly credits
1 user
All models
Workflows
Enterprise
For higher limits
Custom pricing and billing terms

Free
For playing around
$0
forever free
Reve 2.0
Reve AI
Reve AI's layout-first image model. Place every element by hand, edit the result like a design file, and render crisp text at up to 4K.
Bernini
ByteDance
ByteDance's open-source video model for instruction-based editing, with the rest of the frame locked and subject identity held.
Grok Imagine v1.5
xAI
xAI's image-to-video model with native synchronized audio. Animate any still into a clip with sound, dialogue, and music.
Veo 4
Google DeepMind
Google DeepMind's next video model. Native 4K, longer clips, multi-shot character consistency, and a cinematic camera language in a single prompt.