Gemini 3.1 Flash TTS
by Google DeepMind
Google's most expressive text-to-speech, with audio tags and multi-speaker dialogue.

Key features
Technical specifications
Multilingual
Style, pace, and accent control across many languages
Up to 2
Two distinct voices in one multi-speaker generation
Audio tags
Natural-language notes plus inline bracket cues
SynthID
Imperceptible AI-provenance watermark on output
Use cases
Video narration and voice-over
Add natural narration to AI-generated or live-action video, with tone and pacing set in plain language.
Character dialogue
Voice two-speaker scenes for shorts, games, and explainers, each character with its own voice.
Localized voice-over
Narrate the same script across many languages with native pacing and accent control.
Audiobook and long-form
Keep delivery natural and consistent across long passages of narration.
Explainers and tutorials
Clear, directable narration for product walkthroughs, lessons, and how-tos.
Ad reads and promos
Expressive, on-brand voice reads with the energy and emphasis you direct.
Prompt examples
Warm narration
Say this warmly and slowly, like comforting a child: The storm has passed. You're safe now.
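As a rough illustration of how such prompts might be assembled in a script, the sketch below builds the warm-narration prompt above and a two-speaker dialogue as plain strings. The helper names (`build_prompt`, `build_dialogue`), the `Speaker: line` layout, and the bracket cues `[whispers]` and `[laughs]` are assumptions made for the example, not documented syntax; only the two-voice limit and the general idea of a natural-language note plus inline bracket cues come from the feature list.

```python
from typing import List, Tuple

MAX_SPEAKERS = 2  # the feature list states up to two voices per generation


def build_prompt(style_note: str, text: str) -> str:
    """Prefix the spoken text with a plain-language delivery note."""
    note = style_note.strip().rstrip(":")
    return f"{note}: {text.strip()}"


def build_dialogue(turns: List[Tuple[str, str]]) -> str:
    """Join (speaker, line) turns into one script, one turn per line.

    The "Speaker: line" layout is a hypothetical convention for this sketch.
    """
    speakers = []
    for speaker, _ in turns:
        if speaker not in speakers:
            speakers.append(speaker)
    if len(speakers) > MAX_SPEAKERS:
        raise ValueError(
            f"{len(speakers)} speakers given, at most {MAX_SPEAKERS} allowed"
        )
    return "\n".join(f"{speaker}: {line}" for speaker, line in turns)


warm = build_prompt(
    "Say this warmly and slowly, like comforting a child",
    "The storm has passed. You're safe now.",
)

# Bracket cues below are illustrative placeholders, not a confirmed tag list.
script = build_dialogue([
    ("Ava", "[whispers] Did you hear that?"),
    ("Ben", "[laughs] It's just the wind."),
])
```

Keeping the delivery note separate from the spoken text makes it easy to reuse one script with different directions, for example a brisk ad read and a calm audiobook take of the same passage.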
Simple pricing
Get started for free today, with the option to upgrade or cancel anytime.
Basic
500 monthly credits
1 user only
All models
Workflows
Standard
2800 monthly credits
1 user only
All models
Workflows
Pro
6000 shared monthly credits
1 user
All models
Workflows
Pro Max
24000 shared monthly credits
1 user
All models
Workflows
Enterprise
For higher limits
Custom
pricing and billing terms

Free
For playing around
$0
forever free
Other models
Explore the rest of the Morphic model catalog.
Ideogram 4.0
Ideogram
Ideogram's open-weight image model. Frontier in-image text, layout control, and 2K output.
Reve 2.0
Reve AI
Reve AI's layout-first image model. Place every element by hand, edit the result like a design file, and render crisp text at up to 4K.
Bernini
ByteDance
ByteDance's open-source video model for instruction-based editing, with the rest of the frame locked and subject identity held.
Grok Imagine v1.5
xAI
xAI's image-to-video model with native synchronized audio. Animate any still into a clip with sound, dialogue, and music.