Question 1

What is the difference between T2A and TA2A in Seed Audio 1.0?

Accepted Answer

T2A, text prompt to audio, builds everything from your description: the environment, the music, the sound effects, and each character's voice. TA2A, text prompt plus audio to audio, adds up to three reference recordings that you tag to specific characters, so those voices follow the recordings instead of a written description. Everything else about the prompt is the same.

Question 2

Can Seed Audio 1.0 clone a voice?

Accepted Answer

Yes. Beyond T2A and TA2A there is a voice cloning mode: upload one audio clip, and the cloned voice becomes available for straight text-to-speech. ByteDance documents it as a single-clip clone. If the voice needs to appear in a full scene with music, effects, and other speakers, use TA2A instead, which takes up to three reference clips and tags each to a character.

Question 3

How does timestamp control work in Seed Audio 1.0?

Accepted Answer

Put a timestamp in the form [5.5s:8.0s] at the start of a line and the model fits that line's delivery into exactly that window, adjusting pace and pauses to make it land. It is the feature that makes the model practical for dubbing, where audio has to match picture. Lines without timestamps are paced naturally.
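As a rough illustration of the format described above, here is a small Python sketch that writes and reads timestamped script lines. It only manipulates text and calls no Seed Audio API. The helper names `stamp_line` and `parse_line` are hypothetical, and the one-decimal precision is an assumption taken from the `[5.5s:8.0s]` example rather than a documented rule.

```python
import re

# Assumed format, based on the [5.5s:8.0s] example: "[<start>s:<end>s] <line>".
_STAMP = re.compile(r"^\[(\d+(?:\.\d+)?)s:(\d+(?:\.\d+)?)s\]\s*(.*)$")


def stamp_line(start: float, end: float, text: str) -> str:
    """Prefix a script line with a [start:end] window given in seconds."""
    if start < 0 or end <= start:
        raise ValueError("window must satisfy 0 <= start < end")
    # One decimal place is an assumption, not a documented requirement.
    return f"[{start:.1f}s:{end:.1f}s] {text}"


def parse_line(line: str):
    """Return (start, end, text); untimed lines come back as (None, None, line)."""
    match = _STAMP.match(line)
    if match is None:
        return None, None, line
    return float(match.group(1)), float(match.group(2)), match.group(3)


if __name__ == "__main__":
    print(stamp_line(5.5, 8.0, "'Hey, Emma, you free Saturday?'"))
```

A parser like this can be handy when checking a dubbing script for overlapping or reversed windows before submitting it.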

Question 4

What languages does Seed Audio 1.0 support?

Accepted Answer

Twenty: English, Chinese, Japanese, Korean, Mexican Spanish, Castilian Spanish, Indonesian, German, Brazilian Portuguese, French, Thai, Vietnamese, Malay, Filipino, Italian, Russian, Dutch, Polish, Turkish, and Swedish. Write the prompt in the same language as the script for the most consistent result.

Question 5

Can Seed Audio 1.0 generate multiple speakers at once?

Accepted Answer

Yes. Describe each character's voice inline as you write the scene, and the model gives each speaker a distinct voice, emotion, and pacing in a single generation, along with the ambience and effects around them. In TA2A mode you can tag up to three of those characters to reference recordings.

Question 6

How long can a Seed Audio 1.0 generation be?

Accepted Answer

Up to two minutes of audio per pass, from a prompt of up to 3,000 characters. Generation is non-streaming: the model renders the complete mixed track rather than returning audio in real time. Longer productions are built scene by scene.
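When planning a longer production around these limits, a quick pre-flight check can help. The sketch below is a hedged example, not part of any Seed Audio tooling: `check_scenes` and `min_passes` are hypothetical helpers, and counting characters with Python's `len()` is an assumption, since the answer does not say how the 3,000-character limit is measured.

```python
import math

MAX_PROMPT_CHARS = 3000  # per-pass prompt limit stated in the answer
MAX_PASS_SECONDS = 120   # two minutes of audio per pass


def check_scenes(scenes):
    """Return (index, length) for every scene prompt over the character limit.

    Assumption: the limit counts Python string characters (code points).
    """
    return [(i, len(p)) for i, p in enumerate(scenes) if len(p) > MAX_PROMPT_CHARS]


def min_passes(total_seconds: float) -> int:
    """Lower bound on the passes needed for a runtime, at two minutes each."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return math.ceil(total_seconds / MAX_PASS_SECONDS)


if __name__ == "__main__":
    print(check_scenes(["short scene", "x" * 3001]))
    print(min_passes(300))
```

The pass count is only a floor: scenes rarely fill a full two minutes, so a real production will usually need more passes than this figure.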

Question 7

Can Seed Audio 1.0 narrate an audiobook?

Accepted Answer

Yes, audiobooks are one of the strongest fits for the model. A single prompt covers the narrator's voice, the character voices, and the sound design around them, so a scene arrives finished rather than as separate tracks to mix. Keep the same voice reference across chapters and the narrator stays consistent through the book.

Question 8

Is Seed Audio 1.0 different from ordinary text-to-speech?

Accepted Answer

Significantly. Ordinary text-to-speech picks a voice and reads text aloud. Seed Audio 1.0 moves from text-to-speech to reference-to-audio: one prompt describes the environment, the score, the effects, and every character's voice, and the model returns the whole scene mixed together. The difference is one of scope: an entire audio production rather than the voice alone.

SCENE	Include	Example
Setting	Weather, location, context, acoustics	After-school hallway, distant footsteps, reverb
Cast	What each character is doing or wearing	Shouldering a backpack, waving from the door
Effects	Music mood and genre, sound effects	Deep war drums, low brass, a locker 'clack'
Notes on voice	Gender, age, accent, emotion, tone, speed	Teenage male, American accent, bright and cocky
Exact lines	What each character says, in quotes	'Hey, Emma, you free Saturday?'
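The table's five parts (Setting, Cast, Effects, Notes on voice, Exact lines) can be held in a small structure so that no part is forgotten when drafting a prompt. This is a minimal sketch under stated assumptions: `ScenePrompt` and `render` are hypothetical names, and joining the parts one per line is an illustrative layout, not a format the model is documented to require.

```python
from dataclasses import dataclass, field


@dataclass
class ScenePrompt:
    setting: str      # S: weather, location, context, acoustics
    cast: str         # C: what each character is doing or wearing
    effects: str      # E: music mood and genre, sound effects
    voice_notes: str  # N: gender, age, accent, emotion, tone, speed
    lines: list = field(default_factory=list)  # E: exact lines, in quotes

    def render(self) -> str:
        # Assumed layout: one part per line, in SCENE order, blanks skipped.
        parts = [self.setting, self.cast, self.effects, self.voice_notes, *self.lines]
        return "\n".join(p.strip() for p in parts if p.strip())


if __name__ == "__main__":
    prompt = ScenePrompt(
        setting="After-school hallway, distant footsteps, reverb.",
        cast="A student shouldering a backpack, waving from the door.",
        effects="A locker 'clack'.",
        voice_notes="Teenage male, American accent, bright and cocky.",
        lines=["'Hey, Emma, you free Saturday?'"],
    )
    print(prompt.render())
```

The example values are taken from the table's Example column; the rendered text is ordinary prose that can be edited by hand afterwards.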
