Voice emotion control

Voice emotion control on Morphic lets you shape how generated speech sounds, from emotional tone to reactions, pacing, and delivery style. Write your prompt with the right cues, and the voice performs the way you direct it.

How to use Voice emotion control

To generate speech with emotion control on Morphic:

  1. Open Morphic and go to your project.

  2. Create a new file or open an existing one.

  3. In the prompt bar at the bottom, switch the mode to 'Audio' and select 'Speech'.

  4. Choose your audio model: 'ElevenLabs' or 'MiniMax'.

  5. Select a voice and language from the voice picker.

  6. Write your prompt using the emotion control format for your selected model (see below).

  7. Click 'Generate'.

Morphic supports two speech models. Each uses a different syntax for emotion control. Select your model, then follow the guide below.

ElevenLabs

ElevenLabs uses bracket tags written directly in your prompt. Wrap any emotion, reaction, or direction in square brackets, and the model interprets it as a performance cue, not spoken text.

How it works

[tag] Your dialogue text here.

Tags affect everything after them until a new tag appears. You can place tags anywhere in your text and combine multiple tags in sequence.
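If you assemble prompts in a script before pasting them into Morphic, the scoping rule above can be sketched in Python. This is only an illustration of the rule as described here, not how ElevenLabs or Morphic parse prompts internally, and the name `split_segments` is hypothetical.

```python
import re

# Matches one bracket tag such as [excited] or [pirate voice].
TAG = re.compile(r"\[([^\[\]]+)\]")


def split_segments(prompt: str) -> list[tuple[list[str], str]]:
    """Split a prompt into (tags, text) pairs.

    Consecutive tags are grouped, and each group is attached to the
    text that follows it, up to the next tag.
    """
    segments: list[tuple[list[str], str]] = []
    tags: list[str] = []
    pos = 0
    for match in TAG.finditer(prompt):
        text = prompt[pos:match.start()].strip()
        if text:
            # Text was found, so the current tag group ends here.
            segments.append((tags, text))
            tags = []
        tags.append(match.group(1))
        pos = match.end()
    tail = prompt[pos:].strip()
    if tail or tags:
        segments.append((tags, tail))
    return segments


segments = split_segments("[whispers][tense] We need to leave. [shouts] Now!")
# segments == [(["whispers", "tense"], "We need to leave."), (["shouts"], "Now!")]
```

A check like this can help you spot a line that still carries a tag you meant to end earlier.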

| Without tags | With tags |
| --- | --- |
| I got the part. I actually got the part. | [excited] I got the part. I actually got the part. |
| We need to leave. Now. | [whispers][tense] We need to leave. Now. |
| I don't think this is going to work out. | [sad][hesitant] I don't think this is going to work out. |
| The treasure is buried beneath the old chapel. | [pirate voice] The treasure is buried beneath the old chapel. |

ElevenLabs tags are open-ended: there is no fixed list. Write any emotion or direction inside brackets, and the model will try to interpret it. Tags like [jealous], [romantic], [awkward], [suspicious tone], or [continues after a beat] all work.

The tags below are commonly used and reliably effective, but you are not limited to them.

Tags

Emotions

| Tag | What it does |
| --- | --- |
| [excited] | High energy, enthusiastic delivery |
| [happy] | Warm, upbeat tone |
| [cheerfully] | Light, bright delivery |
| [sad] | Downcast, subdued tone |
| [sorrowful] | Deep sadness, grief |
| [angry] | Sharp, forceful delivery |
| [nervous] | Uncertain, slightly shaky |
| [frustrated] | Tense, impatient tone |
| [calm] | Steady, relaxed delivery |
| [tired] | Low energy, worn out |
| [curious] | Inquisitive, wondering tone |
| [sarcastic] | Dry, ironic delivery |
| [playful] | Light, teasing energy |
| [deadpan] | Flat, emotionless delivery |

Emotional nuance

Use these tags for subtler shifts in tone. They add depth to a line without overriding the entire delivery.

| Tag | What it does |
| --- | --- |
| [hesitant] | Unsure, holding back |
| [relieved] | Weight lifted, tension released |
| [tense] | On edge, bracing for something |
| [warm] | Gentle, caring tone |
| [resigned tone] | Giving in, accepting defeat |
| [stammers] | Tripping over words, flustered |
| [regretful] | Wishing something were different |
| [sympathetic] | Compassionate, understanding |
| [reassuring] | Comforting, steady |
| [awe] | Struck by wonder or amazement |

Reactions

Non-verbal sounds that add realism between or within lines.

| Tag | What it does |
| --- | --- |
| [laughs] | Full laugh |
| [giggles] | Soft, light laugh |
| [light chuckle] | Brief, subdued laugh |
| [sigh] | Exhale of fatigue, relief, or frustration |
| [gasps] | Sharp intake of breath, surprise or shock |
| [gulps] | Nervous swallow |
| [crying] | Tearful, breaking voice |
| [clears throat] | Quick vocal reset |

Delivery

Control how the voice physically performs the line, independent of emotion.

| Tag | What it does |
| --- | --- |
| [whispers] | Soft, breathy, close delivery |
| [shouts] | Loud, projected voice |
| [quietly] | Low volume, restrained |
| [loudly] | Raised volume, forceful |
| [rushed] | Fast-paced, urgent rhythm |
| [drawn out] | Slow, stretched delivery |
| [dramatic tone] | Theatrical, heightened intensity |

Accents and characters

Switch the accent without changing the voice, or give the voice a character persona.

| Tag | What it does |
| --- | --- |
| [American accent] | Standard American English |
| [British accent] | Standard British English |
| [French accent] | French-inflected English |
| [Southern US accent] | Southern American drawl |
| [Australian accent] | Australian English |
| [strong Russian accent] | Heavy Russian inflection |
| [strong X accent] | Replace X with any nationality |
| [pirate voice] | Gruff, seafaring character |
| [old man voice] | Aged, weathered delivery |
| [robot voice] | Mechanical, synthetic tone |
| [fantasy narrator] | Epic, storybook narration |
| [film noir narrator] | Dark, moody, cynical narration |
| [sarcastically] | Dry, ironic character read |

Multi-character dialogue

When you write a scene with two or more characters in one prompt, use these tags to shape how the lines interact.

| Tag | What it does |
| --- | --- |
| [interrupting] | Cuts in before the other line finishes |
| [overlapping] | Starts speaking while another voice trails |

Pauses and pacing

ElevenLabs does not support explicit pause durations. Pause length is inferred from context, tags, and punctuation.

| Write this | What it does |
| --- | --- |
| [pause] | Dramatic silence (model decides length) |
| ... | Hesitant, trailing pause |
| ALL CAPS | Emphasis on the word |
| New paragraph | Clear pause and intonation reset |
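If you prepare scripts in bulk, the text-only cues above (ALL CAPS and new paragraphs) are easy to apply with a small helper. The sketch below is a convenience for editing prompt text, nothing more; `emphasize` and `join_beats` are hypothetical names, not part of Morphic or ElevenLabs.

```python
import re


def emphasize(text: str, word: str) -> str:
    """Upper-case every whole-word occurrence of `word` (case-insensitive)."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return pattern.sub(lambda m: m.group(0).upper(), text)


def join_beats(*paragraphs: str) -> str:
    """Join passages with blank lines so each one starts a new paragraph."""
    return "\n\n".join(p.strip() for p in paragraphs if p.strip())


prompt = join_beats(
    emphasize("I never said that.", "never"),
    "[pause] But I thought it...",
)
# prompt == "I NEVER said that.\n\n[pause] But I thought it..."
```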

Tips for better results

| Tip | Why it works |
| --- | --- |
| Match tags to the text | "[crying] Don't leave me." sounds natural; adding [crying] to a casual sentence does not. The model reads the full line for context. |
| Combine tags | [whispers][tense] or [hesitant][nervous] gives the model two cues to blend for more nuanced output. |
| Pick the right voice | A calm voice will not shout convincingly. A high-energy voice will not whisper well. Match the voice to the role. |
| Use Creative or Natural stability | These settings give the model more room to express tags. Robust is more consistent but less expressive. |
| Use punctuation as rhythm cues | Commas slow the pace. Periods create hard stops. Ellipses trail off. The model reads and responds to punctuation. |
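When you combine tags across many lines, a tiny helper keeps the bracket syntax consistent. The sketch below only builds prompt text to paste into the prompt bar; it assumes nothing about a Morphic API, and `tagged` is a hypothetical name.

```python
def tagged(text: str, *tags: str) -> str:
    """Prefix `text` with bracket tags, e.g. tagged("Hi", "warm") gives "[warm] Hi"."""
    # Accept tags written with or without brackets, and skip empty ones.
    cleaned = [t.strip().strip("[]").strip() for t in tags]
    prefix = "".join(f"[{t}]" for t in cleaned if t)
    return f"{prefix} {text}" if prefix else text


lines = [
    tagged("We need to leave. Now.", "whispers", "tense"),
    tagged("I don't think this is going to work out.", "sad", "hesitant"),
]
prompt = "\n\n".join(lines)
# lines[0] == "[whispers][tense] We need to leave. Now."
```

Joining the lines with blank lines also gives each one the paragraph-break pause described under Pauses and pacing.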

MiniMax

MiniMax uses parenthetical sound tags in your prompt and a separate emotion selector in Morphic's UI.

Emotion

Select the emotion from the dropdown when generating. This sets the overall tone of the entire output.

| Emotion | Effect |
| --- | --- |
| Auto | Model reads the text and picks the best emotion (default) |
| Happy | Upbeat, positive |
| Sad | Downcast, melancholic |
| Angry | Forceful, aggressive |
| Fearful | Anxious, scared |
| Disgusted | Repulsed, averse |
| Surprised | Startled, astonished |
| Calm | Relaxed, serene |
| Fluent | Clean, broadcast-style; ideal for news or technical narration |
| Neutral | No emotional bias |

Sound tags

Add non-verbal sounds directly in your prompt using parentheses. These tags are presets: only the ones listed below are supported.

| Tag | Tag | Tag |
| --- | --- | --- |
| (laughs) | (chuckle) | (coughs) |
| (clear-throat) | (groans) | (breath) |
| (pant) | (inhale) | (exhale) |
| (gasps) | (sniffs) | (sighs) |
| (snorts) | (burps) | (lip-smacking) |
| (humming) | (hissing) | (emm) |
| (whistles) | (sneezes) | (crying) |
| (applause) | (yawns) | |

Unlike with ElevenLabs, you cannot invent custom tags here. Writing (nervous) or (jealous) won't work: the model will speak them as text. Use the emotion selector for emotional tone.
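Because unsupported tags are read aloud, it can help to check a MiniMax prompt before generating. The sketch below is a simple lint built from the preset list on this page; `unsupported_tags` is a hypothetical name, the match is assumed to be exact (lower-case, as listed), and any ordinary parenthetical aside in your script will also be flagged.

```python
import re

# The preset sound tags listed on this page, without parentheses.
SUPPORTED_SOUND_TAGS = {
    "laughs", "chuckle", "coughs", "clear-throat", "groans", "breath",
    "pant", "inhale", "exhale", "gasps", "sniffs", "sighs", "snorts",
    "burps", "lip-smacking", "humming", "hissing", "emm", "whistles",
    "sneezes", "crying", "applause", "yawns",
}

# Matches the content of one pair of parentheses.
PAREN = re.compile(r"\(([^()]+)\)")


def unsupported_tags(prompt: str) -> list[str]:
    """Return parenthesized items that are not preset sound tags."""
    return [
        item for item in PAREN.findall(prompt)
        if item.strip() not in SUPPORTED_SOUND_TAGS
    ]


flagged = unsupported_tags("(sighs) I told you. (nervous) Please stop. (laughs)")
# flagged == ["nervous"]
```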

Pauses

Insert timed silences using <#x#>, where x is the pause length in seconds (0.01 to 99.99).
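If you generate scripts programmatically, a small formatter can keep pause markers inside the allowed range. This is a minimal sketch: `pause` is a hypothetical helper, and it assumes a value written with two decimal places (such as 1.50) is accepted in the marker.

```python
def pause(seconds: float) -> str:
    """Return a MiniMax pause marker such as <#1.50#>."""
    # The range comes from this page: 0.01 to 99.99 seconds.
    if not 0.01 <= seconds <= 99.99:
        raise ValueError("pause must be between 0.01 and 99.99 seconds")
    return f"<#{seconds:.2f}#>"


line = f"Wait for it. {pause(1.5)} There it is."
# line == "Wait for it. <#1.50#> There it is."
```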

Tips

  • Use sound tags sparingly — too many can sound unnatural.

  • Set the emotion to Auto for most cases. Override manually when you need a consistent tone across long text.

  • Punctuation matters — commas and periods guide the model's pacing and intonation.
