This guide walks through every decision in making a high-quality AI microdrama, in the order you'd make them. By the end you'll know how to write an episode that earns the next swipe, lock characters and locations against drift, generate scenes that hold together, and finish with clean sound and colour.
Most AI microdramas fail in three places: character drift between shots, location drift between scenes, and dialogue that looks stitched together. Each failure traces back to a decision you make before generating a frame, and each step below covers one of those decisions.
What you'll need before you start
A microdrama is a vertical, episodic short-form story, usually 60 to 90 seconds, built to keep a viewer watching through the next cliffhanger. It runs on TikTok, Reels, YouTube Shorts, and dedicated apps like ReelShort and DramaBox.
Before you open a workflow, have three things in hand: a one-sentence premise ("a woman opens a letter from a father she thought was dead"), the recurring characters (lead, antagonist, anyone who appears more than once), and one to three locations. Everything below builds on those three. If you don't have them yet, jot them down before reading on.
How a microdrama gets made on Morphic
1.
Write the script
A 75-second episode is one beat with a hook at the start and a cliffhanger at the end. The shape that works:
- 0 to 10 seconds: hook. Re-establish the situation in one image and one line.
- 10 to 55 seconds: escalation. One thing changes, stakes go up.
- 55 to 75 seconds: cliffhanger. Cut on a reveal, never on a resolution.
Keep scene changes to at most one every twenty seconds. Each new location is another chance for the AI to drift, so fewer cuts means a tighter episode.
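As a rough illustration, the timing shape above can be written down as data and sanity-checked before you draft the script. This is a minimal sketch, not a Morphic feature: the `Beat` class and both helper functions are hypothetical names, and the scene-change rule is read here as "at most one every twenty seconds".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beat:
    name: str
    start: int  # seconds from the start of the episode
    end: int    # seconds from the start of the episode


# The 75-second shape described above.
EPISODE = [
    Beat("hook", 0, 10),
    Beat("escalation", 10, 55),
    Beat("cliffhanger", 55, 75),
]


def beats_are_contiguous(beats):
    """True when each beat starts exactly where the previous one ended."""
    return all(a.end == b.start for a, b in zip(beats, beats[1:]))


def max_scene_changes(runtime_s, min_gap_s=20):
    """Upper bound on scene changes at one per `min_gap_s` seconds."""
    return runtime_s // min_gap_s


print(beats_are_contiguous(EPISODE))        # True
print(max_scene_changes(EPISODE[-1].end))   # 3
```

For a 75-second episode this budget comes to three scene changes at most, which matches the one-to-three locations you prepared earlier.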
Your script is the source for everything that follows:
- Every beat becomes a row in the shot table in Step 2.
- Every recurring character it names becomes a Character Reference Sheet in Step 3.
- Every set a scene moves through becomes a Location Reference Sheet in Step 4.
You don't write the references from scratch; you pull them from the script.
2.
Plan the shots
Don't go from script straight to generation. The bridge is a shot table: every shot listed by number, framing, action, dialogue, and duration. It turns "they argue in the kitchen" into five specific frames the AI can generate.
Plan anchor shots first: the three to five keyframes that carry the scene's emotional weight (the close-up on the reveal, the wide that shows what changed, the reaction shot). Fill in medium and wide shots between them. Shots run 3 to 6 seconds. Here's a finished shot table for a 32-second letter-opening scene:
| # | Time | Framing | Action | Dialogue |
|---|---|---|---|---|
| 1 | 0–4s | Wide, low angle | Mara walks into the kitchen, drops her bag on the counter | Silent |
| 2 | 4–9s | Medium, over-shoulder | She picks up the unopened envelope from the table | What is this? |
| 3 | 9–13s | Close-up, hands | Her fingers tear the seal slowly | Silent |
| 4 | 13–19s | Close-up, face | She reads. Her eyes change. | Silent |
| 5 | 19–22s | Insert, the letter | The signature is visible. One word: Dad. | Silent |
| 6 | 22–27s | Medium, frontal | She lowers the letter. Tears, but no sound. | He's alive. |
| 7 | 27–32s | Wide, pull back | She stands alone in the kitchen, frozen | Silent |
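If you keep the shot table in a script or spreadsheet export, a quick check can catch timing slips before you generate anything. The sketch below encodes the table above and verifies that shots are contiguous and that each runs 3 to 6 seconds. The `Shot` type and `check_shot_table` function are illustrative names, not part of any Morphic workflow.

```python
from typing import NamedTuple


class Shot(NamedTuple):
    number: int
    start: int      # seconds
    end: int        # seconds
    framing: str
    dialogue: str   # empty string for a silent shot


SHOTS = [
    Shot(1, 0, 4, "Wide, low angle", ""),
    Shot(2, 4, 9, "Medium, over-shoulder", "What is this?"),
    Shot(3, 9, 13, "Close-up, hands", ""),
    Shot(4, 13, 19, "Close-up, face", ""),
    Shot(5, 19, 22, "Insert, the letter", ""),
    Shot(6, 22, 27, "Medium, frontal", "He's alive."),
    Shot(7, 27, 32, "Wide, pull back", ""),
]


def check_shot_table(shots, min_s=3, max_s=6):
    """Return a list of problems; an empty list means the table is consistent."""
    problems = []
    # Each shot must start where the previous one ended.
    for prev, cur in zip(shots, shots[1:]):
        if cur.start != prev.end:
            problems.append(f"gap or overlap before shot {cur.number}")
    # Each shot must fall inside the 3 to 6 second range.
    for s in shots:
        duration = s.end - s.start
        if not min_s <= duration <= max_s:
            problems.append(f"shot {s.number} runs {duration}s")
    return problems


print(check_shot_table(SHOTS))   # []
print(SHOTS[-1].end)             # 32
```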
3.
Create a character reference sheet
Lock each recurring character's look before you generate a scene. Pick two or three anchor attributes per character: a signature garment, a hairstyle, one accessory. Over-specifying backfires; a paragraph on bone structure and gait produces a different character every shot because the AI can't hold every detail equally.
The Character Reference Sheet workflow does the locking in one run. For per-scene expression variants, the Expressions workflow keeps the lock intact.
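If you also write prompts by hand, one way to keep the anchor attributes identical is to build the character description once and reuse the exact string. This is a minimal sketch under stated assumptions: the function names are hypothetical, and Mara's three attributes are invented for illustration, since the guide does not specify her look.

```python
def character_clause(name, anchors):
    """Build the fixed description that is pasted into every prompt."""
    if not 2 <= len(anchors) <= 3:
        raise ValueError("use two or three anchor attributes")
    return f"{name}: " + ", ".join(anchors)


def shot_prompt(clause, action):
    """Prefix a shot's action with the unchanged character clause."""
    return f"{clause}. {action}"


# Hypothetical anchors for Mara; the guide does not specify her look.
MARA = character_clause("Mara", ["green wool coat", "short black bob", "silver locket"])

prompts = [
    shot_prompt(MARA, "She picks up the unopened envelope from the table"),
    shot_prompt(MARA, "She lowers the letter"),
]
for p in prompts:
    print(p)
```

Rejecting a fourth attribute is a deliberate choice here: it enforces the limit at the point where over-specifying would creep in.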

Character Reference Sheet
Build a reusable visual reference for every recurring character so the lead looks the same in every shot for the rest of the season.
4.
Create a location reference sheet
Same exercise for every location that appears more than once. The trick is the immutable detail: one object in the set that cannot change between shots (a red brick wall, a brass light fixture, a cracked tile pattern). Repeat it verbatim in every prompt for that location, and the place stays itself across the season.
The Location Reference Sheet workflow builds the sheet for you. Three or four sheets in, you have a production library that costs nothing per future episode.
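Because the immutable detail only works when it is repeated word for word, it can help to audit a location's prompts before generating. The sketch below is an assumption-laden illustration: `prompts_missing_detail` is a hypothetical helper and the kitchen prompts are invented examples.

```python
def prompts_missing_detail(prompts, immutable_detail):
    """Return the indexes of prompts that do not contain the detail verbatim."""
    return [i for i, p in enumerate(prompts) if immutable_detail not in p]


# Invented example prompts for a single kitchen location.
kitchen_prompts = [
    "Kitchen with a brass light fixture, morning light, wide shot",
    "Kitchen, brass lamp overhead, night, close-up on the table",
    "Kitchen with a brass light fixture, dusk, medium shot",
]

print(prompts_missing_detail(kitchen_prompts, "brass light fixture"))  # [1]
```

The second prompt paraphrases the detail as "brass lamp", which is exactly the kind of small rewording that lets a set drift.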
Location Reference Sheet
Build a reusable visual reference for every recurring set so the place stays itself across the whole season.
5.
Storyboard the scenes
Feed your locked shot table plus the character and location sheets into the Cinematic Storyboarding workflow. You get a reference panel for every row, ready to drive the next step. Storyboarding by hand is the bottleneck this removes.
Cinematic Storyboarding
Turn a scene description into a full shot table with reference panels for every row.
6.
Generate the video
Step 5 gave you a panel for every shot. The Image to Video tool sets each one in motion: feed in a panel, prompt the camera move and character action, and get a video clip back.
Two ways to handle audio. The fast path bundles dialogue cues, music mood, and ambient sound into the same prompt and returns video with audio embedded in one pass. The control path generates clean video first and layers dialogue, score, and sound design separately in the next two steps. Hero dialogue beats earn the control path; cutaways and B-roll don't.
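The rule of thumb above can be expressed as a simple routing decision per shot. This is only a sketch of the reasoning, not a Morphic setting: `audio_path` is a hypothetical helper, and which shots count as hero beats remains an editorial call.

```python
def audio_path(has_dialogue, is_hero_beat):
    """Return "control" for hero dialogue beats, "fast" for everything else."""
    return "control" if has_dialogue and is_hero_beat else "fast"


# Illustrative flags per shot: (has_dialogue, is_hero_beat).
shots = {
    "reveal line": (True, True),
    "cutaway to the envelope": (False, False),
    "background chatter": (True, False),
}

plan = {name: audio_path(*flags) for name, flags in shots.items()}
for name, path in plan.items():
    print(f"{name}: {path}")
```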
7.
Add dialogue and lip sync
Dialogue is where most AI microdramas show their seams. Generate dialogue on Seedance 2.0 when you can; it produces lip sync in the same pass as the video. On other models, the Lip Sync tool adds it afterward. When neither is an option, hide the gap:
- Cut tighter. A two-second shot is forgiving where a six-second one is not.
- Angle away from mouths during longer lines.
- Layer ambient sound under every dialogue scene.
The ear forgives a lot when there's texture under the voice.
8.
Finish and export
Three layers separate a rough cut from a finished episode:
- Audio: Speech for voiceover, Music for score, Sound Effects for atmosphere. One sustained pad under dialogue and one motif on the cliffhanger is plenty.
- Colour: every scene shares a palette and a key-light direction. Within one episode this is non-negotiable; across episodes you have more latitude.
- Export: Upscale in Canvas, then export at 9:16, 1080×1920 or higher. Vertical platforms compress aggressively, and an oversampled render survives the squeeze.
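The export target in the last item is easy to verify with integer arithmetic before you upload. The sketch below checks for an exact 9:16 ratio and a 1080×1920 minimum; `export_problems` is a hypothetical helper, not an export setting in any tool.

```python
def export_problems(width, height):
    """Return a list of problems with an export size; empty means it passes."""
    problems = []
    # Cross-multiply to test the 9:16 ratio without floating point.
    if width * 16 != height * 9:
        problems.append("not 9:16")
    if width < 1080 or height < 1920:
        problems.append("below 1080x1920")
    return problems


print(export_problems(1080, 1920))   # []
print(export_problems(2160, 3840))   # []
print(export_problems(720, 1280))    # ['below 1080x1920']
print(export_problems(1920, 1080))   # ['not 9:16', 'below 1080x1920']
```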
What makes a microdrama feel high-quality
| Quality | What it means | Why it matters |
|---|---|---|
| Character consistency | The lead's face, build, and signature attributes are identical in every shot they appear in | Drift between shots is the fastest way to break the illusion that this is a real show. |
| Location consistency | The same set looks like the same set across scenes and episodes, lighting and palette included | Viewers track environments unconsciously. A coffee shop that changes between cuts reads as a different coffee shop. |
| Continuity across cuts | Camera position, character placement, and background carry forward shot to shot | A scene shot well stops feeling like a sequence of AI clips and starts feeling like a directed scene. |
| Audio-visual alignment | Dialogue lands on the beat, ambient sound carries every scene, music swells on the cliffhanger | Sound is the layer that quietly separates rough cut from finished episode. |
Common mistakes to avoid
| Mistake | What goes wrong | The fix |
|---|---|---|
| Over-specifying characters | A four-paragraph brief produces a different character every shot because the AI can't hold every detail equally | Two or three anchor attributes, repeated verbatim in every prompt |
| Skipping the location sheet | The set drifts between scenes because nothing locks it in place | Build a Location Reference Sheet for any set that appears more than once |
| Monologue shots over five seconds | Lip-sync mismatches and proportion drift have more time to surface | Cut tighter, angle away from mouths, or use Seedance 2.0 for native lip sync |
| Lighting direction flips | Key light from camera-left in shot 1 and camera-right in shot 2 reads as a different scene | Plan key-light direction in the shot table and hold it across every shot in the scene |
| No ambient sound under dialogue | Voice-only scenes amplify every small timing or generation artefact | Layer a low ambient bed under every dialogue scene with the Sound Effects tool |
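Two of the mistakes in the table can be caught mechanically if the shot table records a key-light direction per shot. The sketch below is illustrative only: the dictionary keys and the `lint_scene` function are assumptions, and the five-second threshold comes from the table above.

```python
def lint_scene(shots, max_dialogue_s=5):
    """Flag long dialogue shots and key-light flips within one scene.

    shots: list of dicts with 'number', 'duration', 'dialogue', 'key_light'.
    """
    warnings = []
    for s in shots:
        if s["dialogue"] and s["duration"] > max_dialogue_s:
            warnings.append(f"shot {s['number']}: dialogue shot over {max_dialogue_s}s")
    # A single scene should use one key-light direction throughout.
    if len({s["key_light"] for s in shots}) > 1:
        warnings.append("key-light direction changes within the scene")
    return warnings


# Invented example scene with one problem of each kind.
scene = [
    {"number": 1, "duration": 4, "dialogue": "", "key_light": "camera-left"},
    {"number": 2, "duration": 6, "dialogue": "He's alive.", "key_light": "camera-right"},
]

for warning in lint_scene(scene):
    print(warning)
```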
FAQs
How do I keep a character consistent across shots?
Lock the character once with the Character Reference Sheet workflow, then reference the locked sheet in every shot. Use two or three anchor attributes (a signature garment, a hairstyle, one accessory) and repeat them verbatim in every prompt. Over-detailed briefs produce more drift, not less.
How long should a microdrama episode be?
Sixty to ninety seconds is the working range for vertical social platforms. Shorter than sixty seconds rarely earns a cliffhanger. Longer than ninety seconds loses retention before the payoff lands.
Can I generate dialogue with lip sync?
Yes. Seedance 2.0 handles dialogue and lip sync in a single generation pass. For models without native lip sync, retrofit with the Lip Sync tool, or cut tighter and angle away from mouths during longer lines.
How do I plan the shots for a scene?
Build a shot table: a row-by-row list of every shot with framing, action, dialogue, and duration. Start with the three to five anchor shots that carry the scene's emotional weight, then fill in medium and wide shots between them. The Cinematic Storyboarding workflow turns a finished shot table into reference panels automatically.
Which model should I use for a microdrama?
Seedance 2.0 is the default for any episode with dialogue, because it generates lip sync in a single pass. For non-dialogue scenes, any of the cinematic models on Morphic will produce strong results when you pass in locked character and location references.

