Video generation

Happy Horse 1.1

by Alibaba

Alibaba's video model.
Synchronized audio and native lip-sync, generated in a single pass.

Key features

Technical specifications

1080p

Render at 1080p for delivery, or 720p to draft faster.

3–15s

Each clip runs 3 to 15 seconds, with a 5-second default.

7

Native lip-sync in seven languages, matched to each one's phonetics.

Up to 9

Bring up to nine subjects, each called by index in the prompt.

Use cases

Dialogue-driven scenes

Characters speak in any of seven languages, with lip movement, ambient sound, and timing generated together in one pass.

Multi-character storytelling

Hold up to nine subjects from reference images and carry them across scenes, calling each by index for consistent ensemble work.

Ad and campaign spots

Reference-driven control keeps product, talent, and brand visuals consistent across shots, with audio and motion in sync.

Music videos and performance

Because video and audio are generated together, motion lands on the beat from the first pass, with no manual sync work afterward.

Ultrawide and vertical

Deliver the same scene as a 21:9 cinematic cut and a 9:16 vertical cut, chosen from nine aspect ratios, with no separate workflow per format.

Multilingual localization

Keep the same scene and characters while swapping the dialogue across languages with native lip-sync, suited for global campaigns.

Prompt examples

Dialogue scene

Two friends laughing in a Paris café, French dialogue, handheld

News anchor

A news anchor reads the evening headline, synced studio audio

Performance clip

Cellist on a rooftop at sunset, sweeping orchestral score

Product spot

Sneakers spin on a glossy floor, hip-hop beat, macro lens

Ensemble scene

Three friends toast at a rooftop dinner, clinking glasses, laughter

Ultrawide cinematic

Lone hiker on a ridge at dawn, 21:9, wind and birdsong

Simple pricing

Get started for free today, with the option to upgrade or cancel anytime.

Basic

$0/ month
billed as $0 per year

900 monthly credits

1 user only

All models

Workflows

Standard

$0/ month
billed as $0 per year

3200 monthly credits

1 user only

All models

Workflows

Pro

$0/ month
billed as $0 per year

6200 shared monthly credits

1 user

+ up to 4 more at extra cost

All models

Workflows

Pro Max

$0/ month
billed as $0 per year

24000 shared monthly credits

1 user

+ up to 9 more at extra cost

All models

Workflows

Enterprise

For higher limits

Custom

pricing and billing terms

Unlimited credits
Custom seat limits
All models
Workflows

Free

For playing around

$0

forever free

Up to 20 credits
1 user only
Limited models
Workflows

FAQs

What is Happy Horse 1.1?
Happy Horse 1.1 is Alibaba's video generation model, served on fal and available on Morphic. It generates video and synchronized audio together in a single pass, with native lip-sync across seven languages. It runs text-to-video, image-to-video, and reference-to-video, and outputs 1080p clips of 3 to 15 seconds in nine aspect ratios.
What is Happy Horse 1.1 best for?
Happy Horse 1.1 is well suited to dialogue and performance scenes, since it generates synchronized audio and native lip-sync in a single pass. Reference-to-video for up to nine subjects suits multi-character and ensemble work, and nine aspect ratios cover cinematic 21:9, vertical 9:16, and square delivery.
Does Happy Horse 1.1 generate audio and lip-sync?
Yes. Happy Horse 1.1 generates video and audio together in a single pass, so dialogue, sound effects, ambience, and music stay in sync with the motion, with no separate audio step. It provides native lip-sync across seven languages: English, Mandarin, Cantonese, Japanese, Korean, German, and French, with mouth shapes matched to each spoken language.
How does reference-to-video work in Happy Horse 1.1?
Pass up to nine reference images and refer to each by index in the prompt, as character1 through character9, matching the order in which you supply them. Happy Horse 1.1 carries each subject into the new scene so a cast stays recognizable across shots. Name which subject comes from which image, then describe the scene and the action.
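As an illustration of the indexing convention described above, here is a minimal Python sketch that assembles a reference-to-video prompt. The helper name build_reference_prompt and the "characterN is ..." sentence pattern are assumptions made for this example, not a documented Happy Horse or Morphic API; only the nine-image limit, the character1 through character9 naming, and the 2,500-character prompt limit come from this page.

```python
MAX_REFERENCES = 9        # from the FAQ: up to nine reference images
MAX_PROMPT_CHARS = 2500   # from the FAQ: prompts can run up to 2,500 characters


def build_reference_prompt(subjects, scene):
    """Hypothetical helper: name each subject by index, then describe the scene.

    subjects: short descriptions, in the same order as the reference images.
    scene: the scene and action, written using character1, character2, ...
    """
    if not 1 <= len(subjects) <= MAX_REFERENCES:
        raise ValueError(
            f"expected 1 to {MAX_REFERENCES} subjects, got {len(subjects)}"
        )

    # character1 maps to the first image supplied, character2 to the second, etc.
    # The sentence wording here is an assumption, not a required syntax.
    cast = [f"character{i} is {desc}." for i, desc in enumerate(subjects, start=1)]
    prompt = " ".join(cast + [scene.strip()])

    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(
            f"prompt is {len(prompt)} characters; limit is {MAX_PROMPT_CHARS}"
        )
    return prompt


prompt = build_reference_prompt(
    ["the woman in the red coat", "the man with the guitar"],
    "character1 and character2 toast at a rooftop dinner, laughter and clinking glasses.",
)
print(prompt)
```

Keeping the subject list in the same order as the uploaded images is what makes the indices line up, so it may be easiest to build both from a single ordered list.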
What resolution, duration, and aspect ratios does Happy Horse 1.1 support?
Happy Horse 1.1 outputs 720p or 1080p, in clips of 3 to 15 seconds with a 5-second default. It supports nine aspect ratios: 16:9, 9:16, 1:1, 4:3, 3:4, 21:9, 9:21, 5:4, and 4:5. Prompts can run up to 2,500 characters.
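The limits above can be checked before a prompt is submitted. The following Python sketch is one way to do that; the ClipSettings class, its field names, and the validate function are hypothetical and do not reflect an actual Happy Horse or Morphic request schema. The default resolution and aspect ratio in the sketch are also assumptions, since this page only states a default for duration.

```python
from dataclasses import dataclass

# Values taken from the FAQ answer above.
RESOLUTIONS = {"720p", "1080p"}
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4", "21:9", "9:21", "5:4", "4:5"}
MIN_SECONDS, MAX_SECONDS, DEFAULT_SECONDS = 3, 15, 5
MAX_PROMPT_CHARS = 2500


@dataclass
class ClipSettings:
    """Hypothetical container for one generation request."""
    prompt: str
    resolution: str = "1080p"      # assumed default; not stated on this page
    aspect_ratio: str = "16:9"     # assumed default; not stated on this page
    duration_seconds: float = DEFAULT_SECONDS


def validate(settings):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if settings.resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {settings.resolution}")
    if settings.aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {settings.aspect_ratio}")
    if not MIN_SECONDS <= settings.duration_seconds <= MAX_SECONDS:
        problems.append(
            f"duration must be {MIN_SECONDS} to {MAX_SECONDS} seconds, "
            f"got {settings.duration_seconds}"
        )
    if len(settings.prompt) > MAX_PROMPT_CHARS:
        problems.append(
            f"prompt is {len(settings.prompt)} characters; limit is {MAX_PROMPT_CHARS}"
        )
    return problems


ok = ClipSettings(prompt="Lone hiker on a ridge at dawn, wind and birdsong",
                  aspect_ratio="21:9")
bad = ClipSettings(prompt="x", resolution="4k", duration_seconds=20)
print(validate(ok))
print(validate(bad))
```

Returning a list of problems rather than raising on the first one lets a caller report every invalid setting at once.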
How do I use Happy Horse 1.1 on Morphic?
Open Morphic, switch the prompt bar to Video mode, and pick Happy Horse 1.1 from the model picker. Describe your scene, optionally attach a still for image-to-video or up to nine reference images for reference-to-video, then choose a resolution and aspect ratio and run the prompt. Audio is generated in the same pass.