
OmniHuman-1.5 Lip-Sync Model

Name: OmniHuman
Version: 1.5
Creator: ByteDance

OmniHuman-1.5, released in September 2025, turns a single image and an audio clip into a full video. An optional text prompt can guide the result.

The output aligns with the audio's emotion, meaning, and movement: voice tone and rhythm drive body motion, gestures, and even interactions between characters.

It works across character types: real people, cartoons, animals, and other stylized figures. It also supports multi-character scenes and longer durations.

The update introduces a dual-system design. The avatar no longer just reacts: a fast-thinking component handles immediate responses, while a slower one plans movement with more intent.

A built-in language model helps direct motion, providing text-based guidance so movement fits both what is said and what is meant. It's no longer just lip-sync.

It blends audio, text, and image inputs more effectively. A "pseudo last frame" technique preserves the character's appearance while allowing more expressive motion.

The new version handles longer, more demanding scenes: over a minute of continuous footage, multiple camera angles, and several people talking at once.

It supports a wider emotional range. Characters can convey states like anger or confession, not just neutral talking.

It understands both what is being said and how, so motion and camera work can shift with the speaker's tone or with extra input from the prompt.

OmniHuman-1.5 Examples

OmniHuman-1.5 wasn't able to detect a face here; OmniHuman-1 detected it but failed to animate it properly. Generated on September 24, 2025.
Wow, this is really good. Generated on September 24, 2025.

Where to Find OmniHuman-1.5

If you'd like to access this model, you can explore the following possibilities:

Other Models by ByteDance