This is an AI video generation comparison for the image-to-video prompt:
A smoky backroom where four capybaras dressed in 1940s gangster attire sit around a poker table under a brass hanging lamp. Cigars smolder between their teeth, poker chips clatter, and a portrait of a glamorous capybara in a silky dress hangs slightly askew on the dark wooden wall. The camera cuts to a close-up of one capybara with slicked fur and a thick cigar — his eyes narrowing with suspicion as he studies the cards and his opponents through the haze. The camera lingers on the glowing cigar...
Tested: October 13, 2025
Grok Imagine v0.9 handles multi-shot prompts nicely. Occasionally it simply zooms in and out, but most of the time it produces clean jump cuts.
Tested: October 13, 2025
Seedance is predictably great with multi-shot prompts. As a bonus, it doesn't even need extra hints to avoid putting cigars into the mouths of capybaras that don't have them in the start image.
Tested: October 13, 2025
Sora works well with multi-shot prompts. Unfortunately, it enforced a wide format on my tall start image. Sora's small context is also a bummer: I had to cut the prompt's length.
Tested: October 13, 2025
Vidu can do multi-shot! Here it zooms first, then cuts. In another attempt it cut both ways, but I like this output better overall.
Are cigars visible only in two capybaras’ mouths (one on the right, one in the back with slicked fur)?
Do those cigars rest correctly beside their front incisors without floating or clipping?
Is there a slightly crooked portrait of a glamorous female capybara on the wall?
In the close-up, does the slick-furred capybara’s cigar glow clearly while his eyes narrow naturally in suspicion?
Is the glowing cigar tip rendered with realistic ember detail and light falloff?
Do camera cuts between wide and close-up shots remain smooth and consistent in lighting and smoke density?
During the wide shot, do the capybaras exchange cards and chips naturally, with no finger or object distortions?
Does the overall atmosphere (lighting, smoke, focus) convey a moody 1940s noir poker tension?
Camera Motions and Effects:
Starts with a smoky wide shot showing all four capybaras around the poker table.
Cuts to a close-up of the slick-furred capybara’s face and glowing cigar tip.
Lingers on the ember glow and smoke trail, then cuts back to a wide shot of the group.
Check out the results from GROK (Grok Imagine v0.9) vs Freepik (Seedance 1.0 Pro) vs Freepik (Sora 2) vs Vidu AI (Vidu Q2 Cinematic) for similar or identical prompts side-by-side.