This is an AI video generation comparison for reference-to-video.
Prompt:
A short video of this subject (woman) in a luxury boutique picking up high-heel pink shoes , closeup of shoes, then closeup of her intrigued excited face then her sitting down on the bench trying these shoes on.' with a start frame being a 4-panel image with the 4 key shots: woman looking at choes, then shoes closeup, then her excited face and the last panel being her sitting down trying on a shoe....
Tested: February 18, 2026
This was done using FLOVA's AI Assistant. The first time you try their tools you're actually earning credits, not just spending them, so before I generated this final video I was actually "in profit" with my storyline and elements generated :) I didn't care to keep regenerating elements when Nano Banana Pro insisted there be 3 shoes instead of two, not once but twice.
Tested: May 3, 2026
Worked well with this simple prompt. Her 'intrigued excited face' is quite comical.
Tested: May 3, 2026
Completely garbled text on her skirt, and the facial details are close, but other models have better identity fidelity using the same reference image.
Tested: May 3, 2026
This model is AI-influencer ready :) If you're too lazy to supply the text, it'll fill in the blanks for you. Make sure to enable 'multishot' for multi-panel references, and if you need her to be silent, say so in the prompt.
Tested: May 3, 2026
Kling 3 is totally capable of this storyboard setup. It may take a bit of time zooming into the initial panel of the 4-panel start frame, but then everything goes smoothly. Love that the text on her skirt print looks decently preserved.
A structured multi-shot fashion retail sequence testing AI’s ability to follow storyboard order, maintain character consistency, and deliver clean product + reaction closeups in a boutique setting. NOTE: the source image contains a 3rd shoe in the last panel, so please do not blame that on the video generator ;)
Things to watch out for:
Does the sequence follow the correct order: browsing → shoe closeup → face reaction → trying on shoes?
Is there a clear closeup shot of pink high-heel shoes with sharp detail?
Are the shoes consistently pink high heels across all relevant shots (no design/color changes)?
Is there a closeup of her face showing a visibly intrigued/excited expression?
Do her facial features remain consistent between shots (no identity drift)?
Is she shown sitting on a bench while trying on the shoes in the final shot?
Check out the results from Flova.ai (Vidu Q3) vs Magnific (Freepik) (Seedance 2.0) vs Magnific (Freepik) (HappyHorse 1.0) vs PixVerse (PixVerse C1) vs Wan (Online Platform) (Wan 2.7) vs Kling AI (Kling 3.0) for similar or identical prompts side-by-side.