AI creators tools

Wan (Open-source)

Wan, from Alibaba’s Wan team, is an open-source AI suite for generating videos from text and images. It handles motion, physics, and text rendering, and it leads the VBench benchmark. Free to use under Apache 2.0.

Overview

Wan2.1 is an open-source AI video tool built by Alibaba’s Wan team. It transforms text and images into high-quality videos while handling motion dynamics, physics, and text rendering in both Chinese and English. It’s not just another AI model: it leads the VBench leaderboard, outperforming both open-source and commercial competitors.

The best part is that it’s free and runs on consumer-grade GPUs. The T2V-1.3B model requires only 8.19 GB of VRAM, making it compatible with almost all consumer-grade GPUs, and it can generate a 5-second 480P video on an RTX 4090 in about 4 minutes. Mind you, the model files are gigantic, dozens of gigabytes, so you must have a lot of disk space available.

Wan2.1 is fully open-source under the Apache 2.0 license. No hidden fees, no subscriptions—just grab the code and start creating. 

You can start using Wan2.1 right away. Check out their GitHub for installation and usage instructions. Pre-trained models are also available on Hugging Face and ModelScope for easy integration.
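
Because the model files run to dozens of gigabytes, it can help to check free disk space before pulling weights from Hugging Face. The following is a minimal sketch, not the project's official install path: the repository id `Wan-AI/Wan2.1-T2V-1.3B`, the 30 GB budget, and the helper names are assumptions for illustration, and the download step relies on `snapshot_download` from the `huggingface_hub` package.

```python
import shutil
from pathlib import Path


def nearest_existing_dir(path: str) -> Path:
    """Walk up from `path` until a directory that already exists is found."""
    p = Path(path).resolve()
    while not p.exists():
        p = p.parent
    return p


def has_free_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` GB free."""
    free_bytes = shutil.disk_usage(nearest_existing_dir(path)).free
    return free_bytes >= required_gb * 1e9


def download_model(repo_id: str, local_dir: str, required_gb: float) -> str:
    """Download a model snapshot from Hugging Face after a disk-space check."""
    if not has_free_space(local_dir, required_gb):
        raise RuntimeError(
            f"Need roughly {required_gb} GB free before downloading {repo_id}"
        )
    # Imported lazily so the disk check works even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Example call (assumed repo id and size budget; check the model card first):
# download_model("Wan-AI/Wan2.1-T2V-1.3B", "./Wan2.1-T2V-1.3B", required_gb=30)
```

The size budget is deliberately a parameter: the source only says "dozens of gigabytes", so the right number depends on which checkpoint you pick.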

There's a GGUF version for ComfyUI.

Wan2.2 has been supported natively in ComfyUI since its release day.

Features include cinematic control, complex motion handling, and precise semantics, all built on a Mixture-of-Experts (MoE) architecture.
It is available in FP16 and FP8, with the 5B models running on 8 GB of VRAM via auto-offloading.
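
A rough back-of-the-envelope calculation shows why offloading matters for a 5B model on an 8 GB card. This is only a sketch under stated assumptions: it counts the weights alone at 2 bytes per parameter for FP16 and 1 byte for FP8, uses decimal gigabytes, and ignores activations, the VAE, and the text encoder, so real memory use will be higher. The function names are hypothetical.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}


def weight_gb(params_billion: float, precision: str) -> float:
    """Approximate size of the weights alone, in decimal gigabytes.

    Billions of parameters times bytes per parameter gives gigabytes directly.
    """
    return params_billion * BYTES_PER_PARAM[precision]


def needs_offload(params_billion: float, precision: str, vram_gb: float) -> bool:
    """True when the weights alone already exceed the available VRAM."""
    return weight_gb(params_billion, precision) > vram_gb


for precision in ("fp16", "fp8"):
    size = weight_gb(5, precision)
    print(f"5B @ {precision}: ~{size} GB weights, "
          f"offload on 8 GB card: {needs_offload(5, precision, 8)}")
```

Under these assumptions the FP16 weights (about 10 GB) do not fit in 8 GB on their own, which is consistent with the note that the 5B models rely on auto-offloading.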

Tags

Freeware, Apache License 2.0, PC-based, #Video & Animation

Educators and Trainers, Creative Professionals, Content Creators, Media and Film Makers, Marketing and Branding Specialists, Developers and Tech Creators, Nonprofit and Advocacy Creators, Small Business Owners, Entertainment and Performance Artists, Professional Content Creators

This tool is free to use when installed locally and is offered under Apache License 2.0.

People are testing the Wan 2.1 I2V model with quantized (compressed) GGUF files in ComfyUI, mainly on RTX 3060 cards. Here's what they're seeing:

Performance on RTX 3060

  • 416x416, 25 steps → About 9 minutes for 2 seconds of video
  • 512x512, 25 steps → Around 13.5 minutes for 2 seconds
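
To put the two RTX 3060 figures on a common scale, the numbers above can be turned into minutes of compute per second of video and compared against the change in pixel count. This is a small arithmetic sketch using only the values reported in the list; the variable and function names are illustrative.

```python
# The two reported runs: resolution, wall-clock minutes, and clip length.
runs = [
    {"width": 416, "height": 416, "minutes": 9.0, "video_seconds": 2},
    {"width": 512, "height": 512, "minutes": 13.5, "video_seconds": 2},
]


def minutes_per_video_second(run: dict) -> float:
    """Compute time spent per second of generated video."""
    return run["minutes"] / run["video_seconds"]


def pixels(run: dict) -> int:
    """Pixels per frame at the run's resolution."""
    return run["width"] * run["height"]


small, large = runs
pixel_ratio = pixels(large) / pixels(small)
time_ratio = large["minutes"] / small["minutes"]

for run in runs:
    print(f'{run["width"]}x{run["height"]}: '
          f"{minutes_per_video_second(run)} min per second of video")
print(f"pixel ratio {pixel_ratio:.2f}, time ratio {time_ratio:.2f}")
```

The pixel count grows by roughly 1.51x between the two runs and the time by 1.5x, so in this small sample generation time scales about linearly with pixels per frame. Two data points are not enough to generalize.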

Key Resources & Process

  • GGUF compressed models are up on Hugging Face
  • Basic setup guide available [here]
  • More details on the ComfyUI example page [here]

Hardware Tips

  • Hardware used: 12GB VRAM, 48GB RAM (extra RAM helps a lot)
  • Some users get by with 16-32GB RAM

Choosing Compression Levels

  • Q4_0 was used here; higher-bit quantization levels (bigger files) give better quality
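
The size and quality trade-off between quantization levels can be estimated from how many bits each weight costs. The sketch below assumes the simple GGUF block layouts used by Q4_0 and Q8_0 (blocks of 32 weights sharing one 16-bit scale) and a 14B-parameter model like the I2V variant mentioned on this page. Real files keep some tensors at higher precision, so treat the results as rough lower bounds rather than exact file sizes. The function names are hypothetical.

```python
def bits_per_weight(weight_bits: int, block_size: int = 32, scale_bits: int = 16) -> float:
    """Effective bits per weight for a block format with one shared scale per block."""
    return (block_size * weight_bits + scale_bits) / block_size


def approx_file_gb(params_billion: float, bpw: float) -> float:
    """Approximate quantized weight size in decimal gigabytes (8 bits per byte)."""
    return params_billion * bpw / 8


for name, weight_bits in (("Q4_0", 4), ("Q8_0", 8)):
    bpw = bits_per_weight(weight_bits)
    print(f"{name}: {bpw} bits/weight, ~{approx_file_gb(14, bpw)} GB for 14B parameters")
```

Under these assumptions Q8_0 is nearly twice the size of Q4_0, which is the cost of the better quality that higher levels provide.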

    How It Compares

    More stable than SkyReels. Less "melting" effect than some other tools.

    [ Reddit ]

    Prompt: The sky is alive with energy as two giant flying turtles soar majestically above a blurred landscape of mountains and rivers, carrying serene monks in meditation atop their shells. The turtles' eyes suddenly blaze with an intense blue glow, crackling bolts of electricity erupting into the air, illuminating the turbulent sky with flashes of supernatural power. As the lightning flickers, an intense wind bursts forth, thrashing the monks' flowing robes, creating a dynamic and chaotic atmosphere. The turtles split apart in a dramatic arc, veering left and right as they abandon the scene, leaving the monks momentarily isolated in the vast expanse. The focus shifts smoothly to the empty valley below, where mist thrums with energy, swirling dynamically

    Generated on May 23, 2025:

    Wan 2.1 Plus has done a great job

    Prompt: Double dolly shot: A fierce and determined woman stands in the center of a dusty Western street, her expression focused and intense as she points a shotgun towards an unseen foe. The camera glides backward, while the town around her drifts ominously, buildings and debris swirling in the distance. She remains unnaturally steady, her hair swirling in a slight breeze, amplifying the tension of the moment. This visual dissonance captures her psychological struggle, symbolizing her inner strength. The harsh sunlight casts long shadows, enhancing the gritty texture of her rugged attire and the arid landscape.

    Generated on May 22, 2025:

    Wan 2.1 14B quantized image-to-video, 480p, Q4_K_S GGUF. The online version censors this image and won't generate anything with it. No double dolly visible here anyway.

    Prompt: Surreal scene of a delicately ornate porcelain mug with an embossed gold floral pattern along the rim and upper edges, featuring a highly realistic detailed nature scene wrapping around the central surface of the mug. In the middle of the design, a flamingo stands gracefully in a small reflective water pond, surrounded by lush green reeds, blooming wildflowers, and delicate pink and white blossoms. The painted lake water ripples and overflows, spilling realistically from the lower edge of the cup onto the table. The pool of water on the table grows larger and larger, soon surrounding the cup. Several yellow butterflies flutter around the flamingo—some in mid-air, some perched on flowers. The background within the design has a soft pastel sky with a dreamy, painterly gradient from light blue to warm beige. The mug is placed on a neutral grey surface, and a few tiny flower petals are scattered around its base, reinforcing the ethereal and serene atmosphere. The handle is elegantly curved and highlighted with subtle golden detailing.

    Generated on May 19, 2025:

    Wanx 2.1 Plus, generated on their online platform, with sound added there too

    Prompt: An elderly lady is driving her electric wheelchair forward at 140 km/h on a winding mountain road. Due to the high speed, her scarf quickly flaps in front of the camera lens, which is 10 meters behind the wheelchair and moves with the vehicle. The camera is shot from behind; the elements at the side of the road are blurred due to the high speed. The lady is driving very fast and overtaking other cars. This realistic shot was taken with 35mm film.

    Generated on May 17, 2025:

    Wan2.1 Plus text to video produced a really nice result. Sound effect also by Wan.

    Prompt: A natural beauty blonde woman with a confident and assertive pose is in the center looking at the viewer with fierce focused annoyed gaze, wearing a slightly dirty Western-inspired outfit, including a Western-style hat, a dress with a vest-like undergarment, and a feminine yet strong aesthetic. She is pointing a large 8-Gauge Shotgun (aka 8-bore) straight in front of herself, at the viewer, with the massive barrel dominating the foreground and appearing exaggerated in size due to dramatic perspective, creating an intense visual focus. Her skin is slightly dirty. The style is a realistic photograph with a focus on fashion and a cinematic quality, a still from a Western-themed film. The background consists of a series of buildings, a hint of a mountain range, and a rustic atmosphere.

    Generated on May 17, 2025:

    Image output
    Generated by Wan2.1 Plus. This is the best of 4 outputs. Good effort, but the barrel is obviously not the best. Cinematic vibe: great.

    Prompt: A man lunges forward, throwing a lightning-fast jab at a goat in a red hoodie. The goat ducks effortlessly, its horns grazing the air, then counters with a powerful left hook that sends sweat flying. Their gloves collide with a thunderous impact as the camera whips around them in a tight 360-degree spin. Neon lights flicker overhead while dust and spit explode into slow motion. Every strike lands like a cinematic thunderclap.

    Generated on May 3, 2025:

    Wanx 2.1 through Pollo.ai

    Prompt: A sausage dog wearing stylish wind goggles drives a gleaming chrome motorcycle, its long ears flapping wildly in the breeze, and its mouth open in an excited, playful expression. The dog looks thrilled as it grips the handlebars tightly. In the front basket of the motorcycle, a ginger-and-white cat sits energetically, its fur tousled by the wind, with wide, excited eyes and an open-mouthed expression of joy. The background features a vast countryside road stretching into the distance, lined with golden fields and distant mountains under warm golden sunlight. The entire scene exudes quirky, dynamic energy with a fun and cinematic vibe.

    Generated on February 28, 2025:

    Test in a local ComfyUI install of the 1.3B Wan model

    Prompt: Close-up shot of a woman’s tear-filled eyes as she pleads during a heated argument with her partner, seen from his back. The camera slowly zooms in on the tears streaking her flushed cheeks, the soft glow of kitchen lights barely illuminating the scene behind her.

    Generated on February 28, 2025:

    Test in a local ComfyUI install of the 1.3B Wan T2V model

    Prompt: The scene begins with a close-up of striking red high heels, sharp and polished, walking away on a fractured asphalt road. The camera remains low to the ground, fully focused on the legs as they walk with deliberate confidence. The camera steadily tracks the legs from behind, capturing their motion as they stride through a desolate, post-apocalyptic street.

    Generated on February 28, 2025:

    Wan2.1, 1.3B model

    Prompt: A sleek humanoid robot performs a mesmerizing dance in a stark, minimalistic futuristic room. The seamless white walls glow softly with pulsing lines of light, creating a sharp contrast against the robot’s dark figure. The camera starts with a static wide shot, gradually moving into a slow orbit around the robot, capturing its fluid, precise movements. The room’s dynamic lighting synchronizes with the rhythm of the dance, casting glowing patterns and subtle shadows on the floor and walls. The overall scene combines elegance, sophistication, and a high-tech sci-fi aesthetic.

    Generated on February 28, 2025:

    Image-to-video done in ComfyUI with the Q4 version of Wan2.1-I2V-14B-480P-gguf

    Prompt: A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse.

    Generated on February 26, 2025:

    Fal.ai published video example for WAN t2v

    Prompt: A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage.

    Generated on February 26, 2025:

    Fal.ai published video example for WAN image-2-video

    Latest Wan (Open-source) News

    July 30, 2025

    ComfyUI released a new patch for Wan 2.2. It cuts VAE decoding memory use by about 10%. The 5B I2V model got a big boost too. They also added new setup templates for the 14B models. The latest version is available for the Git, Portable, and Desktop installs.

    feature

    July 29, 2025

    ComfyUI adds native Wan2.2 support.
    Available in FP16 & FP8, with 5B models running on 8GB VRAM via auto-offloading.

    feature

    July 28, 2025

    Wan 2.2 is out now, featuring a faster model, open-source access, and upgraded tools from Tongyi Lab at Alibaba Group. Improved prompt following, motion control, and visual detailing make it well suited to cinematic AI creation.

    model

    July 3, 2025

    WanGP v6.5 is here!

      • Built-in MMAudio (low VRAM)
      • Post-gen upsampling & audio options
      • MagCache, SageAttention2++
      • Video2Video via Text2Video
      • FusioniX upsampler/restorer

    model

    April 24, 2025

    Wan now offers a Creator Partnership Program.

    Useful Links

    Wan2.2 ComfyUI Official Native Workflow Example

    Tutorial

    Official usage guide for Alibaba Cloud Tongyi Wanxiang 2.2 video generation model in ComfyUI

    Wan 2.2 with 8GB VRAM

    Workflow

    The original poster used the Wan 2.2 image-to-video GGUF model at Q6 with the image-to-video lightx2v LoRA, and shared how they tweaked the default ComfyUI workflow.

    Drone-style push‑in motion LoRA for Wan 2.1

    LoRA

    A new LoRA for Wan 2.1 brings realistic drone-style push‑in motion. Trained on 100 clips and refined through 40+ versions, it includes a ComfyUI workflow and is triggered by the phrase “Push‑in camera.” It is meant for adding cinematic movement to your videos.

    Fast 4 steps Wan 2.1 I2V (14B) with CausVid LoRA

    Version

    CausVid is a distilled version of Wan 2.1 that runs faster, in just 4-8 steps. It was extracted as a LoRA by Kijai and is compatible with 🧨 diffusers.

    Wan2.1 GP by DeepBeepMeep

    Other

    Open and advanced large-scale video generative models for the GPU poor

    ComfyUI Workflows + Guide for Wan2.1

    Workflow

    Workflows using native ComfyUI nodes, plus workflows using Kijai's Wan wrapper nodes, which allow for more features.

    Viral Effects LoRAs for Wan2.1 14B 480p I2V

    LoRA

    A wide range of open-source effect LoRAs, such as Assassin, Jungle, Pirate Captain, Baby, Princess, Painting, Warrior, Samurai, Snow White, Bride, Mona Lisa, Zen, VIP, Puppy, Classy, and Disney Princess. All are trained on the Wan2.1 14B I2V 480p model and available on Hugging Face under the Apache 2.0 license.

    This page was last updated on July 30, 2025 at 9:26 AM