OmniGen2
OmniGen2 is a free, open-source AI that edits images, follows written prompts, blends visuals, and keeps improving its own results through a built-in reflection loop.
Overview
OmniGen2 is a tool for instruction-based image editing with character consistency. It handles requests like “make the dress red” or “add a smile”, and it even lets you mix images, e.g. “make the guy from photo 1 kiss the girl in photo 2.”
Built on top of Qwen2.5-VL, it runs two separate decoding pathways inside one model: one for text, one for images. That separation keeps each output on point without one modality degrading the other.
It also reflects on its own outputs. Not deep emotional reflection or anything, just a self-check that catches problems and improves results on the fly. That mechanism is called “multimodal reflection,” and it helps clean up mistakes as the model works.
Who made this? A group called VectorSpaceLab. They released the original OmniGen back in September 2024, and this follow-up is packed with even more tricks. They plan to share the code, the weights, and the test sets soon too.
It’s 100% free with no paid tiers. You get the whole thing: demos, datasets, code, even the training weights. Just run it locally and let your hardware decide the limits.
OmniGen2 natively requires an NVIDIA RTX 3090 or an equivalent GPU with approximately 17GB of VRAM. For devices with less VRAM, you can enable CPU Offload to run the model.
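If you're on a smaller card, the offload switch is a one-liner. Here's a minimal sketch of local text-to-image generation; the import path, class name, and model ID follow the project's README at the time of writing, and the call parameters are assumptions about its diffusers-style API, so verify against the current repo:

```python
import torch
from omnigen2.pipelines.omnigen2.pipeline_omnigen2 import OmniGen2Pipeline

# Load the pipeline (model ID per the project's README; verify before use).
pipe = OmniGen2Pipeline.from_pretrained(
    "OmniGen2/OmniGen2",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# Under ~17GB of VRAM? Keep idle submodules in system RAM and move them
# onto the GPU only when needed. Slower, but it fits on smaller cards.
pipe.enable_model_cpu_offload()

result = pipe(
    prompt="A woman holding a soda can as if advertising it",
    num_inference_steps=50,  # step count; lowering it trades quality for speed
)
result.images[0].save("output.png")
```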
Try it live on Hugging Face Spaces.
Tags
Freeware · Apache License 2.0 · PC-based · #Image & Graphics
This tool is free to use when installed locally and is offered under Apache License 2.0.
OmniGen2 is grabbing attention for being free, open-source, and local-first, but the crowd has mixed feelings. Some call it the closest thing we’ve got to a local ChatGPT; others say it’s undercooked and slow.
Plenty of folks are bumping into install headaches, especially on Windows or AMD. Getting Flash Attention and Triton to work is its own side quest.
Real usage feels hit-or-miss. Users say it “takes inspiration” from source images but doesn’t always merge them cleanly. One person ran it on a 4090 and still hit 4 minutes per image; another called the outputs “burned out” or “grainy.” Still, someone with a 5060 Ti managed 2–3 minute runs and called the quality “good.” Comments showed users experimenting with face merging and pose changes, and though the tool impressed many, some still ran into slow loads or GPU limits. One user summed it up: “damn — it loads really slowly... lower the steps to 20 doesn’t reduce quality that much,” while another asked, “any easy stand-alone installer?”
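That “lower the steps” tip maps to a single sampler parameter. A quick sketch, reusing the pipe object from the earlier snippet (the parameter name assumes a diffusers-style API):

```python
# Fewer denoising steps = faster runs; per the user report above, ~20 steps
# "doesn't reduce quality that much" compared to the usual 50.
result = pipe(
    prompt="A woman holding a soda can as if advertising it",
    num_inference_steps=20,
)
result.images[0].save("output_fast.png")
```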
Biggest wish right now? A cleaner setup and more optimization. Folks are asking for Docker templates, easier configs, and real checkpoint files. Until then it’s DIY vibes all the way.
One thing’s clear: people want this to work. They’re tweaking quantization, hacking ComfyUI nodes, and forcing it to run on all kinds of hardware, just to get a taste of what full local control might feel like.
[ Reddit ]
Example prompt (generated on June 26, 2025): “Let the woman hold the soda can in her hand as if advertising it”
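To reproduce something like this locally, the instruction goes in alongside the source image. A sketch under the same assumptions as the earlier snippets; the input_images keyword is a guess at the editing API:

```python
from PIL import Image

woman = Image.open("woman.png")  # hypothetical source photo

# In-context editing: the prompt refers to the supplied image.
result = pipe(
    prompt="Let the woman hold the soda can in her hand as if advertising it",
    input_images=[woman],  # assumed parameter name for reference images
    num_inference_steps=50,
)
result.images[0].save("woman_with_soda.png")
```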