Qwen-Image is a 20B-parameter AI model for image generation and editing, released by Alibaba in August 2025.
The Qwen series includes models like Qwen-7B, Qwen-VL for vision-language tasks, and Qwen-Image. They're part of Alibaba's open-source work and ship under open licenses; Qwen-Image uses Apache 2.0. Qwen is Alibaba’s main push into large language and multimodal AI, and it’s often compared to Meta’s LLaMA, Google’s Gemini, and OpenAI’s GPT.
It’s built to render tricky text inside images, Chinese included, and to handle detailed image edits. It works both for making new pictures and for changing existing ones.
You can prompt it for different art styles. It covers a lot: photorealistic images, anime looks, minimalist designs, even impressionist painting. And it keeps text sharp whatever the script, English or Chinese.
The model combines a vision-language component, a custom layout tool, and its main image engine. That helps it keep fonts, layout, and other design elements aligned. It’s useful for things like posters, slides, or app mock-ups.
Beyond generation, it also edits images well: style changes, pose fixes, adding or removing objects, even tweaking small details. It stays consistent across edits too.
It also understands what’s in a picture. It can detect objects, find edges, estimate depth, switch viewpoints, and upscale low-res images.
Qwen-Image achieved top scores on benchmarks such as GenEval, ImgEdit, and ChineseWord.
It is natively supported in ComfyUI.
If you'd like to access this model, you can explore the following options:
Qwen-Image Boring Reality: experimental LoRAs that help you create super realistic images with Qwen-Image.
Qwen-Image distilled: a version distilled to run in 8 steps. You get nearly the same image quality with more than 50% less compute.
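To get a feel for where a figure like "more than 50% less compute" can come from, here is a rough back-of-the-envelope sketch. It assumes that sampling cost scales linearly with the number of denoising steps and ignores fixed costs such as text encoding and image decoding. The page does not state the baseline step count, so the values 16, 20, and 50 below are purely illustrative, and `compute_saving` is a hypothetical helper, not part of any Qwen tooling.

```python
def compute_saving(baseline_steps: int, distilled_steps: int = 8) -> float:
    """Return the fraction of sampling compute saved by running fewer steps.

    Assumption (not from the model page): cost is proportional to the
    number of denoising steps, with identical cost per step.
    """
    if baseline_steps <= 0 or distilled_steps <= 0:
        raise ValueError("step counts must be positive")
    return 1.0 - distilled_steps / baseline_steps


if __name__ == "__main__":
    # Illustrative baseline step counts; the real baseline is not given here.
    for baseline in (16, 20, 50):
        saving = compute_saving(baseline)
        print(f"{baseline} steps -> 8 steps: {saving:.0%} less compute")
```

Under this simple model, any baseline above 16 steps gives a saving of more than 50% at 8 steps, which is consistent with the claim above, though the actual numbers depend on the sampler settings used for comparison.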