ACE-Step
ACE-Step is a free AI tool that turns text into full songs fast. Open-source and ready for remixing.
Overview
ACE-Step is an AI model that makes full-length tracks in under half a minute. You type a prompt and up to 4 minutes of music comes back in about 20 seconds (if you're lucky enough to be using something like an A100 GPU). No more waiting around or getting half-finished loops. With slower consumer-grade hardware you have to be more patient.
Hardware Performance
Device | 27 Steps | 60 Steps |
---|---|---|
NVIDIA A100 | 27.27x | 12.27x |
RTX 4090 | 34.48x | 15.63x |
RTX 3090 | 12.76x | 6.48x |
M2 Max | 2.27x | 1.03x |
RTF (Real-Time Factor) shown - higher values indicate faster generation
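To make the table easier to read, here is a minimal sketch that turns RTF into an estimated wait time. It assumes RTF here means seconds of audio produced per second of compute, which matches the "higher is faster" note above; the function name `generation_seconds` is made up for illustration and is not part of ACE-Step.

```python
# RTF values copied from the table above (27-step and 60-step columns).
RTF = {
    "NVIDIA A100": {27: 27.27, 60: 12.27},
    "RTX 4090": {27: 34.48, 60: 15.63},
    "RTX 3090": {27: 12.76, 60: 6.48},
    "M2 Max": {27: 2.27, 60: 1.03},
}


def generation_seconds(device, steps, audio_seconds):
    """Estimate wall-clock time, assuming RTF = audio duration / generation time."""
    return audio_seconds / RTF[device][steps]


if __name__ == "__main__":
    # Estimate how long a 4-minute (240 s) track takes on each device.
    for device in RTF:
        for steps in (27, 60):
            t = generation_seconds(device, steps, 240)
            print(f"{device:12s} {steps} steps: ~{t:.1f} s for a 4-minute track")
```

Under that assumption, a 4-minute track on an A100 at 60 steps works out to roughly 20 seconds, which lines up with the headline figure, while an M2 Max at 60 steps runs at close to real time.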
It runs on diffusion, like what you’d see in image AI, and pairs it with a compression autoencoder from Sana and a lightweight transformer. The combo keeps things quick without killing quality.
You can control how long the song is and mess with the melody, rhythm, and harmony.
The model supports various training-free applications:
- retake: regenerate a variation of the same song.
- repaint: regenerate a specific part of the song.
- edit: modify the lyrics of the song.
Got an old track? You can tweak it, remix it or use this thing as a base to build voice cloning tools or music apps. It's all open-source under Apache 2.0 so devs can do whatever they want with it.
It's not perfect. Go too long and the structure can get a bit mushy. Weird instruments or styles like Chinese rap? Might sound off. Output quality also changes based on the seed so it's a bit of a lucky draw.
🔔 Important Notice: The only official website for the ACE-Step project is their GitHub Pages site. They do not currently operate any other websites.
Supported Languages
- Arabic
- Chinese
- Czech
- Dutch
- English
- French
- German
- Hindi
- Hungarian
- Italian
- Japanese
- Korean
- Polish
- Portuguese
- Russian
- Spanish
- Turkish
Tags
Freeware, Apache License 2.0, PC-based, #Voice & Audio
This tool is free to use when installed locally and is offered under Apache License 2.0.
Plan Name | Tier Type |
---|---|
Free | free |
Most folks reacting to ACE-Step are surprised at how solid this thing is for being open-source and only 3.5B params. It’s quick, fun, and decent for casual use. That said, nobody’s calling it pro-level yet.
What people like
Speed. It’s super fast. Even users on RTX 3090 cards are cranking out full songs in under 30 seconds.
Free and open. No paywall, no hidden terms, just download and go.
Simple pop tracks work best. The model nails basic catchy songs.
Random is fun. That “gacha-style” randomness? Makes it a blast to play with different prompts and seeds.
What’s not quite working yet
Vocals feel robotic. Voices sound flat and sometimes distorted.
Results vary. Some prompts work great, others flop hard.
Genre blind spots. Death metal and fast rap? It struggles.
Lyrics don’t always sync. Especially with languages like German or Japanese, the model taps out early.
Control could be better. Users want more say over intros, outros, volume, and overall flow.
User tips and notes
ComfyUI users report more compression artifacts than the site demo.
Tags like [JP] or [RU] can help guide lyric languages.
Length settings are a bit weird: some users got longer songs than expected without realizing why.
ACE-Step is a fun, fast model for anyone dabbling in AI music. It’s not polished, but it gets the job done if you're into tinkering, remixing, or just messing with music prompts. [ Reddit: 1, 2 ]
Useful Links
Ace-Step Audio Model Native Support in ComfyUI
Tutorial
ComfyUI now supports Ace-Step natively. This documentation explains how to start running this model.
This page was last updated on May 13, 2025 at 8:36 PM