ACE-Step

ACE-Step is a free AI tool that turns text into full songs fast. Open-source and ready for remixing.

Overview

ACE-Step is an AI model that makes full-length tracks in under half a minute. You type a prompt and, boom, 4 minutes of music comes back in about 20 seconds (if you're lucky enough to be using something like an A100 GPU). No more waiting around or getting half-finished loops. On slower hardware, like an RTX 3090 or an M2 Max, you have to be more patient.

Hardware Performance

Device	27 Steps	60 Steps
NVIDIA A100	27.27x	12.27x
RTX 4090	34.48x	15.63x
RTX 3090	12.76x	6.48x
M2 Max	2.27x	1.03x

Values are RTF (Real-Time Factor). Higher values indicate faster generation.
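
To turn those RTF numbers into wait times, here is a rough sketch. It assumes RTF means seconds of audio produced per second of compute, which the page does not spell out but which fits "higher is faster" and the "4 minutes in about 20 seconds on an A100" figure (that figure lines up with the 60-step column). The function name `generation_seconds` is made up for illustration.

```python
# RTF values copied from the table above.
# Assumption: RTF = seconds of audio generated per second of compute.
RTF = {
    "NVIDIA A100": {27: 27.27, 60: 12.27},
    "RTX 4090": {27: 34.48, 60: 15.63},
    "RTX 3090": {27: 12.76, 60: 6.48},
    "M2 Max": {27: 2.27, 60: 1.03},
}


def generation_seconds(device: str, steps: int, audio_seconds: float) -> float:
    """Estimate how long generating `audio_seconds` of music takes."""
    return audio_seconds / RTF[device][steps]


# Estimated wait for a 4-minute (240-second) track on each device.
for device, by_steps in RTF.items():
    estimates = ", ".join(
        f"{steps} steps: {generation_seconds(device, steps, 240):.1f} s"
        for steps in by_steps
    )
    print(f"{device}: {estimates}")
```

Under that assumption, a 4-minute track at 60 steps takes roughly 20 seconds on an A100 and close to 4 minutes on an M2 Max, so the M2 Max is barely faster than real time at that setting.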

It runs on diffusion, the same kind of tech you’d see in image AI, and pairs it with Sana’s Deep Compression AutoEncoder (DCAE) and a lightweight transformer. The combo keeps things quick without killing quality.

You can control how long the song is and mess with the melody, rhythm, and harmony.

The model supports various training-free applications:

retake: regenerate a variation of the same song.
repaint: regenerate a specific part of the song.
edit: modify the lyrics of the song.

Got an old track? You can tweak it, remix it or use this thing as a base to build voice cloning tools or music apps. It's all open-source under Apache 2.0 so devs can do whatever they want with it.

It's not perfect. Go too long and the structure can get a bit mushy. Weird instruments or styles like Chinese rap? Might sound off. Output quality also changes based on the seed so it's a bit of a lucky draw.

🔔 Important Notice: The only official website for the ACE-Step project is their GitHub Pages site. They do not currently operate any other websites.

Supported Languages

Arabic
Chinese
Czech
Dutch
English
French
German
Hindi
Hungarian
Italian
Japanese
Korean
Polish
Portuguese
Russian
Spanish
Turkish

Links

Educators and Trainers
Creative Professionals
Content Creators
Media and Film Makers
Marketing and Branding Specialists
Voice and Audio Professionals
Developers and Tech Creators
Nonprofit and Advocacy Creators
Small Business Owners
Entertainment and Performance Artists
Professional Content Creators

This tool is free to use when installed locally and is offered under Apache License 2.0.

Most folks reacting to ACE-Step are surprised at how solid this thing is for being open-source and only 3.5B params. It’s quick, fun, and decent for casual use. That said, nobody’s calling it pro-level yet.

What people like

Speed. It’s super fast. Even users on RTX 3090 cards are cranking out full songs in under 30 seconds.
Free and open. No paywall, no hidden terms, just download and go.
Simple pop tracks work best. The model nails basic catchy songs.
Random is fun. That “gacha-style” randomness? Makes it a blast to play with different prompts and seeds.

What’s not quite working yet

Vocals feel robotic. Voices sound flat and sometimes distorted.
Results vary. Some prompts work great, others flop hard.
Genre blind spots. Death metal and fast rap? It struggles.
Lyrics don’t always sync. Especially with languages like German or Japanese, the model taps out early.
Control could be better. Users want more say over intros, outros, volume, and overall flow.

User tips and notes

ComfyUI users report more compression artifacts than the site demo.
Tags like [JP] or [RU] can help guide lyric languages.
Length settings are a bit weird. Some users got longer songs than expected without realizing why.
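
If you want to try the language-tag tip on a whole set of lyrics, a small helper like this one saves some typing. It is only a sketch: the tip does not say where the tag should go, so putting it at the start of every non-empty line is an assumption, and `tag_lyrics` is a made-up name, not part of ACE-Step.

```python
def tag_lyrics(lyrics: str, tag: str) -> str:
    """Prefix every non-empty lyric line with a language tag like [JP].

    Assumption: the tag goes at the start of each line. Blank lines are
    kept as-is so verse spacing is preserved.
    """
    tagged_lines = []
    for line in lyrics.splitlines():
        stripped = line.strip()
        if stripped:
            tagged_lines.append(f"[{tag}] {stripped}")
        else:
            tagged_lines.append("")
    return "\n".join(tagged_lines)


# Example: tag two short verses as Russian before pasting them in as lyrics.
print(tag_lyrics("first line\nsecond line\n\nthird line", "RU"))
```

Check the output against whatever interface you use before relying on it, since tag handling may differ between the demo and ComfyUI.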

ACE-Step is a fun, fast model for anyone dabbling in AI music. It’s not polished, but it gets the job done if you're into tinkering, remixing, or just messing with music prompts. [Reddit: 1, 2]

Prompt:

A sad-sweet English indie song with soft, whispery singing. Sounds close and lo-fi, like someone recording in their bedroom while going through stuff they’re not ready to move on from. The singer thinks back on a thing they had after hearing an old voicemail. The song plays out like a diary... bits of late-night drives, smoking in the rain, and promises that didn’t last. No yelling, just quiet sadness and leftover feelings hanging around like static on a tape.

Generated on July 27, 2025:

It seems to generate lyrics in a made-up language even when the prompt asks for English, so supply your own lyrics.

Prompt:

A mid-tempo electronic track blending traditional accordion melodies with ambient synth pads and a trip-hop drum loop. Slightly moody but groovy.

Generated on July 27, 2025:

Pretty great result. The accordion is there throughout.

Useful Links

Web Interface for ACE-Step and others

Other

A single Gradio + React WebUI with extensions for ACE-Step, Kimi Audio, Piper TTS, GPT-SoVITS, CosyVoice, XTTSv2, DIA, Kokoro, OpenVoice, ParlerTTS, Stable Audio, MMS, StyleTTS2, MAGNet, AudioGen, MusicGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, and Bark

Added on: August 7, 2025

Ace-Step Audio Model Native Support in ComfyUI

Tutorial

ComfyUI now supports Ace-Step natively. This documentation explains how to start running this model.

Added on: May 13, 2025

This page was last updated on August 7, 2025 at 9:40 AM
