HeartTranscriptor audio model

Name: HeartTranscriptor
Also Known As: Heart Transcriptor, HeartTranscriptor-oss
Licence: Apache License 2.0
Creator: HeartMuLa Team

HeartTranscriptor came out in January 2026. It’s part of the HeartMuLa setup and handles the audio-to-text part.

You give it an audio clip and it spits out the words. It works on sung lyrics and on plain speech, so calling it a lyrics transcriber undersells it.

You don’t need a powerful rig. It runs on 6–8 GB of VRAM, and it even works on CPU if you don’t mind it being slow.
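If you want to check your machine against that guidance, here is a minimal Python sketch. It assumes the low end of the 6–8 GB range as the cutoff. `pick_device` and `detect_vram_gb` are made-up helper names, not part of HeartTranscriptor or ComfyUI, and the PyTorch lookup is optional.

```python
from typing import Optional

MIN_VRAM_GB = 6.0  # assumption: low end of the 6-8 GB range quoted above


def detect_vram_gb() -> Optional[float]:
    """Return total memory of GPU 0 in decimal GB, or None if no CUDA GPU is visible."""
    try:
        import torch  # optional; the rest of the sketch works without it
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # Decimal GB on purpose: cards often report a bit less than their nominal size.
    return torch.cuda.get_device_properties(0).total_memory / 1e9


def pick_device(vram_gb: Optional[float], min_gb: float = MIN_VRAM_GB) -> str:
    """Choose "cuda" when there is enough VRAM, otherwise fall back to "cpu"."""
    if vram_gb is not None and vram_gb >= min_gb:
        return "cuda"
    return "cpu"  # slower, but per the text above the model still runs


if __name__ == "__main__":
    print(pick_device(detect_vram_gb()))
```

Treat the cutoff as a starting point: how much memory you actually need depends on clip length and whatever else is loaded in the same session.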

You use it in ComfyUI. Just drag the HeartTranscriptor node in, hook up an audio input, and connect its text output to a node that shows the words.
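The same three-node wiring can be sketched in ComfyUI's API-style workflow format, written here as a Python dict. This is only an illustration. `LoadAudio` is meant to be ComfyUI's stock audio loader, but the `HeartTranscriptor` class name, the `ShowText` display node, the input names, and the file name are all assumptions, so check the real names in your install before using it.

```python
import json

# API-format workflow: node id -> {"class_type", "inputs"}.
# A link to another node is written as [source_node_id, output_index].
workflow = {
    "1": {
        "class_type": "LoadAudio",           # audio loader node
        "inputs": {"audio": "clip.wav"},     # hypothetical file name
    },
    "2": {
        "class_type": "HeartTranscriptor",   # assumed class name for the node
        "inputs": {"audio": ["1", 0]},       # assumed input name; takes node 1's output
    },
    "3": {
        "class_type": "ShowText",            # hypothetical text display node
        "inputs": {"text": ["2", 0]},        # assumed input name; takes node 2's output
    },
}

# Wrapped the way a queued prompt is usually sent to a ComfyUI server.
payload = json.dumps({"prompt": workflow}, indent=2)
print(payload)
```

In the graph editor you get the same result by dragging the wires by hand. The dict form is mainly handy if you want to queue transcriptions from a script.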

It does pretty well with clear audio. Singing can trip it up sometimes: if someone hits a high note it might mishear a word, for example “VRAM” coming out as “VROM.” It’s better with normal talking.
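If you want a number for how often it mishears things on your own clips, one common approach is word error rate (WER): the word-level edit distance between what was actually said and what the model wrote, divided by the length of the reference. Below is a minimal sketch. The function name and the example sentences are made up for illustration, and nothing here comes from HeartTranscriptor itself.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word dropped
                curr[j - 1] + 1,     # extra word inserted
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one wrong word out of seven.
said = "it runs on six gigs of VRAM"
heard = "it runs on six gigs of VROM"
print(round(word_error_rate(said, heard), 3))
```

Running the same clip once sung and once spoken, then comparing the two scores, is a simple way to see the gap described above on your own material.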

It’s easy to plug into your setup. You just swap out your old audio-to-text node for this one. It has fewer settings and looks cleaner too.

Useful Links
HeartMula Transcript Usage Example

Tutorial

How to use this node in ComfyUI