MiniMax Speech 2.8 HD is a text-to-speech model that began appearing online around late January 2026. It's built by MiniMax, a Shanghai-based AI company that also makes Hailuo and Talkie. The model focuses on high-quality voice output suited to professional work such as audiobooks, videos, and voiceovers.

The audio is sharp and clear. It's designed to sound natural and smooth, like a studio recording.

You can add emotional cues too. Type tags such as (laughs) or (sighs) into the text, and the model performs them the way a person would, which makes the result feel more natural.
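Before sending a script for synthesis, it can help to check which cues it contains. The snippet below is a minimal sketch, not part of any official SDK: the helper names and the `KNOWN_TAGS` set are hypothetical, and only (laughs) and (sighs) are taken from the description above. Check the provider's documentation for the full list of supported tags.

```python
import re

# Only these two cues are confirmed by the description above.
# Anything else is an assumption to verify against the provider's docs.
KNOWN_TAGS = {"laughs", "sighs"}

# Matches lowercase parenthesized cues such as "(laughs)".
TAG_PATTERN = re.compile(r"\(([a-z][a-z\- ]*)\)")


def find_tags(script: str) -> list[str]:
    """Return every lowercase parenthesized cue in the script, in order."""
    return TAG_PATTERN.findall(script)


def unknown_tags(script: str) -> list[str]:
    """Return cues that are not in KNOWN_TAGS so they can be reviewed by hand."""
    return [tag for tag in find_tags(script) if tag not in KNOWN_TAGS]


script = "I can't believe it worked (laughs). Well (sighs), back to work (whispers)."
print(find_tags(script))     # ['laughs', 'sighs', 'whispers']
print(unknown_tags(script))  # ['whispers']
```

Flagging unknown cues rather than deleting them keeps the script intact while you confirm whether the model handles them.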

There are more than 17 built-in voices covering different genders, ages, and speaking styles. The model also supports many languages.

The tech behind it includes a Flow-VAE decoder and an autoregressive Transformer, which help make the voice sound fuller.

You can access it through APIs on platforms like WaveSpeedAI and Replicate. They offer SDKs and REST endpoints, so developers can integrate it easily.
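A REST integration usually comes down to one authenticated JSON POST. The sketch below builds such a request with Python's standard library, without sending it. The endpoint URL, the API key, and the `text` and `voice_id` field names are placeholders, not the real WaveSpeedAI or Replicate API, so take the actual values from your provider's documentation.

```python
import json
import urllib.request

# Placeholder values: replace with the endpoint and key from your provider's docs.
ENDPOINT = "https://api.example.com/v1/text-to-speech"
API_KEY = "YOUR_API_KEY"


def build_tts_request(text: str, voice_id: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a speech job."""
    # Field names here are assumptions, not a documented schema.
    payload = {"text": text, "voice_id": voice_id}
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_tts_request("Hello from the audiobook.", "narrator_1")
print(request.get_method(), request.full_url)
# To send it, pass the request to urllib.request.urlopen() and read the response body.
```

Building the request separately from sending it makes the payload easy to inspect and test before any credits are spent.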

It’s made for creators. People use it for voiceovers, audiobooks, and video narration. It also works in apps for things like accessibility or games.

You can adjust pitch, speed, and emotion. There’s also a way to clone a voice from just a short clip.
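If you drive these controls from code, validating them before a request is sent catches mistakes early. The sketch below is illustrative only: the `VoiceSettings` class, the numeric ranges, and the emotion names are all assumptions rather than the model's documented limits.

```python
from dataclasses import asdict, dataclass

# Hypothetical emotion names; the real list comes from the provider's docs.
ALLOWED_EMOTIONS = {"neutral", "happy", "sad", "angry"}


@dataclass(frozen=True)
class VoiceSettings:
    speed: float = 1.0        # 1.0 = normal pace (assumed scale)
    pitch: int = 0            # 0 = unchanged (assumed scale)
    emotion: str = "neutral"

    def __post_init__(self) -> None:
        # The ranges below are assumptions chosen for illustration.
        if not 0.5 <= self.speed <= 2.0:
            raise ValueError(f"speed out of range: {self.speed}")
        if not -12 <= self.pitch <= 12:
            raise ValueError(f"pitch out of range: {self.pitch}")
        if self.emotion not in ALLOWED_EMOTIONS:
            raise ValueError(f"unknown emotion: {self.emotion}")

    def to_payload(self) -> dict:
        """Return the settings as a plain dict ready to merge into a request body."""
        return asdict(self)


settings = VoiceSettings(speed=1.2, emotion="happy")
print(settings.to_payload())  # {'speed': 1.2, 'pitch': 0, 'emotion': 'happy'}
```

Making the dataclass frozen means a settings object cannot drift out of its validated range after it is created.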

Model Performance Editor’s Rating

No editor performance evaluations available for this model yet.
