MiniMax Music 2.6: Four Stories We Want to Tell

Today we release MiniMax Music 2.6.
In past generations, we talked about how much stronger the model had become. This time, we want to tell it differently — through four people, four pieces of music, and four things that weren't possible before. Because a music model's upgrade doesn't live in spec sheets. It lives in "this time, someone used it to do something they couldn't do before."
@XiaoLiRanRan is an independent creator who has spent three years making guofeng (Chinese-style) short videos, always searching for usable guofeng music.
The most distinctive elements of guofeng are precisely the parts AI music has struggled with most.
The vibrato of the erhu, the breath pauses of the dizi, the sweeping of the guzheng, the dynamics of opera vocals — these aren't about instrument categories, they're about performance nuance. Previous AI music could recognize the "guofeng" tag, but the output sounded like a set of Chinese instrument samples mechanically stitched together — no breathing where there should be breathing, no pauses where there should be pauses.
Music 2.6 doesn't just support more instruments — it gives instruments a sense of horizontal, temporal progression. An opening can begin with just the empty space of drums, strings and plucked instruments layering in one by one, with melody and vocals pushing together to the climax. Set the atmosphere first, then bring in the melody — this is the discipline of traditional opera's opening percussion, and it's now how AI music truly understands the entry order of guofeng.
What she can do now: the BGM for a guofeng short video has gone from "finding existing, copyright-ambiguous material" to "writing a fully original score that matches the visual mood in 15 minutes."
@BenMingYanZu is building a solo-developed action game, and his problem is simple: there's no line item for a scoring studio in his budget.
He used to have two options. One was to spend thousands on a sample library, but the "epic battle music" in a sample library is just a handful of tracks — he'd know from day one that players would hear repeats. The other was AI music generation, but previous AI music couldn't deliver the epic feel — drums were "loud" but not "heavy," the low end was muddy, and it couldn't drive the emotional pressure a boss fight demands.
Music 2.6's mid-to-low frequency range has been specifically optimized. Bass and drums have noticeably improved in sub-bass depth and tightness. In practical listening terms: on bass-heavy headphones, in a car's speakers, or on a gamer's sound system, the drums and bass won't blur into mush. They can actually push the visuals forward.
Combined with 2.6's improved understanding of song structure, he can now write in his prompt: "Start with an oppressive atmosphere, gradually transition to an awakening of power, then build to an eruption of invincibility" — and the model actually follows that structure.
What he can do now: score an entire boss-fight soundtrack for an indie game, cutting the cost from thousands of dollars to a single afternoon of his own time.
@MissNanFangYi runs a specialty coffee shop and needs a playlist for the 2 PM to 6 PM window every day, but she's never been able to find the right one.
Streaming playlists labeled "cafe music" are either too "background" — so smooth they have no presence, might as well not be playing — or too "performative" — a sax solo so assertive it drowns out customers' conversations. What a cafe actually needs is music that meets a counterintuitive standard: it must be good enough to be noticed, yet restrained enough not to be disliked. That "just right" balance is something previous AI music could almost never achieve.
Music 2.6 has a subtle but important change in how it handles vocals and melody: it allows "imprecision." In the right stylistic range — lo-fi, indie folk, indie jazz — that imprecision becomes the breathing rhythm of the groove.
In her track "Desert Race," the vocals carry a late-night sense of casualness and defiance, like another self singing a song written for herself. The low-frequency bass and drums are nearly equal in energy to the midrange vocals, while the highs are deliberately suppressed into a warm, dark tone. Play it in a cafe and no one will be annoyed — but occasionally a customer will pause and ask, "Who's singing this?"
What she can do now: stop endlessly browsing playlists and simply tell 2.6 the mood and atmosphere she wants — "late-night feel, urban, slightly tipsy, not too bright."
Of the four, @NYX is the one who, until now, was furthest from being able to do what she wanted.
She can't arrange music, doesn't know any musicians, and has a limited budget. What she wants to do is simple — take her mom's favorite song from when she was young and remake it in her own style, as a birthday surprise.
This wasn't possible before. Not because AI couldn't generate music, but because she didn't want "a new song" — she wanted "that song in a different form." The melody had to be the one her mom would recognize; only the style, arrangement, and atmosphere would change. This is a task that requires using an existing song as a precise constraint.
This is Music 2.6's new feature: Cover.
You upload a song, and the model precisely extracts its melodic skeleton, then lets you decide everything beyond that skeleton — style can jump from folk to heavy metal, arrangement can shift from classical symphony to cyberpunk electronic, and you can even keep the melody while completely replacing the lyrics.
"Auld Lang Syne" original:
Cover version:
What she can do now: the night before her mom's birthday, spend half an hour creating a gift that used to require an entire arranging team.
Those were four stories. But Music 2.6's upgrades extend far beyond these four people —
• First-packet latency reduced to under 20 seconds. After you finish writing your prompt, just one deep breath later you'll hear the first feedback. The "waiting" feeling is essentially gone.
• Instruction control comprehensively enhanced. BPM, key, song structure, emotional arc: all of these can be written into your prompt and accurately executed by the model. Whatever specifics you write down, the model takes seriously.
• Mid-to-low frequency acoustics systematically optimized. Beyond the game-scoring scenario mentioned earlier, any style that demands strong low end, from House to Trap to Drum & Bass, will directly benefit.
These capabilities appeared in all four stories above, but their applicability extends far beyond those four.
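For developers who call the generation API directly, those controllable attributes can be assembled into a prompt string programmatically. The sketch below is illustrative only: the helper function, its parameters, and the example values are our own naming for this post, not part of any MiniMax SDK.

```python
# Illustrative sketch: assembling Music 2.6's controllable attributes
# (style, BPM, key, structure, emotional arc) into one prompt string.
# Function and field names are hypothetical, not a MiniMax API.

def build_music_prompt(style, bpm, key, structure, emotional_arc):
    """Combine controllable attributes into a single prompt string."""
    parts = [
        style,
        f"{bpm} BPM",
        f"key of {key}",
        f"structure: {structure}",
        f"emotional arc: {emotional_arc}",
    ]
    return ", ".join(parts)

prompt = build_music_prompt(
    style="epic orchestral battle theme",
    bpm=140,
    key="D minor",
    structure="intro, build, climax, outro",
    emotional_arc="oppressive atmosphere, then awakening, then eruption",
)
print(prompt)
```

Keeping each attribute as an explicit field makes it easy for an application to expose BPM or key as user-facing settings while the rest of the prompt stays fixed.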
All four people above worked directly with Music 2.6. But if you're an AI Agent developer, you're probably not thinking about making a song yourself — you're thinking about how to let your Agent make a song.
That's what the three Music Skills we're open-sourcing alongside this release are designed to solve:
• minimax-music-gen: Give your Agent complete music generation capabilities. Describe what you need in one sentence, and the Agent automatically identifies intent, selects the mode (original/instrumental/Cover), and calls the generation API.
• minimax-music-playlist: Turn your Agent into a personal music curator. It scans your local music apps, builds a profile of your taste, and generates an entire custom playlist for you.
• buddy-sings: Let your virtual companion sing. Integrated with OpenClaw, the Agent reads your defined character persona, builds a unique voice identity, and improvises songs in first person as that character. Give it a try — let Moth, who's been working alongside you, sing a song ~
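The intent-recognition step described for minimax-music-gen can be pictured roughly as a routing function that maps a one-sentence request to a generation mode. This is a simplified sketch under our own assumptions (the keyword lists and mode names are illustrative), not the skill's actual implementation.

```python
# Simplified sketch of intent-to-mode routing, in the spirit of the
# minimax-music-gen skill described above. Keyword lists and return
# values are illustrative assumptions, not the skill's real code.

def select_mode(request: str) -> str:
    """Pick a generation mode from a one-sentence user request."""
    text = request.lower()
    if "cover" in text or "remake" in text:
        return "cover"          # reuse an uploaded song's melodic skeleton
    if "instrumental" in text or "no vocals" in text or "bgm" in text:
        return "instrumental"   # music only, no singing
    return "original"           # default: original song with vocals

print(select_mode("Remake mom's favorite song in a jazz style"))  # cover
```

A real skill would hand the classified mode, along with the rest of the request, to the generation API; the point here is only that mode selection can happen before any model call.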
None of these four stories was scripted. They were things these four people wanted to do themselves; we just used AI to make them possible.
Starting today, MiniMax Music 2.6 opens its global creative beta — 14 days free, and you're invited to create.
• Consumer users: 500 free creations per account per day
• Developers: Existing Token Plan users receive an additional 100 free API calls per day
Behind every piece of music is someone becoming a music creator for the first time in the age of AI.
What's the thing you've been wanting to do?
Product: minimaxi.com/audio/music
API: platform.minimaxi.com/docs/api-reference/music-generation
Intelligence with Everyone.