feat: add audio file save support to SpeechSDK and MusicSDK #147

Open
NianJiuZst wants to merge 1 commit into MiniMax-AI:main from NianJiuZst:feat/sdk-audio-save

@NianJiuZst
Contributor

Summary

Closes #146

Adds save() methods to SpeechSDK and MusicSDK that decode hex-encoded audio from API responses and write it to disk.

Motivation

Both synthesize() and generate() return hex strings in response.data.audio. Saving these to files currently requires users to:

  1. Validate hex format manually
  2. Decode with Buffer.from(hex, 'hex')
  3. Create output directories
  4. Generate filenames
  5. Handle disk-full errors

This is boilerplate the CLI already handles. The SDK should too.
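
For reference, the manual steps listed above might look roughly like the sketch below. It is an illustration only, not the code added by this PR: `decodeHexAudio` and `saveHexAudio` are hypothetical names, and the error messages are assumptions. The explicit validation matters because `Buffer.from(hex, 'hex')` does not throw on bad input; it silently truncates at the first invalid character.

```typescript
import { mkdirSync, writeFileSync } from 'node:fs';
import { dirname, resolve } from 'node:path';

// Hypothetical helper (not part of the SDK): validate and decode a hex string.
export function decodeHexAudio(hex: string): Buffer {
  // Reject empty input, odd lengths, and non-hex characters up front,
  // because Buffer.from(..., 'hex') would silently truncate instead of failing.
  if (hex.length === 0 || hex.length % 2 !== 0 || !/^[0-9a-fA-F]+$/.test(hex)) {
    throw new Error('Audio payload is not a valid hex string');
  }
  return Buffer.from(hex, 'hex');
}

// Hypothetical helper (not part of the SDK): decode and write to outPath.
export function saveHexAudio(hex: string, outPath: string): string {
  const bytes = decodeHexAudio(hex);

  // Create any missing parent directories before writing.
  const target = resolve(outPath);
  mkdirSync(dirname(target), { recursive: true });

  try {
    writeFileSync(target, bytes);
  } catch (err) {
    // Translate a disk-full condition into a clearer message; rethrow anything else.
    if ((err as NodeJS.ErrnoException).code === 'ENOSPC') {
      throw new Error(`Disk full: could not write ${target}`);
    }
    throw err;
  }
  return target;
}
```

Every caller that wants a file on disk ends up writing some variant of this, which is the duplication the new save() methods are meant to remove.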

Usage

import { MiniMaxSDK } from 'mmx-cli/sdk';

const sdk = new MiniMaxSDK({ apiKey: 'sk-...' });

// Speech
const tts = await sdk.speech.synthesize({ text: 'Hello world!' });
const speechPath = sdk.speech.save(tts);
// → speech_2026-05-13-15-30-00.mp3

// Or with explicit path
sdk.speech.save(tts, '/tmp/greeting.wav', 'wav');

// Music
const music = await sdk.music.generate({ prompt: 'Upbeat pop', lyrics: '...' });
const musicPath = sdk.music.save(music);
// → music_2026-05-13-15-31-00.mp3
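
The default filenames in the comments above suggest a `<prefix>_YYYY-MM-DD-HH-mm-ss.<ext>` pattern. A minimal sketch of a generator for that pattern follows. The function name `defaultAudioFilename` is hypothetical, and the use of local time rather than UTC is an assumption; the PR text does not say which the SDK uses.

```typescript
// Hypothetical helper (not part of the SDK): build a timestamped default filename.
// Assumes local time; the actual implementation may use UTC instead.
export function defaultAudioFilename(
  prefix: string,
  ext: string = 'mp3',
  now: Date = new Date(),
): string {
  // Zero-pad single-digit date and time components to two characters.
  const pad = (n: number): string => String(n).padStart(2, '0');
  const timestamp = [
    String(now.getFullYear()),
    pad(now.getMonth() + 1), // getMonth() is zero-based
    pad(now.getDate()),
    pad(now.getHours()),
    pad(now.getMinutes()),
    pad(now.getSeconds()),
  ].join('-');
  return `${prefix}_${timestamp}.${ext}`;
}
```

Accepting `now` as a parameter keeps the function deterministic in tests, which is one way the save test cases could check the generated name without mocking the clock.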

Files changed

| File | Change |
| --- | --- |
| src/sdk/speech/index.ts | +50 lines: save() method + shared hex helpers |
| src/sdk/music/index.ts | +55 lines: save() method + shared hex helpers |
| test/sdk/speech.test.ts | +35 lines: 4 save test cases |
| test/sdk/music.test.ts | +35 lines: 4 save test cases |

Screenshots

Add `save()` methods that decode hex-encoded audio from API responses
and write it to disk, matching the CLI's audio output behavior:

SpeechSDK.save(response, outPath?, ext?)
- Decodes hex audio data from synthesize() responses
- Validates hex format before decoding
- Creates intermediate directories as needed
- Generates timestamp-based default filenames (speech_<ts>.mp3)
- Handles disk-full (ENOSPC) errors gracefully

MusicSDK.save(response, outPath?, ext?)
- Same hex-decode logic for generate() responses
- Default filename prefix: music_<ts>.mp3
- Same validation and error handling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

SpeechSDK and MusicSDK lack audio file save capability
