Audio Models - Parameters Reference

This reference lists the parameters accepted by each audio generation model in the Scenario API. Every model is identified by a unique modelId and exposes its own set of parameters for controlling generation.

Each section below gives the model's modelId, followed by a table of its parameters: input name, label, type, default, minimum and maximum or allowed values where they apply, and notes on what the parameter does.

ElevenLabs

ElevenLabs V3

Model ID: model_elevenlabs-tts-v3

| Input | Label | Type | Default | Min | Max | Allowed Values | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| text | Text | string | | | | | Required. Up to 40k characters |
| voice | Voice | select | Aria | | | "Aria", "Roger", "Sarah", "Laura", "Charlie", "George", "Callum", "River", "Liam", "Charlotte", "Alice", "Matilda", "Will", "Jessica", "Eric", "Chris", "Brian", "Daniel", "Lily", "Bill" | |
| stability | Stability | number | 0.5 | 0 | 1 | | |
| similarity_boost | Similarity Boost | number | 0.5 | 0 | 1 | | |
| style | Style Exaggeration | number | 0 | 0 | 1 | | |
| speed | Speed | number | 1 | 0.7 | 1.2 | | <1 slows; >1 speeds up |
| previous_text | Previous Text | string | | | | | Optional context |
| next_text | Next Text | string | | | | | Optional context |
| language_code | Language Code | select | "" | | | ISO 639-1 codes | |
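
As an illustration of how the defaults and ranges in the table above might be applied on the client side, the following sketch builds a parameter dictionary and clamps numeric settings into their documented ranges. The helper name `build_elevenlabs_v3_params` is hypothetical, the reading of "40k" as 40,000 characters is an assumption, and how the resulting dictionary is submitted to the Scenario API is outside the scope of this sketch.

```python
# Defaults and bounds copied from the ElevenLabs V3 table: (default, min, max).
ELEVENLABS_V3_RANGES = {
    "stability": (0.5, 0.0, 1.0),
    "similarity_boost": (0.5, 0.0, 1.0),
    "style": (0.0, 0.0, 1.0),
    "speed": (1.0, 0.7, 1.2),
}
MAX_TEXT_CHARS = 40_000  # assumption: "40k" means 40,000 characters


def build_elevenlabs_v3_params(text, voice="Aria", **settings):
    """Hypothetical helper: return a parameter dict with numbers clamped to range."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    unknown = set(settings) - set(ELEVENLABS_V3_RANGES)
    if unknown:
        raise ValueError(f"unsupported settings: {sorted(unknown)}")

    params = {"text": text, "voice": voice}
    for name, (default, low, high) in ELEVENLABS_V3_RANGES.items():
        value = settings.get(name, default)
        params[name] = min(max(value, low), high)  # clamp into [low, high]
    return params


params = build_elevenlabs_v3_params("Hello there", speed=2.0, stability=-1)
print(params["speed"], params["stability"])
```

Clamping is a design choice made for this example. The table does not say how out-of-range values are handled by the API, so raising an error instead may be the safer option in production code.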

ElevenLabs Multilingual v2

Model ID: model_elevenlabs-multilingual-v2

| Input | Label | Type | Default | Min | Max | Allowed Values | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| text | Text | string | | | | | Required. Up to 40k characters |
| voice | Voice | select | Aria | | | "Aria", "Roger", "Sarah", "Laura", "Charlie", "George", "Callum", "River", "Liam", "Charlotte", "Alice", "Matilda", "Will", "Jessica", "Eric", "Chris", "Brian", "Daniel", "Lily", "Bill" | |
| stability | Stability | number | 0.5 | 0 | 1 | | |
| similarity_boost | Similarity Boost | number | 0.5 | 0 | 1 | | |
| style | Style Exaggeration | number | 0 | 0 | 1 | | |
| speed | Speed | number | 1 | 0.7 | 1.2 | | <1 slows; >1 speeds up |
| previous_text | Previous Text | string | | | | | Optional context |
| next_text | Next Text | string | | | | | Optional context |
| language_code | Language Code | select | "" | | | ISO 639-1 codes | |
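
The previous_text and next_text inputs are described only as "optional context". One plausible use, assuming they are meant to carry the text surrounding the segment being synthesized, is to split a long script into segments and pass each segment's neighbours along with it. The sketch below shows that idea. The function name and the `context_chars` limit are illustrative choices, not documented values.

```python
def segments_with_context(segments, context_chars=200):
    """Hypothetical helper: one parameter dict per segment, with neighbouring text.

    previous_text gets the tail of the preceding segment and next_text the head
    of the following one. The first and last segments omit the missing side.
    """
    requests = []
    for i, segment in enumerate(segments):
        params = {"text": segment}
        if i > 0:
            params["previous_text"] = segments[i - 1][-context_chars:]
        if i < len(segments) - 1:
            params["next_text"] = segments[i + 1][:context_chars]
        requests.append(params)
    return requests


script = ["It was late.", "Nobody answered the door.", "She knocked again."]
for request in segments_with_context(script):
    print(request)
```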

Google Lyria 2

Model ID: model_lyria-2

| Input | Label | Type | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | Prompt | string | | Required. Up to 2048 chars |
| negative_prompt | Negative Prompt | string | | Excludes elements from generation |
| seed | Seed | number | | Optional reproducible seed |

Meta MusicGen

Model ID: model_meta-musicgen

| Input | Label | Type | Default | Min | Max | Allowed Values | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| model_version | Model Version | select | stereo-melody-large | | | stereo-melody-large, stereo-large, melody-large, large | |
| prompt | Prompt | string | | | | | Required if no input_audio |
| input_audio | Input Audio | file | | | | | Optional conditioning |
| duration | Duration | number | 8 | 1 | 30 | | Seconds |
| continuation | Continuation | boolean | false | | | | Continues from input_audio |
| continuation_start | Start | number | 0 | 0 | | | Start time (s) |
| continuation_end | End | number | | 0 | | | Defaults to end |
| multi_band_diffusion | Multi Band Diffusion | boolean | false | | | | Only for non-stereo models |
| normalization_strategy | Normalization Strategy | select | loudness | | | loudness, clip, peak, rms | |
| temperature | Temperature | number | 1 | | | | Controls diversity |
| classifier_free_guidance | Guidance | number | 3 | 0 | 10 | | Higher = more faithful |
| seed | Seed | number | | | | | Optional RNG seed |
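
Several MusicGen inputs depend on one another: prompt is required only when input_audio is absent, and multi_band_diffusion applies only to non-stereo model versions. A minimal sketch of checking these cross-field rules before submitting a request might look like the following. The function name is hypothetical, and identifying stereo versions by the "stereo" prefix is an assumption based on the version names in the table.

```python
MODEL_VERSIONS = ("stereo-melody-large", "stereo-large", "melody-large", "large")


def check_musicgen_params(params):
    """Hypothetical helper: return a list of rule violations (empty if none)."""
    errors = []
    if not params.get("prompt") and not params.get("input_audio"):
        errors.append("prompt is required when input_audio is not provided")

    version = params.get("model_version", "stereo-melody-large")
    if version not in MODEL_VERSIONS:
        errors.append(f"unknown model_version: {version}")
    elif params.get("multi_band_diffusion") and version.startswith("stereo"):
        errors.append("multi_band_diffusion is only for non-stereo models")

    duration = params.get("duration", 8)
    if not 1 <= duration <= 30:
        errors.append("duration must be between 1 and 30 seconds")
    return errors


print(check_musicgen_params({"prompt": "lofi beat", "duration": 12}))
print(check_musicgen_params({"multi_band_diffusion": True}))
```

Returning a list rather than raising on the first problem lets a caller report every violation at once.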

Minimax

Minimax Music 01

Model ID: model_minimax-music-01

| Input | Label | Type | Default | Allowed Values | Notes |
| --- | --- | --- | --- | --- | --- |
| lyrics | Lyrics | string | "" | | Required. Supports newline and ## for accompaniment |
| song_file | Song File | file | | | Must be >15s |
| voice_file | Voice File | file | | | Required if lyrics given |
| instrumental_file | Instrumental File | file | | | Instrumental reference |
| sample_rate | Sample Rate | number | 44100 | 16000, 24000, 32000, 44100 | |
| bitrate | Bitrate | number | 256000 | 32000, 64000, 128000, 256000 | |

Minimax Music 1.5

Model ID: model_minimax-music-1.5

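| Input | Label | Type | Default | Allowed Values | Notes |
| --- | --- | --- | --- | --- | --- |
| prompt | Prompt | string | | | Required. 10-300 characters |
| lyrics | Lyrics | string | | | 10-600 characters |
| sample_rate | Sample Rate | number | 44100 | 16000, 24000, 32000, 44100 | |
| bitrate | Bitrate | number | 256000 | 32000, 64000, 128000, 256000 | |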
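
The length limits and fixed value lists for Minimax Music 1.5 can be checked locally before a request is made. The sketch below is one way to do so. The helper name is hypothetical, and because the table does not say whether lyrics is required, this example treats it as optional, which is an assumption.

```python
SAMPLE_RATES = (16000, 24000, 32000, 44100)
BITRATES = (32000, 64000, 128000, 256000)


def build_music_15_params(prompt, lyrics=None, sample_rate=44100, bitrate=256000):
    """Hypothetical helper: validate lengths and enumerated values from the table."""
    if not 10 <= len(prompt) <= 300:
        raise ValueError("prompt must be 10-300 characters")
    if lyrics is not None and not 10 <= len(lyrics) <= 600:
        raise ValueError("lyrics must be 10-600 characters")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError(f"sample_rate must be one of {SAMPLE_RATES}")
    if bitrate not in BITRATES:
        raise ValueError(f"bitrate must be one of {BITRATES}")

    params = {"prompt": prompt, "sample_rate": sample_rate, "bitrate": bitrate}
    if lyrics is not None:
        params["lyrics"] = lyrics
    return params


print(build_music_15_params("Upbeat indie pop with bright guitars", bitrate=128000))
```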

Minimax Speech 02 HD

Model ID: model_minimax-speech-02-hd

| Input | Label | Type | Default | Min | Max | Allowed Values |
| --- | --- | --- | --- | --- | --- | --- |
| text | Text | string | | | | |
| voice_id | Voice Id | select | Wise_Woman | | | Wise_Woman, Friendly_Person, Inspirational_girl, Deep_Voice_Man, Calm_Woman, Casual_Guy, Lively_Girl, Patient_Man, Young_Knight, Determined_Man, Lovely_Girl, Decent_Boy, Imposing_Manner, Elegant_Man, Abbess, Sweet_Girl_2, Exuberant_Girl |
| speed | Speed | number | 1 | 0.5 | 2 | |
| volume | Volume | number | 1 | 0 | 10 | |
| pitch | Pitch | number | 0 | -12 | 12 | |
| emotion | Emotion | select | auto | | | auto, neutral, happy, sad, angry, fearful, disgusted, surprised |
| english_normalization | English Normalization | boolean | false | | | |
| sample_rate | Sample Rate | number | 32000 | | | 8000, 16000, 22050, 24000, 32000, 44100 |
| bitrate | Bitrate | number | 128000 | | | 32000, 64000, 128000, 256000 |
| channel | Channel | select | mono | | | mono, stereo |
| language_boost | Language Boost | select | Automatic | | | (list of 25 language options) |
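
A table like the one above maps naturally onto a small declarative spec: each parameter has a default and either a numeric range or a list of allowed values. The sketch below encodes a subset of the Minimax Speech 02 HD parameters that way and rejects out-of-spec overrides. The `Spec` class and `resolve_speech_params` function are hypothetical names for this example. Parameters not listed in the spec, such as voice_id, are passed through unchecked.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Spec:
    default: object
    low: Optional[float] = None
    high: Optional[float] = None
    choices: Optional[Tuple] = None


# Values copied from the Minimax Speech 02 HD table.
SPEECH_02_SPECS = {
    "speed": Spec(1, low=0.5, high=2),
    "volume": Spec(1, low=0, high=10),
    "pitch": Spec(0, low=-12, high=12),
    "emotion": Spec("auto", choices=("auto", "neutral", "happy", "sad", "angry",
                                     "fearful", "disgusted", "surprised")),
    "sample_rate": Spec(32000, choices=(8000, 16000, 22050, 24000, 32000, 44100)),
    "bitrate": Spec(128000, choices=(32000, 64000, 128000, 256000)),
    "channel": Spec("mono", choices=("mono", "stereo")),
}


def resolve_speech_params(text, **overrides):
    """Hypothetical helper: fill defaults and reject values outside the spec."""
    params = {"text": text, **overrides}
    for name, spec in SPEECH_02_SPECS.items():
        value = overrides.get(name, spec.default)
        if spec.choices is not None and value not in spec.choices:
            raise ValueError(f"{name} must be one of {spec.choices}")
        if spec.low is not None and not spec.low <= value <= spec.high:
            raise ValueError(f"{name} must be between {spec.low} and {spec.high}")
        params[name] = value
    return params


print(resolve_speech_params("Good morning.", pitch=-3, channel="stereo"))
```

Keeping the limits in data rather than in branching code means a second model with a different table needs only a new spec dictionary.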


Minimax Speech 02 Turbo

Model ID: model_minimax-speech-02-turbo

| Input | Label | Type | Default | Min | Max | Allowed Values |
| --- | --- | --- | --- | --- | --- | --- |
| text | Text | string | | | | |
| voice_id | Voice Id | select | Wise_Woman | | | Wise_Woman, Friendly_Person, Inspirational_girl, Deep_Voice_Man, Calm_Woman, Casual_Guy, Lively_Girl, Patient_Man, Young_Knight, Determined_Man, Lovely_Girl, Decent_Boy, Imposing_Manner, Elegant_Man, Abbess, Sweet_Girl_2, Exuberant_Girl |
| speed | Speed | number | 1 | 0.5 | 2 | |
| volume | Volume | number | 1 | 0 | 10 | |
| pitch | Pitch | number | 0 | -12 | 12 | |
| emotion | Emotion | select | auto | | | auto, neutral, happy, sad, angry, fearful, disgusted, surprised |
| english_normalization | English Normalization | boolean | false | | | |
| sample_rate | Sample Rate | number | 32000 | | | 8000, 16000, 22050, 24000, 32000, 44100 |
| bitrate | Bitrate | number | 128000 | | | 32000, 64000, 128000, 256000 |
| channel | Channel | select | mono | | | mono, stereo |
| language_boost | Language Boost | select | Automatic | | | (list of 25 language options) |


MM Audio

Model ID: model_mm-audio

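| Input | Label | Type | Default | Min | Max | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| prompt | Prompt | string | | | | Required |
| video | Video | file | | | | Required |
| negative_prompt | Negative Prompt | string | | | | |
| duration | Duration | number | 8 | 1 | 30 | Seconds |
| num_steps | Steps | number | 25 | 4 | 50 | |
| cfg_strength | Guidance | number | 4.5 | 1 | 20 | Higher = closer to prompt |
| seed | Seed | number | -1 | | | -1 or blank = random |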
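
Since a seed of -1 or a blank value means the seed is chosen at random, a caller who wants to reproduce a result later may prefer to pick a concrete seed locally and record it. The sketch below illustrates that approach. The function name is hypothetical, and the upper bound of 2**31 is an assumption, as the table gives no maximum for seed.

```python
import random

MAX_SEED = 2**31  # assumption: the table does not document an upper bound


def resolve_seed(seed=None, rng=None):
    """Hypothetical helper: turn "random" markers (-1 or None) into a concrete seed."""
    if seed is None or seed == -1:
        if rng is None:
            rng = random.Random()
        return rng.randrange(0, MAX_SEED)
    return seed


chosen = resolve_seed(-1)
params = {"prompt": "footsteps on gravel", "seed": chosen}
print(f"seed used for this request: {chosen}")
```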


Resemble AI Chatterbox

Model ID: model_resemble-ai-chatterbox

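| Input | Label | Type | Default | Min | Max | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| prompt | Text | string | | | | Required |
| audio_prompt | Voice Cloning Reference Audio File | file | | | | Optional reference |
| exaggeration | Exaggeration | number | 0.5 | 0.25 | 2 | Controls emotion strength |
| cfg_weight | Pace Weight | number | 0.5 | 0.2 | 1 | Higher = monotone |
| temperature | Temperature | number | 0.8 | 0.05 | 5 | Randomness |
| seed | Seed | number | 0 | | | 0 = random |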
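
To hear how exaggeration affects the output, it can help to generate the same prompt at several evenly spaced values across the documented range of 0.25 to 2 while holding everything else fixed. The sketch below builds such a set of parameter dictionaries. The function name is hypothetical, and it assumes that a fixed non-zero seed makes runs comparable, which the table implies only indirectly by stating that 0 means random.

```python
EXAGGERATION_MIN, EXAGGERATION_MAX = 0.25, 2.0  # from the Chatterbox table


def exaggeration_sweep(prompt, steps=8, seed=1234):
    """Hypothetical helper: one parameter dict per evenly spaced exaggeration value."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if seed == 0:
        raise ValueError("seed 0 means random; use a non-zero seed for a sweep")

    step = (EXAGGERATION_MAX - EXAGGERATION_MIN) / (steps - 1)
    return [
        {
            "prompt": prompt,
            "exaggeration": round(EXAGGERATION_MIN + i * step, 4),
            "cfg_weight": 0.5,    # table default
            "temperature": 0.8,   # table default
            "seed": seed,
        }
        for i in range(steps)
    ]


for params in exaggeration_sweep("I can't believe it worked!"):
    print(params["exaggeration"])
```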