Audio Models - Parameters Reference

This document is a reference for the parameters accepted by the audio generation models in the Scenario API. Each model has a unique modelId and its own set of parameters that control the generation process.

For each model, the sections below give the modelId followed by a table of its inputs: the parameter name, its label, type, default value, allowed range or values, and notes on what it does.

ElevenLabs

ElevenLabs V3

Model ID: model_elevenlabs-tts-v3

| Input | Label | Type | Default | Min | Max | Allowed Values | Notes |
|---|---|---|---|---|---|---|---|
| text | Text | string | | | | | Required. Up to 40k characters |
| voice | Voice | select | Aria | | | "Aria", "Roger", "Sarah", "Laura", "Charlie", "George", "Callum", "River", "Liam", "Charlotte", "Alice", "Matilda", "Will", "Jessica", "Eric", "Chris", "Brian", "Daniel", "Lily", "Bill" | |
| stability | Stability | number | 0.5 | 0 | 1 | | |
| similarityBoost | Similarity Boost | number | 0.5 | 0 | 1 | | |
| style | Style Exaggeration | number | 0 | 0 | 1 | | |
| speed | Speed | number | 1 | 0.7 | 1.2 | | <1 slows; >1 speeds up |
| previousText | Previous Text | string | | | | | Optional context |
| nextText | Next Text | string | | | | | Optional context |
| languageCode | Language Code | select | "" | | | ISO 639-1 codes | |
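
The numeric ranges above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Scenario API: `build_elevenlabs_params` and `ELEVENLABS_RANGES` are hypothetical names, and it assumes "40k" means 40,000 characters. It only assembles a parameter dictionary; how that dictionary is submitted is outside the scope of this reference.

```python
# Ranges copied from the ElevenLabs V3 table: name -> (default, min, max).
ELEVENLABS_RANGES = {
    "stability": (0.5, 0.0, 1.0),
    "similarityBoost": (0.5, 0.0, 1.0),
    "style": (0.0, 0.0, 1.0),
    "speed": (1.0, 0.7, 1.2),
}

MAX_TEXT_CHARS = 40_000  # assumption: "40k" read as 40,000


def build_elevenlabs_params(text, voice="Aria", **overrides):
    """Hypothetical helper: return a parameter dict, raising on out-of-range values."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")

    params = {"text": text, "voice": voice}
    for name, (default, lo, hi) in ELEVENLABS_RANGES.items():
        value = overrides.pop(name, default)
        if not lo <= value <= hi:
            raise ValueError(f"{name}={value} is outside [{lo}, {hi}]")
        params[name] = value
    # Remaining optional fields (previousText, nextText, languageCode) pass through as given.
    params.update(overrides)
    return params


params = build_elevenlabs_params("Hello there.", voice="Sarah", speed=1.1)
```

Raising on an out-of-range value, rather than silently adjusting it, keeps the caller aware that the request differs from what the table allows.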

ElevenLabs Multilingual v2

Model ID: model_elevenlabs-multilingual-v2

| Input | Label | Type | Default | Min | Max | Allowed Values | Notes |
|---|---|---|---|---|---|---|---|
| text | Text | string | | | | | Required. Up to 40k characters |
| voice | Voice | select | Aria | | | "Aria", "Roger", "Sarah", "Laura", "Charlie", "George", "Callum", "River", "Liam", "Charlotte", "Alice", "Matilda", "Will", "Jessica", "Eric", "Chris", "Brian", "Daniel", "Lily", "Bill" | |
| stability | Stability | number | 0.5 | 0 | 1 | | |
| similarityBoost | Similarity Boost | number | 0.5 | 0 | 1 | | |
| style | Style Exaggeration | number | 0 | 0 | 1 | | |
| speed | Speed | number | 1 | 0.7 | 1.2 | | <1 slows; >1 speeds up |
| previousText | Previous Text | string | | | | | Optional context |
| nextText | Next Text | string | | | | | Optional context |
| languageCode | Language Code | select | "" | | | ISO 639-1 codes | |

Google Lyria 2

Model ID: model_lyria-2

| Input | Label | Type | Default | Notes |
|---|---|---|---|---|
| prompt | Prompt | string | | Required. Up to 2048 chars |
| negativePrompt | Negative Prompt | string | | Excludes elements from generation |
| seed | Seed | number | | Optional reproducible seed |
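
Because the prompt is capped at 2048 characters, long prompts need to be shortened before submission. A small sketch of one way to do that is shown here; `fit_prompt` is a hypothetical helper, and cutting at the last space is a design choice of this example, not documented API behavior.

```python
LYRIA_PROMPT_LIMIT = 2048  # from the Lyria 2 table


def fit_prompt(prompt: str, limit: int = LYRIA_PROMPT_LIMIT) -> str:
    """Hypothetical helper: shorten a prompt to at most `limit` characters."""
    prompt = prompt.strip()
    if len(prompt) <= limit:
        return prompt
    cut = prompt[:limit]
    # Prefer cutting at the last space so the final word is not split.
    head, sep, _ = cut.rpartition(" ")
    return head.rstrip() if sep else cut


lyria_params = {
    "prompt": fit_prompt("Warm analog synth pads over a slow hip-hop drum loop"),
    "negativePrompt": "vocals",
    "seed": 1234,  # reuse the same seed to make a run reproducible
}
```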

Meta MusicGen

Model ID: model_meta-musicgen

| Input | Label | Type | Default | Min | Max | Allowed Values | Notes |
|---|---|---|---|---|---|---|---|
| modelVersion | Model Version | select | stereo-melody-large | | | stereo-melody-large, stereo-large, melody-large, large | |
| prompt | Prompt | string | | | | | Required if no inputAudio |
| inputAudio | Input Audio | file | | | | | Optional conditioning |
| duration | Duration | number | 8 | 1 | 30 | | Seconds |
| continuation | Continuation | boolean | false | | | | Continues from inputAudio |
| continuationStart | Start | number | 0 | 0 | | | Start time (s) |
| continuationEnd | End | number | | 0 | | | Defaults to end |
| multiBandDiffusion | Multi Band Diffusion | boolean | false | | | | Only for non-stereo models |
| normalizationStrategy | Normalization Strategy | select | loudness | | | loudness, clip, peak, rms | |
| temperature | Temperature | number | 1 | | | | Controls diversity |
| classifierFreeGuidance | Guidance | number | 3 | 0 | 10 | | Higher = more faithful |
| seed | Seed | number | | | | | Optional RNG seed |
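
Several MusicGen inputs depend on one another: the prompt is required only when no input audio is given, continuation works from the input audio, and multi-band diffusion applies only to non-stereo model versions. The sketch below collects those rules in one check. `check_musicgen_params` is a hypothetical name, and the rule that continuation requires input audio is inferred from the note on that row rather than stated outright.

```python
def check_musicgen_params(params: dict) -> list[str]:
    """Hypothetical helper: return a list of problems; an empty list means the checks passed."""
    problems = []
    has_audio = params.get("inputAudio") is not None

    if not params.get("prompt") and not has_audio:
        problems.append("prompt is required when inputAudio is absent")

    # Inferred rule: continuation has nothing to continue from without input audio.
    if params.get("continuation", False) and not has_audio:
        problems.append("continuation needs inputAudio")

    duration = params.get("duration", 8)
    if not 1 <= duration <= 30:
        problems.append("duration must be between 1 and 30 seconds")

    version = params.get("modelVersion", "stereo-melody-large")
    if params.get("multiBandDiffusion", False) and version.startswith("stereo"):
        problems.append("multiBandDiffusion only applies to non-stereo models")

    return problems


issues = check_musicgen_params({"prompt": "lo-fi beat", "multiBandDiffusion": True})
```

Note that the default modelVersion is a stereo model, so enabling multiBandDiffusion without also choosing `melody-large` or `large` is flagged.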

Minimax

Minimax Music 01

Model ID: model_minimax-music-01

| Input | Label | Type | Default | Allowed Values | Notes |
|---|---|---|---|---|---|
| lyrics | Lyrics | string | "" | | Required. Supports newline and ## for accompaniment |
| songFile | Song File | file | | | Must be >15s |
| voiceFile | Voice File | file | | | Required if lyrics given |
| instrumentalFile | Instrumental File | file | | | Instrumental reference |
| sampleRate | Sample Rate | number | 44100 | 16000, 24000, 32000, 44100 | |
| bitrate | Bitrate | number | 256000 | 32000, 64000, 128000, 256000 | |

Minimax Music 1.5

Model ID: model_minimax-music-1.5

| Input | Label | Type | Default | Allowed Values | Notes |
|---|---|---|---|---|---|
| prompt | Prompt | string | | | Required. 10-300 characters |
| lyrics | Lyrics | string | | | 10-600 characters |
| sampleRate | Sample Rate | number | 44100 | 16000, 24000, 32000, 44100 | |
| bitrate | Bitrate | number | 256000 | 32000, 64000, 128000, 256000 | |
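
sampleRate and bitrate accept only the listed values, so an arbitrary number such as 48000 is not valid. One option, sketched here under the assumption that picking the closest allowed value is acceptable for the caller, is to snap the requested number to the nearest entry. `nearest_allowed` and `build_music_15_params` are hypothetical names.

```python
SAMPLE_RATES = (16000, 24000, 32000, 44100)
BITRATES = (32000, 64000, 128000, 256000)


def nearest_allowed(value, allowed):
    """Return the entry of `allowed` closest to `value` (first one wins on a tie)."""
    return min(allowed, key=lambda a: abs(a - value))


def build_music_15_params(prompt, lyrics=None, sample_rate=44100, bitrate=256000):
    """Hypothetical helper: check text lengths and snap the enumerated numbers."""
    if not 10 <= len(prompt) <= 300:
        raise ValueError("prompt must be 10-300 characters")
    if lyrics is not None and not 10 <= len(lyrics) <= 600:
        raise ValueError("lyrics must be 10-600 characters")

    params = {
        "prompt": prompt,
        "sampleRate": nearest_allowed(sample_rate, SAMPLE_RATES),
        "bitrate": nearest_allowed(bitrate, BITRATES),
    }
    if lyrics is not None:
        params["lyrics"] = lyrics
    return params


music_params = build_music_15_params("Upbeat synthwave with a driving bassline", sample_rate=48000)
```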

Minimax Speech 02 HD

Model ID: model_minimax-speech-02-hd

| Input | Label | Type | Default | Min | Max | Allowed Values |
|---|---|---|---|---|---|---|
| text | Text | string | | | | |
| voiceId | Voice Id | select | Wise_Woman | | | Wise_Woman, Friendly_Person, Inspirational_girl, Deep_Voice_Man, Calm_Woman, Casual_Guy, Lively_Girl, Patient_Man, Young_Knight, Determined_Man, Lovely_Girl, Decent_Boy, Imposing_Manner, Elegant_Man, Abbess, Sweet_Girl_2, Exuberant_Girl |
| speed | Speed | number | 1 | 0.5 | 2 | |
| volume | Volume | number | 1 | 0 | 10 | |
| pitch | Pitch | number | 0 | -12 | 12 | |
| emotion | Emotion | select | auto | | | auto, neutral, happy, sad, angry, fearful, disgusted, surprised |
| englishNormalization | English Normalization | boolean | false | | | |
| sampleRate | Sample Rate | number | 32000 | | | 8000, 16000, 22050, 24000, 32000, 44100 |
| bitrate | Bitrate | number | 128000 | | | 32000, 64000, 128000, 256000 |
| channel | Channel | select | mono | | | mono, stereo |
| languageBoost | Language Boost | select | Automatic | | | (list of 25 language options) |
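
The defaults in this table map naturally onto a typed record. The sketch below uses a Python dataclass whose field defaults mirror the table and whose `__post_init__` enforces the documented ranges; `SpeechParams` is a hypothetical name, not an SDK class. Since the Minimax Speech 02 Turbo section lists the same inputs, the same record could plausibly be reused for both model IDs.

```python
from dataclasses import dataclass, asdict

EMOTIONS = {"auto", "neutral", "happy", "sad", "angry", "fearful", "disgusted", "surprised"}


@dataclass
class SpeechParams:
    """Hypothetical record whose defaults mirror the Minimax Speech 02 HD table."""
    text: str
    voiceId: str = "Wise_Woman"
    speed: float = 1.0
    volume: float = 1.0
    pitch: int = 0
    emotion: str = "auto"
    englishNormalization: bool = False
    sampleRate: int = 32000
    bitrate: int = 128000
    channel: str = "mono"
    languageBoost: str = "Automatic"

    def __post_init__(self):
        if not 0.5 <= self.speed <= 2:
            raise ValueError("speed must be within [0.5, 2]")
        if not 0 <= self.volume <= 10:
            raise ValueError("volume must be within [0, 10]")
        if not -12 <= self.pitch <= 12:
            raise ValueError("pitch must be within [-12, 12]")
        if self.emotion not in EMOTIONS:
            raise ValueError(f"unknown emotion: {self.emotion}")


# asdict() yields a plain dict keyed by the parameter names in the table.
speech_params = asdict(SpeechParams(text="Welcome back.", emotion="happy", pitch=-2))
```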


Minimax Speech 02 Turbo

Model ID: model_minimax-speech-02-turbo

| Input | Label | Type | Default | Min | Max | Allowed Values |
|---|---|---|---|---|---|---|
| text | Text | string | | | | |
| voiceId | Voice Id | select | Wise_Woman | | | Wise_Woman, Friendly_Person, Inspirational_girl, Deep_Voice_Man, Calm_Woman, Casual_Guy, Lively_Girl, Patient_Man, Young_Knight, Determined_Man, Lovely_Girl, Decent_Boy, Imposing_Manner, Elegant_Man, Abbess, Sweet_Girl_2, Exuberant_Girl |
| speed | Speed | number | 1 | 0.5 | 2 | |
| volume | Volume | number | 1 | 0 | 10 | |
| pitch | Pitch | number | 0 | -12 | 12 | |
| emotion | Emotion | select | auto | | | auto, neutral, happy, sad, angry, fearful, disgusted, surprised |
| englishNormalization | English Normalization | boolean | false | | | |
| sampleRate | Sample Rate | number | 32000 | | | 8000, 16000, 22050, 24000, 32000, 44100 |
| bitrate | Bitrate | number | 128000 | | | 32000, 64000, 128000, 256000 |
| channel | Channel | select | mono | | | mono, stereo |
| languageBoost | Language Boost | select | Automatic | | | (list of 25 language options) |


MM Audio

Model ID: model_mm-audio

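| Input | Label | Type | Default | Min | Max | Notes |
|---|---|---|---|---|---|---|
| prompt | Prompt | string | | | | Required |
| video | Video | file | | | | Required |
| negativePrompt | Negative Prompt | string | | | | |
| duration | Duration | number | 8 | 1 | 30 | Seconds |
| numSteps | Steps | number | 25 | 4 | 50 | |
| cfgStrength | Guidance | number | 4.5 | 1 | 20 | Higher = closer to prompt |
| seed | Seed | number | -1 | | | -1 or blank = random |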
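
With the default seed of -1 the result cannot be reproduced later unless the seed that was actually used is known. One way around this, sketched below, is to pick the random seed on the client and send it explicitly so it can be logged. `resolve_seed` is a hypothetical helper, and the range 0 to 2**31 - 1 is an assumption; the table does not state the accepted seed range.

```python
import random


def resolve_seed(seed=None, rng=random):
    """Hypothetical helper: turn -1 or None (blank) into an explicit random seed."""
    if seed is None or seed == -1:
        # Assumed range; the reference does not document the valid seed range.
        return rng.randrange(0, 2**31)
    return seed


mm_audio_params = {
    "prompt": "footsteps on gravel",
    "duration": 8,
    "numSteps": 25,
    "cfgStrength": 4.5,
    "seed": resolve_seed(-1),  # record this value to repeat the run
}
```

The required video input is omitted from the dictionary because the way files are supplied is not covered in this reference.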


Resemble AI Chatterbox

Model ID: model_resemble-ai-chatterbox

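| Input | Label | Type | Default | Min | Max | Notes |
|---|---|---|---|---|---|---|
| prompt | Text | string | | | | Required |
| audioPrompt | Voice Cloning Reference Audio File | file | | | | Optional reference |
| exaggeration | Exaggeration | number | 0.5 | 0.25 | 2 | Controls emotion strength |
| cfgWeight | Pace Weight | number | 0.5 | 0.2 | 1 | Higher = monotone |
| temperature | Temperature | number | 0.8 | 0.05 | 5 | Randomness |
| seed | Seed | number | 0 | | | 0 = random |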
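
When values come from user-facing sliders or other untrusted input, clamping them into the documented range can be friendlier than rejecting the request. The following sketch shows that approach for the three ranged Chatterbox inputs; `clamp_chatterbox` is a hypothetical helper, and clamping is a client-side choice of this example, not something the API is documented to do.

```python
# (min, max) pairs copied from the Chatterbox table.
CHATTERBOX_RANGES = {
    "exaggeration": (0.25, 2.0),
    "cfgWeight": (0.2, 1.0),
    "temperature": (0.05, 5.0),
}


def clamp_chatterbox(params: dict) -> dict:
    """Hypothetical helper: return a copy with ranged values pulled into their limits."""
    out = dict(params)
    for name, (lo, hi) in CHATTERBOX_RANGES.items():
        if name in out:
            out[name] = max(lo, min(hi, out[name]))
    return out


clamped = clamp_chatterbox({"prompt": "Hi there", "exaggeration": 3.0, "cfgWeight": 0.1})
# exaggeration becomes 2.0 and cfgWeight becomes 0.2; prompt is left unchanged.
```

Unlike MM Audio, where -1 means random, Chatterbox uses a seed of 0 for a random result, so a reproducible run needs a nonzero seed.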