Audio Models - Parameters Reference
This document is a reference for the parameters accepted by the audio generation models in the Scenario API. Each model has a unique modelId and its own set of parameters that control the audio generation process.
For each model, the sections below give its modelId and list each parameter's type, default, allowed values or range, and a short note on what it does.
ElevenLabs
ElevenLabs V3
Model ID: model_elevenlabs-tts-v3
Input | Label | Type | Default | Min | Max | Allowed Values | Notes |
---|---|---|---|---|---|---|---|
text | Text | string | – | – | – | – | Required. Up to 40k characters |
voice | Voice | select | Aria | – | – | "Aria", "Roger", "Sarah", "Laura", "Charlie", "George", "Callum", "River", "Liam", "Charlotte", "Alice", "Matilda", "Will", "Jessica", "Eric", "Chris", "Brian", "Daniel", "Lily", "Bill" | |
stability | Stability | number | 0.5 | 0 | 1 | – | |
similarity_boost | Similarity Boost | number | 0.5 | 0 | 1 | – | |
style | Style Exaggeration | number | 0 | 0 | 1 | – | |
speed | Speed | number | 1 | 0.7 | 1.2 | – | <1 slows; >1 speeds up |
previous_text | Previous Text | string | – | – | – | – | optional context |
next_text | Next Text | string | – | – | – | – | optional context |
language_code | Language Code | select | "" | – | – | ISO 639‑1 codes | |
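The ranges in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Scenario API: `build_elevenlabs_params` is a hypothetical helper, "40k" is read as 40,000 characters, and how the resulting dict is submitted is left out because this page does not describe the request format.

```python
from typing import Any

# Voices and numeric ranges copied from the table above.
VOICES = {
    "Aria", "Roger", "Sarah", "Laura", "Charlie", "George", "Callum",
    "River", "Liam", "Charlotte", "Alice", "Matilda", "Will", "Jessica",
    "Eric", "Chris", "Brian", "Daniel", "Lily", "Bill",
}
# name -> (min, max, default)
RANGES = {
    "stability": (0.0, 1.0, 0.5),
    "similarity_boost": (0.0, 1.0, 0.5),
    "style": (0.0, 1.0, 0.0),
    "speed": (0.7, 1.2, 1.0),
}
MAX_TEXT_CHARS = 40_000  # assumption: "40k" means 40,000


def build_elevenlabs_params(text: str, voice: str = "Aria", **overrides: float) -> dict[str, Any]:
    """Hypothetical helper: build a parameter dict, raising on invalid values."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")

    params: dict[str, Any] = {"text": text, "voice": voice}
    for name, (low, high, default) in RANGES.items():
        value = overrides.pop(name, default)
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")
        params[name] = value
    if overrides:
        raise ValueError(f"unknown parameters: {sorted(overrides)}")
    return params


params = build_elevenlabs_params("Hello there.", voice="Sarah", speed=1.1)
```

Because the Multilingual v2 table lists the same inputs and ranges, the same check would apply there with only the modelId changed.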
ElevenLabs Multilingual v2
Model ID: model_elevenlabs-multilingual-v2
Input | Label | Type | Default | Min | Max | Allowed Values | Notes |
---|---|---|---|---|---|---|---|
text | Text | string | – | – | – | – | Required. Up to 40k characters |
voice | Voice | select | Aria | – | – | "Aria", "Roger", "Sarah", "Laura", "Charlie", "George", "Callum", "River", "Liam", "Charlotte", "Alice", "Matilda", "Will", "Jessica", "Eric", "Chris", "Brian", "Daniel", "Lily", "Bill" | |
stability | Stability | number | 0.5 | 0 | 1 | – | |
similarity_boost | Similarity Boost | number | 0.5 | 0 | 1 | – | |
style | Style Exaggeration | number | 0 | 0 | 1 | – | |
speed | Speed | number | 1 | 0.7 | 1.2 | – | <1 slows; >1 speeds up |
previous_text | Previous Text | string | – | – | – | – | optional context |
next_text | Next Text | string | – | – | – | – | optional context |
language_code | Language Code | select | "" | – | – | ISO 639‑1 codes | |
Google Lyria 2
Model ID: model_lyria-2
Input | Label | Type | Default | Notes |
---|---|---|---|---|
prompt | Prompt | string | – | Required. Up to 2048 chars |
negative_prompt | Negative Prompt | string | – | Excludes elements from generation |
seed | Seed | number | – | Optional reproducible seed |
Meta MusicGen
Model ID: model_meta-musicgen
Input | Label | Type | Default | Min | Max | Allowed Values | Notes |
---|---|---|---|---|---|---|---|
model_version | Model Version | select | stereo-melody-large | – | – | stereo-melody-large, stereo-large, melody-large, large | |
prompt | Prompt | string | – | – | – | – | Required if no input_audio |
input_audio | Input Audio | file | – | – | – | – | optional conditioning |
duration | Duration | number | 8 | 1 | 30 | – | seconds |
continuation | Continuation | boolean | false | – | – | – | continues from input_audio |
continuation_start | Start | number | 0 | 0 | – | – | start time (s) |
continuation_end | End | number | – | 0 | – | – | defaults to end |
multi_band_diffusion | Multi Band Diffusion | boolean | false | – | – | – | only for non-stereo models |
normalization_strategy | Normalization Strategy | select | loudness | – | – | loudness, clip, peak, rms | |
temperature | Temperature | number | 1 | – | – | – | controls diversity |
classifier_free_guidance | Guidance | number | 3 | 0 | 10 | – | higher = more faithful |
seed | Seed | number | – | – | – | – | optional RNG seed |
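Several MusicGen inputs depend on each other: prompt is only required without input_audio, and multi_band_diffusion applies only to non-stereo versions. The sketch below collects those rules into one check. `check_musicgen_params` is a hypothetical name, and the rule that continuation needs input_audio is an inference from the "continues from input_audio" note rather than a documented constraint.

```python
STEREO_VERSIONS = {"stereo-melody-large", "stereo-large"}
ALL_VERSIONS = STEREO_VERSIONS | {"melody-large", "large"}


def check_musicgen_params(params: dict) -> list[str]:
    """Hypothetical pre-flight check; returns a list of problems (empty if none)."""
    problems = []
    # Defaults below are the ones listed in the table.
    version = params.get("model_version", "stereo-melody-large")
    if version not in ALL_VERSIONS:
        problems.append(f"unknown model_version: {version}")
    if not params.get("prompt") and not params.get("input_audio"):
        problems.append("prompt is required when input_audio is absent")
    if not 1 <= params.get("duration", 8) <= 30:
        problems.append("duration must be between 1 and 30 seconds")
    if not 0 <= params.get("classifier_free_guidance", 3) <= 10:
        problems.append("classifier_free_guidance must be between 0 and 10")
    if params.get("multi_band_diffusion", False) and version in STEREO_VERSIONS:
        problems.append("multi_band_diffusion only applies to non-stereo versions")
    # Assumption: continuation is meaningless without audio to continue from.
    if params.get("continuation", False) and not params.get("input_audio"):
        problems.append("continuation needs input_audio")
    return problems


ok = check_musicgen_params({"prompt": "lo-fi beat with warm keys", "duration": 12})
bad = check_musicgen_params({"multi_band_diffusion": True})
```

Note that the default model_version is a stereo model, so enabling multi_band_diffusion without also choosing melody-large or large is flagged by this check.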
Minimax
Minimax Music 01
Model ID: model_minimax-music-01
Input | Label | Type | Default | Allowed Values | Notes |
---|---|---|---|---|---|
lyrics | Lyrics | string | "" | – | Required. Supports newlines and ## for accompaniment |
song_file | Song File | file | – | – | must be >15s |
voice_file | Voice File | file | – | – | required if lyrics given |
instrumental_file | Instrumental File | file | – | – | instrumental reference |
sample_rate | Sample Rate | number | 44100 | 16000, 24000, 32000, 44100 | |
bitrate | Bitrate | number | 256000 | 32000, 64000, 128000, 256000 | |
Minimax Music 1.5
Model ID: model_minimax-music-1.5
Input | Label | Type | Default | Allowed Values | Notes |
---|---|---|---|---|---|
prompt | Prompt | string | – | – | Required. 10-300 characters |
lyrics | Lyrics | string | – | – | 10-600 characters |
sample_rate | Sample Rate | number | 44100 | 16000, 24000, 32000, 44100 | |
bitrate | Bitrate | number | 256000 | 32000, 64000, 128000, 256000 | |
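The character limits and allowed output settings for Minimax Music 1.5 are easy to get wrong, so a small client-side guard can help. This is only a sketch: `build_music_15_params` is a hypothetical helper, and treating lyrics as optional is an assumption, since the table gives its length range but does not say whether it is required.

```python
from typing import Any, Optional

SAMPLE_RATES = {16000, 24000, 32000, 44100}
BITRATES = {32000, 64000, 128000, 256000}


def build_music_15_params(
    prompt: str,
    lyrics: Optional[str] = None,
    sample_rate: int = 44100,
    bitrate: int = 256000,
) -> dict[str, Any]:
    """Hypothetical helper enforcing the limits listed in the table."""
    if not 10 <= len(prompt) <= 300:
        raise ValueError(f"prompt must be 10-300 characters, got {len(prompt)}")
    # Assumption: lyrics may be omitted; only its length is checked when given.
    if lyrics is not None and not 10 <= len(lyrics) <= 600:
        raise ValueError(f"lyrics must be 10-600 characters, got {len(lyrics)}")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError(f"sample_rate must be one of {sorted(SAMPLE_RATES)}")
    if bitrate not in BITRATES:
        raise ValueError(f"bitrate must be one of {sorted(BITRATES)}")

    params: dict[str, Any] = {
        "prompt": prompt,
        "sample_rate": sample_rate,
        "bitrate": bitrate,
    }
    if lyrics is not None:
        params["lyrics"] = lyrics
    return params


params = build_music_15_params("Upbeat synthwave with driving bass")
```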
Minimax Speech 02 HD
Model ID: model_minimax-speech-02-hd
Input | Label | Type | Default | Min | Max | Allowed Values |
---|---|---|---|---|---|---|
text | Text | string | – | – | – | – |
voice_id | Voice Id | select | Wise_Woman | – | – | Wise_Woman, Friendly_Person, Inspirational_girl, Deep_Voice_Man, Calm_Woman, Casual_Guy, Lively_Girl, Patient_Man, Young_Knight, Determined_Man, Lovely_Girl, Decent_Boy, Imposing_Manner, Elegant_Man, Abbess, Sweet_Girl_2, Exuberant_Girl |
speed | Speed | number | 1 | 0.5 | 2 | – |
volume | Volume | number | 1 | 0 | 10 | – |
pitch | Pitch | number | 0 | -12 | 12 | – |
emotion | Emotion | select | auto | – | – | auto, neutral, happy, sad, angry, fearful, disgusted, surprised |
english_normalization | English Normalization | boolean | false | – | – | – |
sample_rate | Sample Rate | number | 32000 | – | – | 8000, 16000, 22050, 24000, 32000, 44100 |
bitrate | Bitrate | number | 128000 | – | – | 32000, 64000, 128000, 256000 |
channel | Channel | select | mono | – | – | mono, stereo |
language_boost | Language Boost | select | Automatic | – | – | (list of 25 language options) |
Minimax Speech 02 Turbo
Model ID: model_minimax-speech-02-turbo
Input | Label | Type | Default | Min | Max | Allowed Values |
---|---|---|---|---|---|---|
text | Text | string | – | – | – | – |
voice_id | Voice Id | select | Wise_Woman | – | – | Wise_Woman, Friendly_Person, Inspirational_girl, Deep_Voice_Man, Calm_Woman, Casual_Guy, Lively_Girl, Patient_Man, Young_Knight, Determined_Man, Lovely_Girl, Decent_Boy, Imposing_Manner, Elegant_Man, Abbess, Sweet_Girl_2, Exuberant_Girl |
speed | Speed | number | 1 | 0.5 | 2 | – |
volume | Volume | number | 1 | 0 | 10 | – |
pitch | Pitch | number | 0 | -12 | 12 | – |
emotion | Emotion | select | auto | – | – | auto, neutral, happy, sad, angry, fearful, disgusted, surprised |
english_normalization | English Normalization | boolean | false | – | – | – |
sample_rate | Sample Rate | number | 32000 | – | – | 8000, 16000, 22050, 24000, 32000, 44100 |
bitrate | Bitrate | number | 128000 | – | – | 32000, 64000, 128000, 256000 |
channel | Channel | select | mono | – | – | mono, stereo |
language_boost | Language Boost | select | Automatic | – | – | (list of 25 language options) |
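Both Minimax Speech 02 tables list the same inputs, so one check can cover HD and Turbo. The sketch below reports which supplied fields fall outside the documented ranges or allowed values. `invalid_speech_fields` is a hypothetical name, and voice_id and language_boost are left unchecked here (the full language list is not reproduced on this page).

```python
# name -> (min, max), from the Min and Max columns
SPEECH_RANGES = {"speed": (0.5, 2), "volume": (0, 10), "pitch": (-12, 12)}

# name -> allowed values, from the Allowed Values column
SPEECH_CHOICES = {
    "emotion": {"auto", "neutral", "happy", "sad", "angry",
                "fearful", "disgusted", "surprised"},
    "sample_rate": {8000, 16000, 22050, 24000, 32000, 44100},
    "bitrate": {32000, 64000, 128000, 256000},
    "channel": {"mono", "stereo"},
}


def invalid_speech_fields(params: dict) -> list[str]:
    """Hypothetical check: names of supplied fields that violate the tables."""
    bad = []
    for name, (low, high) in SPEECH_RANGES.items():
        if name in params and not low <= params[name] <= high:
            bad.append(name)
    for name, allowed in SPEECH_CHOICES.items():
        if name in params and params[name] not in allowed:
            bad.append(name)
    return bad


bad = invalid_speech_fields(
    {"text": "Hi", "pitch": 14, "sample_rate": 48000, "emotion": "happy"}
)
```

The speech models accept 8000 and 22050 Hz, which the Minimax music models do not list, so allowed-value sets should not be shared between the two families.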
MM Audio
Model ID: model_mm-audio
Input | Label | Type | Default | Min | Max | Notes |
---|---|---|---|---|---|---|
prompt | Prompt | string | – | – | – | required |
video | Video | file | – | – | – | required |
negative_prompt | Negative Prompt | string | – | – | – | – |
duration | Duration | number | 8 | 1 | 30 | seconds |
num_steps | Steps | number | 25 | 4 | 50 | – |
cfg_strength | Guidance | number | 4.5 | 1 | 20 | higher = closer to prompt |
seed | Seed | number | – | -1 | – | -1 or blank = random |
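For MM Audio, a seed of -1 or a blank seed both mean "random", so a client can simply omit the field unless a specific seed was requested. A minimal sketch under stated assumptions: `mm_audio_params` is a hypothetical helper, and `video` is treated as an opaque placeholder string because this page does not say how files are referenced.

```python
from typing import Any, Optional


def mm_audio_params(
    prompt: str,
    video: str,
    seed: Optional[int] = None,
    duration: float = 8,
    num_steps: int = 25,
    cfg_strength: float = 4.5,
) -> dict[str, Any]:
    """Hypothetical helper: validate ranges and drop a random seed."""
    if not prompt or not video:
        raise ValueError("prompt and video are both required")
    if not 1 <= duration <= 30:
        raise ValueError("duration must be between 1 and 30 seconds")
    if not 4 <= num_steps <= 50:
        raise ValueError("num_steps must be between 4 and 50")
    if not 1 <= cfg_strength <= 20:
        raise ValueError("cfg_strength must be between 1 and 20")
    if seed is not None and seed < -1:
        raise ValueError("seed must be -1 or greater")

    params: dict[str, Any] = {
        "prompt": prompt,
        "video": video,  # placeholder: the real file reference format is not documented here
        "duration": duration,
        "num_steps": num_steps,
        "cfg_strength": cfg_strength,
    }
    # -1 and blank both mean random, so only send a seed the caller pinned.
    if seed is not None and seed != -1:
        params["seed"] = seed
    return params


random_run = mm_audio_params("rain on a tin roof", "video-placeholder", seed=-1)
pinned_run = mm_audio_params("rain on a tin roof", "video-placeholder", seed=1234)
```

Pinning a seed is what makes two runs with identical inputs comparable; leaving it out lets each run vary.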
Resemble AI Chatterbox
Model ID: model_resemble-ai-chatterbox
Input | Label | Type | Default | Min | Max | Notes |
---|---|---|---|---|---|---|
prompt | Text | string | – | – | – | required |
audio_prompt | Voice Cloning Reference Audio File | file | – | – | – | optional reference |
exaggeration | Exaggeration | number | 0.5 | 0.25 | 2 | controls emotion strength |
cfg_weight | Pace Weight | number | 0.5 | 0.2 | 1 | higher = monotone |
temperature | Temperature | number | 0.8 | 0.05 | 5 | randomness |
seed | Seed | number | 0 | – | – | 0=random |