Speech Models - Parameters Reference
This document is a reference for the parameters accepted by the speech generation models available through the Scenario API. Each model is identified by a unique modelId and exposes its own set of parameters for controlling speech generation.
For each model, the tables below list the modelId and every accepted parameter with its type, default, minimum and maximum, allowed values, and, where available, a note on what the parameter does.
ElevenLabs
ElevenLabs V3
Model ID: model_elevenlabs-tts-v3
| Input | Label | Type | Default | Min | Max | Allowed Values | Notes |
|---|---|---|---|---|---|---|---|
| text | Text | string | – | – | – | – | Required. Up to 40k characters |
| voice | Voice | select | Aria | – | – | "Aria", "Roger", "Sarah", "Laura", "Charlie", "George", "Callum", "River", "Liam", "Charlotte", "Alice", "Matilda", "Will", "Jessica", "Eric", "Chris", "Brian", "Daniel", "Lily", "Bill" | |
| stability | Stability | number | 0.5 | 0 | 1 | – | |
| similarityBoost | Similarity Boost | number | 0.5 | 0 | 1 | – | |
| style | Style Exaggeration | number | 0 | 0 | 1 | – | |
| speed | Speed | number | 1 | 0.7 | 1.2 | – | <1 slows; >1 speeds up |
| previousText | Previous Text | string | – | – | – | – | Optional context |
| nextText | Next Text | string | – | – | – | – | Optional context |
| languageCode | Language Code | select | "" | – | – | ISO 639‑1 codes | |
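The ranges and defaults in the table above can be checked client-side before a request is sent. The following is a minimal sketch, not an official client: `build_v3_parameters` and the constant names are hypothetical, and how the resulting dictionary is submitted to the Scenario API is not covered by this page, so no request is made here.

```python
# Hypothetical helper: applies the table defaults and checks the documented ranges.
ELEVENLABS_V3_MODEL_ID = "model_elevenlabs-tts-v3"

MAX_TEXT_CHARS = 40_000  # "Up to 40k characters"

VOICES = {
    "Aria", "Roger", "Sarah", "Laura", "Charlie", "George", "Callum", "River",
    "Liam", "Charlotte", "Alice", "Matilda", "Will", "Jessica", "Eric", "Chris",
    "Brian", "Daniel", "Lily", "Bill",
}

# Defaults copied from the table; previousText and nextText have no default.
DEFAULTS = {
    "voice": "Aria",
    "stability": 0.5,
    "similarityBoost": 0.5,
    "style": 0,
    "speed": 1,
    "languageCode": "",
}

# (min, max) pairs copied from the table.
NUMERIC_RANGES = {
    "stability": (0, 1),
    "similarityBoost": (0, 1),
    "style": (0, 1),
    "speed": (0.7, 1.2),
}

OPTIONAL_KEYS = {"previousText", "nextText"}


def build_v3_parameters(text, **overrides):
    """Return a parameter dict with defaults applied, or raise ValueError."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")

    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    params = {"text": text, **DEFAULTS, **overrides}

    if params["voice"] not in VOICES:
        raise ValueError(f"unknown voice: {params['voice']!r}")
    for name, (low, high) in NUMERIC_RANGES.items():
        if not low <= params[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    return params
```

For example, `build_v3_parameters("Hello there", voice="Sarah", speed=1.1)` returns a complete dictionary, while `speed=2.0` raises a `ValueError` because the documented maximum is 1.2.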
ElevenLabs Turbo v2.5
Model ID: elevenlabs-turbo-v2-5
| Input | Label | Type | Default | Min | Max | Allowed Values | Notes |
|---|---|---|---|---|---|---|---|
| text | Text | string | – | – | – | – | Required. Text to convert to speech (max 40000 chars) |
| voice | Voice | string | Aria | – | – | Aria, Roger, Sarah, Laura, Charlie, George, Callum, River, Liam, Charlotte, Alice, Matilda, Will, Jessica, Eric, Chris, Brian, Daniel, Lily, Bill | Voice preset |
| stability | Stability | number | 0.5 | 0 | 1 | – | Controls voice stability |
| similarityBoost | Similarity Boost | number | 0.5 | 0 | 1 | – | Closeness to selected voice |
| styleExaggeration | Style Exaggeration | number | 0 | 0 | 1 | – | Boosts emotional expression |
| speed | Speed | number | 1 | 0.7 | 1.2 | – | <1 slows, >1 speeds up |
| previousText | Previous Text | string | – | – | – | – | Optional. Helps continuity across multi-part generation (max 10000 chars) |
| nextText | Next Text | string | – | – | – | – | Optional. Helps continuity (max 10000 chars) |
| languageCode | Language Code | string | "" | – | – | "" (auto), en, ca, es, fr, de, it, ja, ko, zh, ru, ar, hi, bn, pa, ta, te, mr, ur, fa, tr, nl, sv, da, no, fi, el, ro, hu, cs, sk, sl, pt, id, th, vi, ms, tl, yo, ig, ha, am, az, be, bg, hr | Forces language for synthesis |
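This table states explicit length limits and an explicit list of language codes, so both can be checked before submission. The sketch below is illustrative only: `check_turbo_text_fields` is a hypothetical name, and the function reports problems rather than calling any API.

```python
# Hypothetical pre-flight check for the text-related fields of Turbo v2.5.
TURBO_V2_5_MODEL_ID = "elevenlabs-turbo-v2-5"

MAX_TEXT_CHARS = 40_000     # limit stated for text
MAX_CONTEXT_CHARS = 10_000  # limit stated for previousText and nextText

# Allowed languageCode values from the table; "" means automatic detection.
LANGUAGE_CODES = frozenset(
    "en ca es fr de it ja ko zh ru ar hi bn pa ta te mr ur fa tr nl sv da no "
    "fi el ro hu cs sk sl pt id th vi ms tl yo ig ha am az be bg hr".split()
) | {""}


def check_turbo_text_fields(params):
    """Return a list of problems; an empty list means the fields look valid."""
    problems = []

    text = params.get("text", "")
    if not text:
        problems.append("text is required")
    elif len(text) > MAX_TEXT_CHARS:
        problems.append(f"text exceeds {MAX_TEXT_CHARS} characters")

    for key in ("previousText", "nextText"):
        if len(params.get(key, "")) > MAX_CONTEXT_CHARS:
            problems.append(f"{key} exceeds {MAX_CONTEXT_CHARS} characters")

    code = params.get("languageCode", "")
    if code not in LANGUAGE_CODES:
        problems.append(f"unsupported languageCode: {code!r}")

    return problems
```

Returning a list instead of raising on the first failure lets a caller show every problem at once, which is a design choice of this sketch rather than anything the API requires.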
ElevenLabs Multilingual v2
Model ID: model_elevenlabs-multilingual-v2
| Input | Label | Type | Default | Min | Max | Allowed Values | Notes |
|---|---|---|---|---|---|---|---|
| text | Text | string | – | – | – | – | Required. Up to 40k characters |
| voice | Voice | select | Aria | – | – | "Aria", "Roger", "Sarah", "Laura", "Charlie", "George", "Callum", "River", "Liam", "Charlotte", "Alice", "Matilda", "Will", "Jessica", "Eric", "Chris", "Brian", "Daniel", "Lily", "Bill" | |
| stability | Stability | number | 0.5 | 0 | 1 | – | |
| similarityBoost | Similarity Boost | number | 0.5 | 0 | 1 | – | |
| style | Style Exaggeration | number | 0 | 0 | 1 | – | |
| speed | Speed | number | 1 | 0.7 | 1.2 | – | <1 slows; >1 speeds up |
| previousText | Previous Text | string | – | – | – | – | Optional context |
| nextText | Next Text | string | – | – | – | – | Optional context |
| languageCode | Language Code | select | "" | – | – | ISO 639‑1 codes | |
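The previousText and nextText parameters are listed here only as optional context; the Turbo v2.5 table describes them as helping continuity across multi-part generation. Assuming they serve the same purpose for this model, a long script that has already been split into segments could pass each segment's neighbours as context. The helper name `with_context` is hypothetical, and using the immediately adjacent segments is one possible choice, not a documented requirement.

```python
# Hypothetical helper: one parameter dict per segment, neighbours as context.
MULTILINGUAL_V2_MODEL_ID = "model_elevenlabs-multilingual-v2"


def with_context(segments, **shared):
    """Attach previousText/nextText to each segment from its neighbours.

    `shared` holds parameters applied to every segment (voice, speed, ...).
    """
    requests = []
    last = len(segments) - 1
    for i, segment in enumerate(segments):
        params = {"text": segment, **shared}
        if i > 0:
            params["previousText"] = segments[i - 1]
        if i < last:
            params["nextText"] = segments[i + 1]
        requests.append(params)
    return requests


parts = with_context(
    ["The storm arrived at dusk.", "By midnight the road was gone.", "We waited."],
    voice="Sarah",
)
```

The first segment carries no previousText and the last carries no nextText, so the optional keys are simply omitted at the boundaries.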
Minimax
Minimax Speech 2.6 HD
Model ID: model_minimax-speech-2-6-hd
| Input | Label | Type | Default | Min | Max | Allowed Values |
|---|---|---|---|---|---|---|
| text | Text | string | – | – | – | – |
| voiceId | Voice Id | select | Wise_Woman | – | – | Wise_Woman, Friendly_Person, Inspirational_girl, Deep_Voice_Man, Calm_Woman, Casual_Guy, Lively_Girl, Patient_Man, Young_Knight, Determined_Man, Lovely_Girl, Decent_Boy, Imposing_Manner, Elegant_Man, Abbess, Sweet_Girl_2, Exuberant_Girl |
| speed | Speed | number | 1 | 0.5 | 2 | – |
| volume | Volume | number | 1 | 0 | 10 | – |
| pitch | Pitch | number | 0 | -12 | 12 | – |
| emotion | Emotion | select | auto | – | – | auto, neutral, happy, sad, angry, fearful, disgusted, surprised |
| englishNormalization | English Normalization | boolean | false | – | – | – |
| sampleRate | Sample Rate | number | 32000 | – | – | 8000, 16000, 22050, 24000, 32000, 44100 |
| bitrate | Bitrate | number | 128000 | – | – | 32000, 64000, 128000, 256000 |
| channel | Channel | select | mono | – | – | mono, stereo |
| languageBoost | Language Boost | select | Automatic | – | – | (list of 25 language options) |
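Several Minimax parameters accept only a fixed set of values (for instance sampleRate and bitrate), while others are bounded by a minimum and maximum. A minimal validation sketch follows; `validate_minimax` is a hypothetical name, and voiceId and languageBoost are deliberately left unchecked because the full languageBoost list is not given on this page.

```python
# Hypothetical validator built from the ranges and allowed values in the table.
MINIMAX_HD_MODEL_ID = "model_minimax-speech-2-6-hd"

# (min, max) pairs copied from the table.
RANGES = {
    "speed": (0.5, 2),
    "volume": (0, 10),
    "pitch": (-12, 12),
}

# Enumerated values copied from the table.
CHOICES = {
    "emotion": {"auto", "neutral", "happy", "sad", "angry",
                "fearful", "disgusted", "surprised"},
    "sampleRate": {8000, 16000, 22050, 24000, 32000, 44100},
    "bitrate": {32000, 64000, 128000, 256000},
    "channel": {"mono", "stereo"},
}


def validate_minimax(params):
    """Return {parameter: message} for every supplied value that is invalid."""
    errors = {}
    for name, (low, high) in RANGES.items():
        if name in params and not low <= params[name] <= high:
            errors[name] = f"must be between {low} and {high}"
    for name, allowed in CHOICES.items():
        if name in params and params[name] not in allowed:
            errors[name] = f"must be one of {sorted(allowed)}"
    return errors
```

Note that 48000 is not among the listed sample rates, so `validate_minimax({"sampleRate": 48000})` reports an error even though it is a common audio rate elsewhere.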
Minimax Speech 2.6 Turbo
Model ID: model_minimax-speech-2-6-turbo
| Input | Label | Type | Default | Min | Max | Allowed Values |
|---|---|---|---|---|---|---|
| text | Text | string | – | – | – | – |
| voiceId | Voice Id | select | Wise_Woman | – | – | Wise_Woman, Friendly_Person, Inspirational_girl, Deep_Voice_Man, Calm_Woman, Casual_Guy, Lively_Girl, Patient_Man, Young_Knight, Determined_Man, Lovely_Girl, Decent_Boy, Imposing_Manner, Elegant_Man, Abbess, Sweet_Girl_2, Exuberant_Girl |
| speed | Speed | number | 1 | 0.5 | 2 | – |
| volume | Volume | number | 1 | 0 | 10 | – |
| pitch | Pitch | number | 0 | -12 | 12 | – |
| emotion | Emotion | select | auto | – | – | auto, neutral, happy, sad, angry, fearful, disgusted, surprised |
| englishNormalization | English Normalization | boolean | false | – | – | – |
| sampleRate | Sample Rate | number | 32000 | – | – | 8000, 16000, 22050, 24000, 32000, 44100 |
| bitrate | Bitrate | number | 128000 | – | – | 32000, 64000, 128000, 256000 |
| channel | Channel | select | mono | – | – | mono, stereo |
| languageBoost | Language Boost | select | Automatic | – | – | (list of 25 language options) |
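As listed on this page, the HD and Turbo variants accept the same parameters with the same defaults, so one settings dictionary can be reused and only the model ID swapped. The sketch below illustrates that idea; `minimax_job` and the `(model_id, params)` return shape are hypothetical conveniences, not part of the Scenario API.

```python
# Hypothetical helper: same parameters, different Minimax model ID.
MINIMAX_MODEL_IDS = {
    "hd": "model_minimax-speech-2-6-hd",
    "turbo": "model_minimax-speech-2-6-turbo",
}

# Defaults copied from the tables (identical for both variants on this page).
MINIMAX_DEFAULTS = {
    "voiceId": "Wise_Woman",
    "speed": 1,
    "volume": 1,
    "pitch": 0,
    "emotion": "auto",
    "englishNormalization": False,
    "sampleRate": 32000,
    "bitrate": 128000,
    "channel": "mono",
    "languageBoost": "Automatic",
}


def minimax_job(variant, text, **overrides):
    """Return (model_id, params) for the 'hd' or 'turbo' variant."""
    model_id = MINIMAX_MODEL_IDS[variant]  # KeyError for unknown variants
    params = {"text": text, **MINIMAX_DEFAULTS, **overrides}
    return model_id, params


hd_id, hd_params = minimax_job("hd", "Welcome back.", emotion="happy")
turbo_id, turbo_params = minimax_job("turbo", "Welcome back.", emotion="happy")
```

Here `hd_params` and `turbo_params` are equal dictionaries, and only the model IDs differ.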