Academia

This page is auto-generated from model configurations. Last updated: 2026-03-13.

This reference lists all available Academia audio generation models and their parameters. Use these parameter names when calling the Generation API.


Lux TTS

High-quality voice-cloning text-to-speech (TTS) that produces 48 kHz audio from text and a reference audio clip.

Model ID: model_lux-tts

Capabilities: txt2audio

LLM Markdown: https://app.scenario.com/api/models/model_lux-tts/markdown

| Parameter | Type | Required | Default | Min | Max | Allowed Values | Description |
|---|---|---|---|---|---|---|---|
| prompt | string | Yes | - | - | - | - | Text to convert to speech. |
| audio | file | Yes | - | - | - | - | Reference audio for voice cloning. |
| guidanceScale | number | No | 3 | 0 | 10 | - | Higher values increase adherence to the reference voice. |
| numInferenceSteps | number | No | 4 | 1 | 16 | - | Number of flow-matching inference steps. |
| maxRefLength | number | No | 5 | 1 | 15 | - | Maximum reference audio duration used for voice encoding (seconds). |
| seed | number | No | - | 0 | 2147483647 | - | Seed for reproducible outputs. |
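As a rough illustration of how the Lux TTS parameters fit together, the Python sketch below assembles a parameter dictionary and checks it against the ranges listed above. The helper `lux_tts_errors`, the `LUX_TTS_RANGES` table, and the placeholder string passed as `audio` are assumptions made for this example. This page does not specify how the reference clip is supplied or how the request is sent, so neither is shown.

```python
from typing import Any, Dict, List

# Numeric ranges as read from the Lux TTS table: name -> (min, max).
LUX_TTS_RANGES = {
    "guidanceScale": (0, 10),
    "numInferenceSteps": (1, 16),
    "maxRefLength": (1, 15),
    "seed": (0, 2147483647),
}
LUX_TTS_REQUIRED = ("prompt", "audio")


def lux_tts_errors(params: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in params (empty if none). Hypothetical helper."""
    errors = [f"missing required parameter: {name}"
              for name in LUX_TTS_REQUIRED if not params.get(name)]
    for name, (low, high) in LUX_TTS_RANGES.items():
        if name in params and not low <= params[name] <= high:
            errors.append(f"{name}={params[name]} is outside [{low}, {high}]")
    return errors


# The "audio" value is a placeholder; the real format of a file reference
# is not described on this page.
params = {
    "prompt": "Hello from a cloned voice.",
    "audio": "reference-clip-placeholder",
    "guidanceScale": 3,
    "numInferenceSteps": 4,
    "seed": 42,
}
print(lux_tts_errors(params))  # no problems expected for in-range values
print(lux_tts_errors({"prompt": "Hi", "numInferenceSteps": 32}))
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once.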

MM Audio 2 Text To Audio

MMAudio generates synchronized audio given text inputs. It can generate sounds described by a prompt.

Model ID: model_mm-audio-2-t2a

Capabilities: txt2audio

LLM Markdown: https://app.scenario.com/api/models/model_mm-audio-2-t2a/markdown

| Parameter | Type | Required | Default | Min | Max | Allowed Values | Description |
|---|---|---|---|---|---|---|---|
| prompt | string | Yes | - | - | - | - | Text prompt for the generated audio. |
| negativePrompt | string | No | - | - | - | - | Negative prompt to avoid certain sounds. |
| duration | number | No | 8 | 1 | 30 | - | Output duration in seconds. |
| numSteps | number | No | 25 | 4 | 50 | - | Number of steps used to generate the audio. |
| cfgStrength | number | No | 4.5 | 1 | 20 | - | Higher values keep the output closer to the prompt. |
| maskAwayClip | boolean | No | false | - | - | - | Mask away certain sounds in the audio. |
| seed | number | No | - | 0 | 65535 | - | Random seed for reproducible generation. |
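When values come from user input, one option is to clamp them into the documented ranges before sending a request. The sketch below does that for MM Audio 2 Text To Audio. `clamp_mm_audio_params` and `MM_AUDIO_RANGES` are hypothetical names, and whether the API itself rejects or adjusts out-of-range values is not stated on this page.

```python
from typing import Any, Dict

# Ranges as read from the MM Audio 2 table: name -> (min, max).
MM_AUDIO_RANGES = {
    "duration": (1, 30),
    "numSteps": (4, 50),
    "cfgStrength": (1, 20),
    "seed": (0, 65535),
}


def clamp_mm_audio_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params with numeric values pulled into range. Hypothetical helper."""
    clamped = dict(params)
    for name, (low, high) in MM_AUDIO_RANGES.items():
        if name in clamped:
            clamped[name] = max(low, min(high, clamped[name]))
    return clamped


request_params = clamp_mm_audio_params({
    "prompt": "Rain on a tin roof with distant thunder",
    "negativePrompt": "speech, music",
    "duration": 45,       # above the maximum of 30
    "numSteps": 2,        # below the minimum of 4
    "cfgStrength": 4.5,   # already in range, left unchanged
})
print(request_params)
```

Clamping silently changes what the caller asked for, so an application that needs strict behavior may prefer to reject out-of-range values instead.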

Tada 1B Text to Speech

Lighter Tada voice cloning text-to-speech variant with multilingual support.

Model ID: model_tada-1b-text-to-speech

Capabilities: txt2audio

LLM Markdown: https://app.scenario.com/api/models/model_tada-1b-text-to-speech/markdown

| Parameter | Type | Required | Default | Min | Max | Allowed Values | Description |
|---|---|---|---|---|---|---|---|
| audio | file | Yes | - | - | - | - | Reference audio for voice cloning. |
| prompt | string | Yes | - | - | - | - | Text to synthesize with the reference voice. |
| transcript | string | No | - | - | - | - | Transcript of the reference audio. Required for non-English references. |
| language | string | No | en | - | - | en, ar, ch, de, es, fr, it, ja, pl, pt | Language used for text alignment. |
| numInferenceSteps | number | No | 20 | 1 | 50 | - | Number of ODE solver steps for acoustic generation. |
| speedUpFactor | number | No | 1 | 0.5 | 2 | - | Values > 1 speed up and values < 1 slow down speech. |
| temperature | number | No | 0.6 | 0 | 2 | - | Sampling temperature for text token generation. |
| topP | number | No | 0.9 | 0 | 1 | - | Top-p nucleus sampling value. |
| repetitionPenalty | number | No | 1.1 | 1 | 2 | - | Penalty applied to repeated tokens. |
| acousticCfgScale | number | No | 1.6 | 0 | 10 | - | Classifier-free guidance scale for acoustic generation. |
| noiseTemperature | number | No | 0.9 | 0 | 2 | - | Temperature for diffusion noise during flow matching. |
| numExtraSteps | number | No | 0 | 0 | 50 | - | Additional autoregressive steps for continuation. |
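A caller may want to see the effective settings for a request, including the defaults that apply when an optional parameter is omitted. The sketch below merges caller overrides onto the Tada 1B defaults listed above and rejects a `language` outside the allowed list. `TADA_DEFAULTS`, `TADA_LANGUAGES`, and `tada_params` are hypothetical names, and the `audio` value is a placeholder because the file format is not described on this page.

```python
from typing import Any, Dict

# Defaults and allowed languages as read from the Tada 1B table.
TADA_DEFAULTS: Dict[str, Any] = {
    "language": "en",
    "numInferenceSteps": 20,
    "speedUpFactor": 1,
    "temperature": 0.6,
    "topP": 0.9,
    "repetitionPenalty": 1.1,
    "acousticCfgScale": 1.6,
    "noiseTemperature": 0.9,
    "numExtraSteps": 0,
}
TADA_LANGUAGES = ("en", "ar", "ch", "de", "es", "fr", "it", "ja", "pl", "pt")


def tada_params(audio: str, prompt: str, **overrides: Any) -> Dict[str, Any]:
    """Merge overrides onto the documented defaults. Hypothetical helper."""
    unknown = set(overrides) - set(TADA_DEFAULTS) - {"transcript"}
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    params = {"audio": audio, "prompt": prompt, **TADA_DEFAULTS, **overrides}
    if params["language"] not in TADA_LANGUAGES:
        raise ValueError(f"unsupported language: {params['language']}")
    return params


params = tada_params(
    "reference-clip-placeholder",  # placeholder for the reference audio
    "Good morning, and welcome back.",
    speedUpFactor=1.25,            # slightly faster speech
    temperature=0.4,               # lower sampling temperature
)
print(params["speedUpFactor"], params["topP"])
```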

Tada 3B Text to Speech

Voice cloning text-to-speech with multilingual alignment and expressive controls.

Model ID: model_tada-3b-text-to-speech

Capabilities: txt2audio

LLM Markdown: https://app.scenario.com/api/models/model_tada-3b-text-to-speech/markdown

| Parameter | Type | Required | Default | Min | Max | Allowed Values | Description |
|---|---|---|---|---|---|---|---|
| audio | file | Yes | - | - | - | - | Reference audio for voice cloning. |
| prompt | string | Yes | - | - | - | - | Text to synthesize with the reference voice. |
| transcript | string | No | - | - | - | - | Transcript of the reference audio. Required for non-English references. |
| language | string | No | en | - | - | en, ar, ch, de, es, fr, it, ja, pl, pt | Language used for text alignment. |
| numInferenceSteps | number | No | 20 | 1 | 50 | - | Number of ODE solver steps for acoustic generation. |
| speedUpFactor | number | No | 1 | 0.5 | 2 | - | Values > 1 speed up and values < 1 slow down speech. |
| temperature | number | No | 0.6 | 0 | 2 | - | Sampling temperature for text token generation. |
| topP | number | No | 0.9 | 0 | 1 | - | Top-p nucleus sampling value. |
| repetitionPenalty | number | No | 1.1 | 1 | 2 | - | Penalty applied to repeated tokens. |
| acousticCfgScale | number | No | 1.6 | 0 | 10 | - | Classifier-free guidance scale for acoustic generation. |
| noiseTemperature | number | No | 0.9 | 0 | 2 | - | Temperature for diffusion noise during flow matching. |
| numExtraSteps | number | No | 0 | 0 | 50 | - | Additional autoregressive steps for continuation. |
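Tada 1B and Tada 3B list the same parameters, so one parameter dictionary can be paired with either model ID. The sketch below picks the model ID and enforces the note that a transcript is required for non-English references. It assumes that `language` matches the language spoken in the reference clip, which this page does not state. `TADA_MODEL_IDS`, `tada_request`, the variant labels, and the shape of the returned dictionary are all hypothetical.

```python
from typing import Any, Dict, Optional

# Model IDs copied from this page, keyed by a hypothetical variant label.
TADA_MODEL_IDS = {
    "1b": "model_tada-1b-text-to-speech",
    "3b": "model_tada-3b-text-to-speech",
}


def tada_request(variant: str, audio: str, prompt: str,
                 language: str = "en",
                 transcript: Optional[str] = None) -> Dict[str, Any]:
    """Pair a model ID with its parameters. The return shape is illustrative only."""
    if variant not in TADA_MODEL_IDS:
        raise ValueError(f"unknown variant: {variant}")
    # Assumption: `language` reflects the language spoken in the reference clip.
    if language != "en" and not transcript:
        raise ValueError("transcript is required for non-English references")
    params: Dict[str, Any] = {"audio": audio, "prompt": prompt, "language": language}
    if transcript:
        params["transcript"] = transcript
    return {"modelId": TADA_MODEL_IDS[variant], "parameters": params}


request = tada_request(
    "3b",
    "reference-clip-placeholder",  # placeholder for the reference audio
    "Bonjour et bienvenue.",
    language="fr",
    transcript="Ceci est un enregistrement de reference.",
)
print(request["modelId"])
```

Switching to the lighter variant only requires passing `"1b"` as the first argument; the parameters stay the same.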