Google

This page is auto-generated from model configurations. Last updated: 2026-03-26.

This reference lists all available Google audio generation models and their parameters. Use these parameter names when calling the Generation API.


Gemini 2.5 Flash TTS

Convert text to natural-sounding speech using Google's Gemini 2.5 Flash model with multiple voice presets.

Model ID: model_google-gemini-2-5-flash-tts

Capabilities: txt2audio

LLM Markdown: https://app.scenario.com/api/models/model_google-gemini-2-5-flash-tts/markdown

| Parameter | Type | Required | Default | Min | Max | Allowed Values | Description |
| --- | --- | --- | --- | --- | --- | --- | --- |
| text | string | Yes | - | - | - | - | Text to convert to speech |
| voice | string | No | Puck | - | - | Achernar, Achird, Algenib, Algieba, Alnilam, Aoede, Autonoe, Callirrhoe, Charon, Despina, Enceladus, Erinome, Fenrir, Gacrux, Iapetus, Kore, Laomedeia, Leda, Orus, Pulcherrima, Puck, Rasalgethi, Sadachbia, Sadaltager, Schedar, Sulafat, Umbriel, Vindemiatrix, Zephyr, Zubenelgenubi | Voice preset to use for speech synthesis |
| language | string | No | en-US | - | - | ar-EG, bn-BD, de-DE, en-IN, en-US, es-US, fr-FR, hi-IN, id-ID, it-IT, ja-JP, ko-KR, mr-IN, nl-NL, pl-PL, pt-BR, ro-RO, ru-RU, ta-IN, te-IN, th-TH, tr-TR, uk-UA, vi-VN | Language for speech synthesis (auto-detected if not specified) |
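
The allowed values above can be checked on the client before a request is sent. The following is a minimal sketch, assuming the parameters are passed as a flat dictionary; the helper name `build_tts_params` is hypothetical, and the endpoint, authentication, and request envelope of the Generation API are not covered on this page and are deliberately left out.

```python
# Allowed values copied from the parameter table for the Gemini 2.5 TTS models.
GEMINI_TTS_VOICES = frozenset(
    "Achernar, Achird, Algenib, Algieba, Alnilam, Aoede, Autonoe, Callirrhoe, "
    "Charon, Despina, Enceladus, Erinome, Fenrir, Gacrux, Iapetus, Kore, "
    "Laomedeia, Leda, Orus, Pulcherrima, Puck, Rasalgethi, Sadachbia, "
    "Sadaltager, Schedar, Sulafat, Umbriel, Vindemiatrix, Zephyr, "
    "Zubenelgenubi".split(", ")
)
GEMINI_TTS_LANGUAGES = frozenset(
    "ar-EG, bn-BD, de-DE, en-IN, en-US, es-US, fr-FR, hi-IN, id-ID, it-IT, "
    "ja-JP, ko-KR, mr-IN, nl-NL, pl-PL, pt-BR, ro-RO, ru-RU, ta-IN, te-IN, "
    "th-TH, tr-TR, uk-UA, vi-VN".split(", ")
)


def build_tts_params(text, voice="Puck", language=None):
    """Hypothetical helper: validate inputs and return a TTS parameter dict."""
    if not text:
        raise ValueError("text is required")
    if voice not in GEMINI_TTS_VOICES:
        raise ValueError(f"unsupported voice: {voice!r}")
    params = {"text": text, "voice": voice}
    # language is optional; it is only included when the caller sets it.
    if language is not None:
        if language not in GEMINI_TTS_LANGUAGES:
            raise ValueError(f"unsupported language: {language!r}")
        params["language"] = language
    return params


if __name__ == "__main__":
    print(build_tts_params("Welcome back!", voice="Kore", language="en-US"))
```

Because both TTS models on this page list the same parameters, the same dictionary should be usable with either model ID.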

Gemini 2.5 Pro TTS

Convert text to natural-sounding speech using Google's Gemini 2.5 Pro model with multiple voice presets.

Model ID: model_google-gemini-2-5-pro-tts

Capabilities: txt2audio

LLM Markdown: https://app.scenario.com/api/models/model_google-gemini-2-5-pro-tts/markdown

| Parameter | Type | Required | Default | Min | Max | Allowed Values | Description |
| --- | --- | --- | --- | --- | --- | --- | --- |
| text | string | Yes | - | - | - | - | Text to convert to speech |
| voice | string | No | Puck | - | - | Achernar, Achird, Algenib, Algieba, Alnilam, Aoede, Autonoe, Callirrhoe, Charon, Despina, Enceladus, Erinome, Fenrir, Gacrux, Iapetus, Kore, Laomedeia, Leda, Orus, Pulcherrima, Puck, Rasalgethi, Sadachbia, Sadaltager, Schedar, Sulafat, Umbriel, Vindemiatrix, Zephyr, Zubenelgenubi | Voice preset to use for speech synthesis |
| language | string | No | en-US | - | - | ar-EG, bn-BD, de-DE, en-IN, en-US, es-US, fr-FR, hi-IN, id-ID, it-IT, ja-JP, ko-KR, mr-IN, nl-NL, pl-PL, pt-BR, ro-RO, ru-RU, ta-IN, te-IN, th-TH, tr-TR, uk-UA, vi-VN | Language for speech synthesis (auto-detected if not specified) |

Google Lyria 2

Model ID: model_lyria-2

Capabilities: txt2audio

LLM Markdown: https://app.scenario.com/api/models/model_lyria-2/markdown

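| Parameter | Type | Required | Default | Min | Max | Allowed Values | Description |
| --- | --- | --- | --- | --- | --- | --- | --- |
| prompt | string | Yes | - | - | - | - | Text prompt for audio generation |
| negativePrompt | string | No | - | - | - | - | Description of what to exclude from the generated audio |
| seed | number | No | - | - | - | - | Use a seed for reproducible results. Leave blank to use a random seed. |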
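
As an illustration of how the optional parameters interact, here is a minimal sketch that builds a Lyria 2 parameter dictionary and only includes `negativePrompt` and `seed` when they are set. The helper name `build_lyria2_params` is hypothetical, and the way the dictionary is submitted to the Generation API is an assumption not described on this page.

```python
def build_lyria2_params(prompt, negative_prompt=None, seed=None):
    """Hypothetical helper: return a Lyria 2 parameter dict from the table's fields."""
    if not prompt:
        raise ValueError("prompt is required")
    params = {"prompt": prompt}
    if negative_prompt:
        params["negativePrompt"] = negative_prompt
    # Omitting seed corresponds to "leave blank" in the table (random seed).
    if seed is not None:
        params["seed"] = seed
    return params


if __name__ == "__main__":
    # Reusing the same seed with the same prompt is what the table
    # describes as the route to reproducible results.
    print(build_lyria2_params(
        "Warm lo-fi piano loop with vinyl crackle",
        negative_prompt="vocals, heavy drums",
        seed=1234,
    ))
```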

Google Lyria 3 Clip

Fast 30-second music clips with vocals and optional image-to-music. MP3 output.

Model ID: model_lyria-3-clip

Capabilities: txt2audio

LLM Markdown: https://app.scenario.com/api/models/model_lyria-3-clip/markdown

| Parameter | Type | Required | Default | Min | Max | Allowed Values | Description |
| --- | --- | --- | --- | --- | --- | --- | --- |
| prompt | string | No | - | - | - | - | Describe the music: genre, instruments, tempo (BPM), key, mood, and structure tags like [Verse], [Chorus], [Bridge]. |
| images | file_array | No | - | - | - | - | Optional reference images (up to 10). |
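
The prompt description lists several elements (genre, instruments, tempo, key, mood, structure tags). One way to assemble them consistently is sketched below. The helper `compose_music_prompt` and the exact phrasing it produces are assumptions for illustration; the page only states which elements to describe, not a required layout.

```python
def compose_music_prompt(genre, instruments, bpm, key, mood, sections=()):
    """Hypothetical helper: join the elements named in the prompt description.

    sections is a sequence of (tag, description) pairs, for example
    ("Verse", "soft vocals"), rendered as "[Verse] soft vocals".
    """
    header = f"{genre}, {', '.join(instruments)}, {bpm} BPM, {key}, {mood} mood."
    body = " ".join(f"[{tag}] {desc}" for tag, desc in sections)
    return f"{header} {body}".strip()


if __name__ == "__main__":
    print(compose_music_prompt(
        "Synthwave",
        ["analog synth", "drum machine"],
        110,
        "A minor",
        "nostalgic",
        [("Verse", "soft vocals"), ("Chorus", "full arrangement")],
    ))
```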

Google Lyria 3 Pro

Full-length song generation (up to ~2 minutes) with structure, vocals, and optional image-to-music. MP3 or WAV output.

Model ID: model_lyria-3-pro

Capabilities: txt2audio

LLM Markdown: https://app.scenario.com/api/models/model_lyria-3-pro/markdown

| Parameter | Type | Required | Default | Min | Max | Allowed Values | Description |
| --- | --- | --- | --- | --- | --- | --- | --- |
| prompt | string | No | - | - | - | - | Describe the music: genre, instruments, tempo (BPM), key, mood, duration, and structure tags like [Verse], [Chorus], [Bridge]. |
| images | file_array | No | - | - | - | - | Optional reference images (up to 10). |
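
The 10-image limit can be enforced before submitting a request. The sketch below assumes reference images are passed as a list of opaque string references; how files are actually uploaded or referenced is not specified on this page, so both the helper name `build_lyria3_params` and the string placeholders are hypothetical.

```python
MAX_REFERENCE_IMAGES = 10  # "up to 10" per the images row in the table


def build_lyria3_params(prompt=None, images=None):
    """Hypothetical helper: return a Lyria 3 parameter dict, enforcing the image cap."""
    images = list(images or [])
    if len(images) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"at most {MAX_REFERENCE_IMAGES} reference images are allowed, "
            f"got {len(images)}"
        )
    params = {}
    # Both fields are listed as optional, so each is included only when set.
    if prompt:
        params["prompt"] = prompt
    if images:
        params["images"] = images
    return params


if __name__ == "__main__":
    print(build_lyria3_params(
        prompt="Cinematic orchestral piece, 2 minutes, [Verse] strings [Chorus] full brass",
        images=["reference-image-1", "reference-image-2"],  # placeholder references
    ))
```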