Audio Generation

The Scenario API can generate audio as well as visual content. Audio generation usually takes longer than image generation and runs through a flexible "custom" endpoint, so you retrieve the final results by polling a job. This guide explains how to start an audio generation, monitor its progress, and retrieve the completed assets.

For specialized generation tasks like audio, the Scenario API provides a custom endpoint. You select the model that produces the output by passing its identifier in the request path.

POST https://api.cloud.scenario.com/v1/generate/custom/{modelId} - API Reference

Here, {modelId} is the identifier of the audio generation model you want to use (e.g., model_elevenlabs-tts-v3 for the ElevenLabs v3 audio model). Available model IDs are listed here: Audio Models - Parameters Reference
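As a minimal sketch of how the request URL is assembled, the helper below joins the base URL, the model ID, and an optional projectId query parameter (the same parameter the curl example in this guide passes). The function name build_generate_url is hypothetical and not part of any Scenario SDK.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://api.cloud.scenario.com/v1"


def build_generate_url(model_id: str, project_id: Optional[str] = None) -> str:
    """Return the custom-generation URL for a model ID (hypothetical helper)."""
    # Keep "/" unescaped so IDs such as "minimax/music-01" stay in the same
    # form the curl example in this guide uses.
    url = f"{BASE_URL}/generate/custom/{quote(model_id, safe='/')}"
    if project_id:
        url += "?" + urlencode({"projectId": project_id})
    return url
```

Calling build_generate_url("model_elevenlabs-tts-v3", "yourprojectid") yields the endpoint URL with the model ID in the path and projectId in the query string.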

The payload for the custom endpoint varies depending on the modelId. Common parameters for audio models include:

| Parameter | Type | Description |
| --- | --- | --- |
| lyrics | string | A textual description or lyrics to guide the audio generation. |
| songFile | file (audio) | An audio file to be used as a reference for the generated song. |
| voiceFile | file (audio) | An audio file to be used as a voice reference for vocal generation. |
| instrumentalFile | file (audio) | An audio file to be used as an instrumental reference for track generation. |
| sampleRate | number | The sample rate for the generated audio. |
| bitrate | number | The bitrate for the generated audio. |
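The sketch below shows one way to assemble a JSON payload from the text and numeric parameters in the table, leaving out any option that was not set. The helper name build_audio_payload is hypothetical. The file parameters are omitted because this guide does not describe how audio files are passed, and which parameters a given model accepts should be checked in its parameters reference.

```python
import json
from typing import Any, Dict, Optional


def build_audio_payload(
    lyrics: str,
    sample_rate: Optional[int] = None,
    bitrate: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a request body, skipping options that were not provided."""
    payload: Dict[str, Any] = {"lyrics": lyrics}
    if sample_rate is not None:
        payload["sampleRate"] = sample_rate
    if bitrate is not None:
        payload["bitrate"] = bitrate
    return payload


# Serialize the payload as the JSON body of the POST request.
body = json.dumps(build_audio_payload("Hello, world!", sample_rate=44100, bitrate=256000))
```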

Job Polling: Retrieving Asynchronous Results

Unlike instant image generation, audio generation is asynchronous. A request to the custom endpoint returns a jobId immediately, while the actual generation runs in the background. You then poll a separate endpoint periodically to check the status and retrieve the final asset.

GET https://api.cloud.scenario.com/v1/jobs/{jobId} - API Reference
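For readers calling the endpoint without an SDK, here is a minimal sketch that builds the polling request with Python's standard library. It assumes the jobs endpoint accepts the same HTTP Basic credentials (API key and secret) that the curl example in this guide sends with -u. The helper name build_job_request is hypothetical.

```python
import base64
import urllib.request


def build_job_request(job_id: str, api_key: str, api_secret: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a job's status."""
    # HTTP Basic auth: base64("key:secret"), which is what curl -u produces.
    token = base64.b64encode(f"{api_key}:{api_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"https://api.cloud.scenario.com/v1/jobs/{job_id}",
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )
```

The returned request can be passed to urllib.request.urlopen, and the response body decoded with json.load.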

The response from the polling endpoint contains the current status of your job. Continue polling until the status field reaches a terminal value: success, failed, or canceled.

{
  "job": {
    "jobId": "job_abc123def456",
    "status": "processing", // Can be "queued", "processing", "success", "failed", "canceled"
    "progress": 0.5, // Optional: progress as a fraction between 0 and 1
    "metadata": {
      "assetIds": []
    }
  },
  "creativeUnitsCost": 5
}

Once the status is success, the metadata.assetIds field contains the IDs of your generated audio assets.
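The polling flow described above can be sketched as a small loop with a timeout. This is not an SDK function: wait_for_job and its fetch_job argument are hypothetical, and fetch_job stands for any callable that returns the parsed JSON of the jobs endpoint. The status names come from the sample response. "failure" is also treated as terminal because the Python example in this guide checks for that value.

```python
import time
from typing import Any, Callable, Dict, List

# Statuses after which the job will not change any more.
TERMINAL_STATUSES = {"success", "failed", "failure", "canceled"}


def wait_for_job(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 3.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until the job finishes and return its asset IDs."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)["job"]
        status = job["status"]
        if status == "success":
            return job.get("metadata", {}).get("assetIds", [])
        if status in TERMINAL_STATUSES:
            raise RuntimeError(f"Job {job_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} still {status!r} after {timeout} seconds")
        sleep(interval)
```

Injecting fetch_job and sleep keeps the loop independent of any HTTP client and makes it easy to exercise with canned responses.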

Code Examples: Audio Generation with Minimax Music 01

This example demonstrates initiating an audio generation and then polling for the result using the minimax/music-01 model.

curl -X POST \
  -u "YOUR_API_KEY:YOUR_API_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "lyrics": "## (Verse 1) ##\nHello, world!\nThis is my first song.\n## (Chorus) ##\nSinging loud and clear,\nFor everyone to hear.",
    "sampleRate": 44100,
    "bitrate": 256000
  }' \
  "https://api.cloud.scenario.com/v1/generate/custom/minimax/music-01?projectId=yourprojectid"

# Python example using the scenario_sdk client
from scenario_sdk import Scenario
import time

client = Scenario(
    api_key="YOUR_API_KEY",
    api_secret="YOUR_API_SECRET",
)

# Step 1: Initiate Audio Generation
custom_audio_model_id = "minimax/music-01"
print("Initiating audio generation...")
response = client.generate.run_model(
    custom_audio_model_id,
    body={
        "lyrics": "## (Verse 1) ##\nHello, world!\nThis is my first song.\n## (Chorus) ##\nSinging loud and clear,\nFor everyone to hear.",
        "sampleRate": 44100,
        "bitrate": 256000,
    },
)
job_id = response.job.job_id
print(f"Audio generation job initiated. Job ID: {job_id}")

# Step 2: Poll for Job Status (or use response.job.wait() for a simpler approach)
# Both "failed" (as in the sample response) and "failure" are treated as terminal.
failure_statuses = ["failed", "failure", "canceled"]
status = "queued"
while status != "success" and status not in failure_statuses:
    print(f"Polling job {job_id}... Current status: {status}")
    time.sleep(3)
    poll = client.jobs.retrieve(job_id)
    status = poll.job.status
    progress = (poll.job.progress or 0) * 100
    print(f"Progress: {progress:.2f}%")

if status == "success":
    asset_ids = poll.job.metadata.get("assetIds", [])
    print(f"Audio generation completed! Asset IDs: {asset_ids}")
elif status in failure_statuses:
    print(f"Audio generation failed or canceled: {poll.job.error}")

// JavaScript example using the @scenario-labs/sdk client
import Scenario from '@scenario-labs/sdk';

const client = new Scenario({
  apiKey: 'YOUR_API_KEY',
  apiSecret: 'YOUR_API_SECRET',
});

async function generateAudio() {
  const customAudioModelId = 'minimax/music-01';
  console.log('Initiating audio generation...');

  // Step 1: Initiate Audio Generation
  const response = await client.generate.runModel(customAudioModelId, {
    body: {
      lyrics:
        '## (Verse 1) ##\nHello, world!\nThis is my first song.\n## (Chorus) ##\nSinging loud and clear,\nFor everyone to hear.',
      sampleRate: 44100,
      bitrate: 256000,
    },
  });
  const jobId = response.job.jobId;
  console.log(`Audio generation job initiated. Job ID: ${jobId}`);

  // Step 2: Wait for completion using the built-in .wait() helper
  const completed = await response.job.wait();
  if (completed.status === 'success') {
    const assetIds = completed.metadata?.assetIds || [];
    console.log('Audio generation completed! Asset IDs:', assetIds);
  } else {
    console.error(`Audio generation ended with status: ${completed.status}`);
  }
}

generateAudio();
