Audio Generation
The Scenario API extends beyond visual content to audio generation. Audio jobs often take longer to process and run through a flexible “custom” endpoint, so the final results are retrieved by polling a job. This guide explains how to start an audio generation, monitor its progress, and retrieve the completed assets.
The Custom Endpoint
For specialized generation tasks like audio, the Scenario API provides a versatile custom endpoint. This endpoint lets you interact with specific models tailored for these complex outputs.
POST https://api.cloud.scenario.com/v1/generate/custom/{modelId} - API Reference
Here, {modelId} is the identifier of the audio generation model you wish to use (e.g., model_elevenlabs-tts-v3 for the ElevenLabs v3 audio model). Available model IDs are listed in the Audio Models - Parameters Reference.
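As a minimal sketch of how the endpoint URL is assembled, the helper below joins the base URL with a model ID. The name `build_custom_url` is hypothetical and not part of any SDK. The optional `projectId` query parameter is taken from the cURL example later in this guide; whether your account requires it is an assumption to verify.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.cloud.scenario.com/v1"


def build_custom_url(model_id, project_id=None):
    """Return the custom generation URL for model_id (hypothetical helper)."""
    if not model_id:
        raise ValueError("model_id must be a non-empty string")
    # Keep "/" unescaped because some model IDs in this guide contain one
    # (for example "minimax/music-01"); escape any other unsafe character.
    url = f"{BASE_URL}/generate/custom/{quote(model_id, safe='/')}"
    if project_id:
        # projectId as a query parameter mirrors the cURL example (assumption).
        url += "?" + urlencode({"projectId": project_id})
    return url


print(build_custom_url("model_elevenlabs-tts-v3"))
print(build_custom_url("minimax/music-01", project_id="yourprojectid"))
```

Building the URL in one place keeps the model ID escaping consistent across requests.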
Request Body for Custom Generation
The payload for the custom endpoint varies with the modelId. For audio models, common parameters include:
| Parameter | Type | Description |
|---|---|---|
| `lyrics` | string | A textual description or lyrics to guide the audio generation. |
| `songFile` | file (audio) | An audio file to be used as a reference for the generated song. |
| `voiceFile` | file (audio) | An audio file to be used as a voice reference for vocal generation. |
| `instrumentalFile` | file (audio) | An audio file to be used as an instrumental reference for track generation. |
| `sampleRate` | number | The sample rate for the generated audio. |
| `bitrate` | number | The bitrate for the generated audio. |
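The scalar parameters above can be collected into a request body with a small helper. This is a sketch under stated assumptions: `build_audio_payload` is a hypothetical name, `sampleRate` and `bitrate` are treated as integers even though the table only says "number", and the three file parameters are left out because this guide does not describe how audio files are encoded in the request.

```python
import json

# Scalar parameters from the table above, with the Python type assumed for each.
# songFile, voiceFile, and instrumentalFile are intentionally not handled here.
SCALAR_PARAMS = {"lyrics": str, "sampleRate": int, "bitrate": int}


def build_audio_payload(**params):
    """Validate scalar parameters and drop those set to None (hypothetical helper)."""
    payload = {}
    for name, value in params.items():
        if name not in SCALAR_PARAMS:
            raise KeyError(f"unsupported parameter: {name}")
        if value is None:
            continue  # omit unset optional parameters from the body
        expected = SCALAR_PARAMS[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError(f"{name} must be {expected.__name__}")
        payload[name] = value
    return payload


body = build_audio_payload(lyrics="Hello, world!", sampleRate=44100, bitrate=None)
print(json.dumps(body))
```

Rejecting unknown keys early surfaces typos such as `samplerate` before a request is sent. Which parameters a given model actually accepts depends on its modelId.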
Job Polling: Retrieving Asynchronous Results
Unlike instant image generation, audio generation is an asynchronous process. When you make a request to the custom endpoint, you receive a jobId immediately, but the actual generation happens in the background. You then need to poll a separate endpoint periodically to check the status and retrieve the final asset.
GET https://api.cloud.scenario.com/v1/jobs/{jobId} - API Reference
The response from the polling endpoint contains the current status of your job. Continue polling until the status field reports a terminal value: success, failed, or canceled.
```json
{
  "job": {
    "jobId": "job_abc123def456",
    "status": "processing",
    "progress": 0.5,
    "metadata": {
      "assetIds": []
    }
  },
  "creativeUnitsCost": 5
}
```

The status field can be queued, processing, success, failed, or canceled. The progress field is optional and reports completion as a fraction (the Python example in this guide multiplies it by 100 to display a percentage). Once the status is success, the metadata.assetIds field contains the IDs of your generated audio assets.
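The polling flow can be sketched independently of any HTTP client. In the example below, `wait_for_job` and `asset_ids` are hypothetical helpers, and `fetch_job` stands for whatever function you use to call GET /v1/jobs/{jobId} and parse the JSON response. The interval and timeout defaults are arbitrary choices, and the asset ID in the canned response is made up for illustration.

```python
import time

# "success", "failed", and "canceled" come from the job response in this guide.
# "failure" is also accepted because the Python example in this guide checks
# for that spelling.
TERMINAL_STATUSES = {"success", "failed", "failure", "canceled"}


def wait_for_job(fetch_job, job_id, interval=3.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_job(job_id) until the job reaches a terminal status.

    fetch_job is any callable that returns the parsed polling response as a
    dict. Returns the final "job" object, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)["job"]
        if job["status"] in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)


def asset_ids(job):
    """Return the generated asset IDs, or an empty list if the job did not succeed."""
    if job.get("status") != "success":
        return []
    return job.get("metadata", {}).get("assetIds", [])


# Stand-in fetcher that replays canned responses instead of calling the API.
responses = iter([
    {"job": {"jobId": "job_abc123def456", "status": "queued", "metadata": {"assetIds": []}}},
    {"job": {"jobId": "job_abc123def456", "status": "success",
             "metadata": {"assetIds": ["asset_example"]}}},
])
final = wait_for_job(lambda _id: next(responses), "job_abc123def456", sleep=lambda s: None)
print(final["status"], asset_ids(final))
```

Passing the fetcher and the sleep function as arguments keeps the loop testable without network access, and the timeout guards against a job that never reaches a terminal status.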
Code Examples: Audio Generation with Minimax Music 01
This example starts an audio generation and then polls for the result using the minimax/music-01 model.
Initial Request (cURL)
```bash
# The URL is quoted so the shell does not try to interpret the "?" character.
curl -X POST \
  -u "YOUR_API_KEY:YOUR_API_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "lyrics": "## (Verse 1) ##\nHello, world!\nThis is my first song.\n## (Chorus) ##\nSinging loud and clear,\nFor everyone to hear.",
    "sampleRate": 44100,
    "bitrate": 256000
  }' \
  "https://api.cloud.scenario.com/v1/generate/custom/minimax/music-01?projectId=yourprojectid"
```

Python
```python
import time

from scenario_sdk import Scenario

client = Scenario(
    api_key="YOUR_API_KEY",
    api_secret="YOUR_API_SECRET",
)

# Step 1: Initiate audio generation
custom_audio_model_id = "minimax/music-01"

print("Initiating audio generation...")
response = client.generate.run_model(
    custom_audio_model_id,
    body={
        "lyrics": "## (Verse 1) ##\nHello, world!\nThis is my first song.\n## (Chorus) ##\nSinging loud and clear,\nFor everyone to hear.",
        "sampleRate": 44100,
        "bitrate": 256000,
    },
)

job_id = response.job.job_id
print(f"Audio generation job initiated. Job ID: {job_id}")

# Step 2: Poll for job status (or use response.job.wait() for a simpler approach).
# Accept both "failed" (as listed in the job response) and "failure" so the
# loop always ends once the job reaches a terminal status.
terminal_statuses = ["success", "failed", "failure", "canceled"]
status = "queued"
while status not in terminal_statuses:
    print(f"Polling job {job_id}... Current status: {status}")
    time.sleep(3)

    poll = client.jobs.retrieve(job_id)
    status = poll.job.status
    progress = (poll.job.progress or 0) * 100
    print(f"Progress: {progress:.2f}%")

# Step 3: Report the outcome
if status == "success":
    asset_ids = poll.job.metadata.get("assetIds", [])
    print(f"Audio generation completed! Asset IDs: {asset_ids}")
else:
    print(f"Audio generation failed or canceled: {poll.job.error}")
```

Node.js
```javascript
import Scenario from '@scenario-labs/sdk';

const client = new Scenario({
  apiKey: 'YOUR_API_KEY',
  apiSecret: 'YOUR_API_SECRET',
});

async function generateAudio() {
  const customAudioModelId = 'minimax/music-01';

  console.log('Initiating audio generation...');

  // Step 1: Initiate Audio Generation
  const response = await client.generate.runModel(customAudioModelId, {
    body: {
      lyrics: '## (Verse 1) ##\nHello, world!\nThis is my first song.\n## (Chorus) ##\nSinging loud and clear,\nFor everyone to hear.',
      sampleRate: 44100,
      bitrate: 256000,
    },
  });

  const jobId = response.job.jobId;
  console.log(`Audio generation job initiated. Job ID: ${jobId}`);

  // Step 2: Wait for completion using the built-in .wait() helper
  const completed = await response.job.wait();

  if (completed.status === 'success') {
    const assetIds = completed.metadata?.assetIds || [];
    console.log('Audio generation completed! Asset IDs:', assetIds);
  } else {
    console.error(`Audio generation ended with status: ${completed.status}`);
  }
}

generateAudio();
```