Audio Generation

The Scenario API can generate audio as well as visual content. Audio generation usually takes longer than image generation, so it runs through a flexible "custom" endpoint and returns its results asynchronously: you submit a request, then poll a job until it finishes. This guide explains how to start an audio generation, monitor its progress, and retrieve the completed assets.

The Custom Endpoint

For specialized generation tasks such as audio, the Scenario API provides a custom endpoint that lets you call the models built for those outputs.

POST https://api.cloud.scenario.com/v1/generate/custom/{modelId} - API Reference

Here, {modelId} identifies the audio generation model you want to use (e.g., model_music-01 for the Minimax Music 01 audio model). Available model IDs are listed in the Audio Models - Parameters Reference.
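As a minimal sketch of how the endpoint URL can be assembled, the helper below fills in the {modelId} placeholder and appends a projectId query parameter, as the code examples later in this guide do. The function name build_custom_generation_url and the BASE_URL constant are hypothetical names chosen for illustration; they are not part of the Scenario API.

```python
from urllib.parse import urlencode

# Base URL taken from the endpoint shown above.
BASE_URL = "https://api.cloud.scenario.com/v1"


def build_custom_generation_url(model_id: str, project_id: str) -> str:
    """Return the custom generation URL for a model (hypothetical helper)."""
    # The model ID is placed in the path as-is, matching the endpoint template.
    query = urlencode({"projectId": project_id})
    return f"{BASE_URL}/generate/custom/{model_id}?{query}"


print(build_custom_generation_url("model_music-01", "yourprojectid"))
```

Building the query string with urlencode keeps the URL valid if a project ID ever contains characters that need escaping.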

Request Body for Custom Generation

The payload for the custom endpoint varies with the modelId. For audio models, common parameters include:

Parameter         Type          Description
lyrics            string        A textual description or lyrics to guide the audio generation.
songFile          file (audio)  An audio file to be used as a reference for the generated song.
voiceFile         file (audio)  An audio file to be used as a voice reference for vocal generation.
instrumentalFile  file (audio)  An audio file to be used as an instrumental reference for track generation.
sampleRate        number        The sample rate for the generated audio.
bitrate           number        The bitrate for the generated audio.

Note: To find the correct list of parameters for a specific model, you can refer to the Audio Models - Parameters Reference or consult the API documentation.
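The sketch below shows one way to assemble a request body from the text and numeric parameters listed above. The helper name build_audio_payload is hypothetical, and leaving unset optional fields out of the body (rather than sending null values) is an assumption about what the endpoint prefers, not documented behavior. File parameters are left out because how they are supplied depends on the model.

```python
from typing import Any, Dict, Optional


def build_audio_payload(
    lyrics: str,
    sample_rate: Optional[int] = None,
    bitrate: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a JSON-serializable request body (hypothetical helper)."""
    candidates = {"lyrics": lyrics, "sampleRate": sample_rate, "bitrate": bitrate}
    # Assumption: parameters that were not provided are omitted instead of sent as null.
    return {key: value for key, value in candidates.items() if value is not None}


payload = build_audio_payload("Hello, world!", sample_rate=44100, bitrate=256000)
print(payload)
```

The resulting dictionary can be passed directly as the json argument of requests.post.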

Job Polling: Retrieving Asynchronous Results

Unlike instant image generation, audio generation is an asynchronous process. This means that when you make a request to the custom endpoint, you will receive a jobId immediately, but the actual generation will happen in the background. You then need to periodically poll a separate endpoint to check the status and retrieve the final asset.

GET https://api.cloud.scenario.com/v1/jobs/{jobId} - API Reference

The response from the polling endpoint contains the current status of your job. Continue polling until the status field reaches a terminal value: success, failed, or canceled.

{
  "job": {
    "jobId": "job_abc123def456",
    "status": "processing", // Can be "queued", "processing", "success", "failed", "canceled"
    "progress": 0.5, // Optional: progress as a fraction between 0 and 1
    "metadata": {
      "assetIds": []
    }
  },
  "creativeUnitsCost": 5
}

Once the status is success, the metadata.assetIds field contains the IDs of your generated audio assets.
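The polling flow described above can be sketched as a small reusable helper with a timeout, so that a stuck job does not keep a script running forever. The names wait_for_job, asset_ids_from, and TERMINAL_STATUSES are hypothetical, the 600-second default timeout is an arbitrary choice, and the HTTP call is passed in as a function so that the sketch runs here against canned responses instead of the real API.

```python
import time
from typing import Any, Callable, Dict, List

# "failed" comes from the sample response above; "failure" is also accepted
# as a precaution in case the API reports that spelling.
TERMINAL_STATUSES = {"success", "failed", "failure", "canceled"}


def wait_for_job(
    fetch_job: Callable[[], Dict[str, Any]],
    interval_seconds: float = 3.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_job until the job is in a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = fetch_job().get("job") or {}
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job.get('status')!r} after {timeout_seconds}s")
        sleep(interval_seconds)


def asset_ids_from(job: Dict[str, Any]) -> List[str]:
    """Return metadata.assetIds for a successful job; raise for any other status."""
    if job.get("status") != "success":
        raise RuntimeError(f"job ended with status {job.get('status')!r}")
    return list((job.get("metadata") or {}).get("assetIds") or [])


# Demonstration with canned responses in place of real HTTP calls.
responses = iter([
    {"job": {"status": "queued"}},
    {"job": {"status": "processing", "progress": 0.5}},
    {"job": {"status": "success", "metadata": {"assetIds": ["asset_example"]}}},
])
final_job = wait_for_job(lambda: next(responses), sleep=lambda _: None)
print(asset_ids_from(final_job))
```

In real use, fetch_job would wrap the GET request to the jobs endpoint, for example a function that calls requests.get on the polling URL with your credentials and returns the parsed JSON.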

Code Examples: Audio Generation with Minimax Music 01

This example demonstrates initiating an audio generation and then polling for the result using the minimax/music-01 model.

Initial Request (cURL)

curl -X POST \
  -u "YOUR_API_KEY:YOUR_API_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "lyrics": "## (Verse 1) ##\nHello, world!\nThis is my first song.\n## (Chorus) ##\nSinging loud and clear,\nFor everyone to hear.",
    "sampleRate": 44100,
    "bitrate": 256000
  }' \
  "https://api.cloud.scenario.com/v1/generate/custom/minimax/music-01?projectId=yourprojectid"

Python

import requests
import time

api_key = "YOUR_API_KEY"
api_secret = "YOUR_API_SECRET"
project_id = "yourprojectid"

# Step 1: Initiate Audio Generation
custom_audio_model_id = "minimax/music-01"
initial_url = f"https://api.cloud.scenario.com/v1/generate/custom/{custom_audio_model_id}?projectId={project_id}"
headers = {"Content-Type": "application/json"}

payload = {
    "lyrics": "## (Verse 1) ##\nHello, world!\nThis is my first song.\n## (Chorus) ##\nSinging loud and clear,\nFor everyone to hear.",
    "sampleRate": 44100,
    "bitrate": 256000
}

print("Initiating audio generation...")
initial_response = requests.post(initial_url, headers=headers, json=payload, auth=(api_key, api_secret))

if initial_response.status_code == 200:
    initial_data = initial_response.json()
    # Guard against a missing "job" object so the error branch below is reached.
    job_id = (initial_data.get("job") or {}).get("jobId")
    if job_id:
        print(f"Audio generation job initiated. Job ID: {job_id}")

        # Step 2: Poll for Job Status
        polling_url = f"https://api.cloud.scenario.com/v1/jobs/{job_id}"
        status = "queued"
        # The sample response lists "failed"; "failure" is also accepted as a precaution.
        while status not in ["success", "failed", "failure", "canceled"]:
            print(f"Polling job {job_id}... Current status: {status}")
            time.sleep(3) # Wait for 3 seconds before polling again

            polling_response = requests.get(polling_url, auth=(api_key, api_secret))
            if polling_response.status_code == 200:
                polling_data = polling_response.json()
                status = polling_data.get("job").get("status")
                progress = polling_data.get("job").get("progress", 0) * 100
                print(f"Progress: {progress:.2f}%")

                if status == "success":
                    asset_ids = polling_data.get("job").get("metadata").get("assetIds", [])
                    print(f"Audio generation completed! Asset IDs: {asset_ids}")
                elif status in ["failed", "failure", "canceled"]:
                    print(f"Audio generation failed or canceled: {polling_data.get('job').get('error')}")
            else:
                print(f"Error polling job status: {polling_response.status_code} - {polling_response.text}")
                break
    else:
        print("Error: No jobId returned in the initial response.")
else:
    print(f"Error initiating audio generation: {initial_response.status_code} - {initial_response.text}")

Node.js

// Requires node-fetch v2 (CommonJS). On Node.js 18+, remove this line to use the built-in fetch.
const fetch = require("node-fetch");

const apiKey = "YOUR_API_KEY";
const apiSecret = "YOUR_API_SECRET";
const projectId = "yourprojectid";

const credentials = Buffer.from(`${apiKey}:${apiSecret}`).toString("base64");

async function generateAudio() {
  const customAudioModelId = "minimax/music-01";
  const initialUrl = `https://api.cloud.scenario.com/v1/generate/custom/${customAudioModelId}?projectId=${projectId}`;
  const headers = {
    "Content-Type": "application/json",
    Authorization: `Basic ${credentials}`,
  };

  const payload = {
    lyrics: "## (Verse 1) ##\nHello, world!\nThis is my first song.\n## (Chorus) ##\nSinging loud and clear,\nFor everyone to hear.",
    sampleRate: 44100,
    bitrate: 256000,
  };

  console.log("Initiating audio generation...");
  try {
    const initialResponse = await fetch(initialUrl, {
      method: "POST",
      headers: headers,
      body: JSON.stringify(payload),
    });

    const initialData = await initialResponse.json();

    if (initialResponse.ok) {
      const jobId = initialData.job.jobId;
      if (jobId) {
        console.log(`Audio generation job initiated. Job ID: ${jobId}`);

        const pollingUrl = `https://api.cloud.scenario.com/v1/jobs/${jobId}`;
        let status = "queued";

        // The sample response lists "failed"; "failure" is also accepted as a precaution.
        while (!["success", "failed", "failure", "canceled"].includes(status)) {
          console.log(`Polling job ${jobId}... Current status: ${status}`);
          await new Promise(resolve => setTimeout(resolve, 3000)); // Wait for 3 seconds

          const pollingResponse = await fetch(pollingUrl, {
            headers: { Authorization: `Basic ${credentials}` },
          });
          const pollingData = await pollingResponse.json();

          if (pollingResponse.ok) {
            status = pollingData.job.status;
            const progress = (pollingData.job.progress || 0) * 100;
            console.log(`Progress: ${progress.toFixed(2)}%`);

            if (status === "success") {
              const assetIds = pollingData.job.metadata.assetIds || [];
              console.log("Audio generation completed! Asset IDs:", assetIds);
            } else if (["failed", "failure", "canceled"].includes(status)) {
              console.error(`Audio generation failed or canceled: ${pollingData.job.error}`);
            }
          } else {
            console.error(`Error polling job status: ${pollingResponse.status} - ${JSON.stringify(pollingData)}`);
            break;
          }
        }
      } else {
        console.error("Error: No jobId returned in the initial response.");
      }
    } else {
      console.error(`Error initiating audio generation: ${initialResponse.status} - ${JSON.stringify(initialData)}`);
    }
  } catch (error) {
    console.error("An unexpected error occurred:", error);
  }
}

generateAudio();