Edit Images with Prompts
The "Edit with Prompts" feature in the Scenario API empowers you to modify images using natural language instructions. This advanced capability leverages state-of-the-art AI models to interpret your textual descriptions and apply the desired changes directly to your images. This guide will walk you through the process of using the "Edit with Prompts" API, detailing its capabilities, the underlying AI models, necessary parameters, and providing code examples to help you integrate this powerful editing tool into your applications.
🚀 Key Concepts
- Prompt-Based Editing: The core idea is to describe the desired image modifications using clear, concise text prompts rather than traditional graphical editing tools.
- AI Models for Editing: The Scenario API utilizes various specialized AI models, such as GPT-Image, Gemini 2.0 Flash, Flux.1 Kontext, and Runway Gen-4. Each model has unique strengths and is suited for different types of edits:
  - Gemini 2.0 Flash: Ideal for precise edits, adjusting small details, or swapping specific elements while largely preserving the original image's integrity.
  - GPT-Image: Excellent for more creative or broad transformations, such as changing character proportions or restyling an entire scene. Note that it might sometimes cause the original style to drift.
  - Flux.1 Kontext: Offers a good balance, with less tendency for style drift compared to GPT-Image. It's faster and more cost-effective, though it might not handle highly complex transformations as effectively.
  - Runway Gen-4: Generally superior in quality and consistency, especially when combining multiple input images. It excels at producing coherent and polished results with minimal style drift.
- Reference Image: The primary image you wish to modify. This can be an existing asset in your Scenario workspace or a new image provided as a Data URL.
- Additional Reference Images: Supplementary images that guide the AI model, influencing aspects like style, composition, or content. The number of additional images supported varies by the chosen AI model.
- Masking: For highly controlled edits, you can provide a mask that specifies the exact areas of the image to modify. Black areas of the mask indicate regions to be replaced, while filled areas are preserved. Masks are only available for the gpt-image-1 model and are ignored by other models (see the mask sketch just after this list).
- Asynchronous Processing: Image editing with prompts is an asynchronous operation. When you initiate an edit, the API returns a jobId, which you then use to poll for the status and results of your editing task.
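As a concrete illustration of the masking convention, the sketch below builds a simple mask with Pillow and encodes it as a Data URL. The image size and rectangle coordinates are placeholders for your own values; adapt them to your source image.

```python
import base64
from io import BytesIO

from PIL import Image, ImageDraw  # pip install Pillow

# Start from a fully "filled" (preserved) white mask, sized to match the source image.
mask = Image.new("RGB", (1024, 1024), "white")

# Paint the region to be replaced in black, per the gpt-image-1 mask convention.
draw = ImageDraw.Draw(mask)
draw.rectangle((256, 256, 768, 768), fill="black")

# Encode the mask as a Data URL so it can be sent in the "mask" request field.
buffer = BytesIO()
mask.save(buffer, format="PNG")
mask_data_url = "data:image/png;base64," + base64.b64encode(buffer.getvalue()).decode()
```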
⚡️ Editing Workflow
The general workflow for editing an image with prompts involves the following steps:
- Prepare Your Image and Prompt: Identify the image you want to edit and formulate a clear text prompt describing the desired changes.
- Initiate Prompt-Based Editing: Make an API request to the dedicated endpoint, providing your image, prompt, and any additional parameters.
- Monitor Job Status: Periodically check the status of your editing job until it is complete.
- Retrieve Edited Image: Once the job is successful, retrieve the modified image.
Let's explore each step in detail.
1. Initiate Prompt-Based Editing
To begin editing an image with a prompt, make a POST request to the /v1/generate/prompt-editing endpoint. In the request body, specify the image to be edited and your text prompt, plus optional reference images and parameters to fine-tune the editing process.
Endpoint:
POST https://api.cloud.scenario.com/v1/generate/prompt-editing
Request Body Parameters:
| Parameter | Type | Description | Required |
|---|---|---|---|
| image | string | The image to edit. This can be an existing AssetId (e.g., "asset_GTrL3mq4SXWyMxkOHRxlpw") or a Data URL (e.g., "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVQYV2NgYAAAAAMAAWgmWQ0AAAAASUVORK5CYII="). | Yes |
| prompt | string | The natural language instruction describing the desired edits to the image. Be as specific and clear as possible. | Yes |
| modelId | string | The AI model to use for editing. Options include "gemini-2.0-flash", "gpt-image-1", "flux-kontext", or "runway-gen4-image". Defaults to "gemini-2.0-flash". | No |
| referenceImages | array of strings | A list of additional reference images. These can be Data URLs or AssetIds. The number of allowed images depends on the modelId selected (e.g., 5 for Gemini/GPT-Image, 3 for Flux-Kontext, 2 for Runway Gen-4). | No |
| mask | string | A mask image (as an AssetId or Data URL) indicating the specific areas to edit. Black areas will be replaced, while filled areas are kept. Only available for the gpt-image-1 model. | No |
| numSamples | number | The number of variations of the edited image to generate. The maximum depends on your subscription tier. | No |
| aspectRatio | string | The aspect ratio of the generated image(s). Options include "auto", "1:1", "3:2", "2:3", "4:3", "3:4", "9:16", "16:9". Defaults to "auto". Availability varies by modelId. | No |
| quality | string | The quality of the generated image(s). Options include "high" and "standard". Defaults to "high". Availability varies by modelId. | No |
| inputFidelity | string | When set to "high", preserves image details, which is ideal for faces or logos. The first image receives the finest textures, so place key elements there. Only available for the gpt-image-1 model. | No |
| seed | number | A seed value for the random number generator to ensure reproducibility. Only available for the flux-kontext model. | No |
| guidanceScale | number | Controls how closely the generated image adheres to the prompt. Only available for the flux-kontext model. | No |
| format | string | The output format of the generated image(s). Options include "png", "jpeg", "webp". Defaults to "png". Only available for the gpt-image-1 model. | No |
| compression | number | The compression level (0-100%) for the generated images. Only available for the gpt-image-1 model with webp or jpeg output formats. Defaults to 100. | No |
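Several of these parameters are model-specific. As a hedged illustration, a flux-kontext request that pins the seed and guidance scale for reproducible edits might use a payload like the one below; the asset ID, prompt, and numeric values are placeholders.

```python
# Hypothetical payload illustrating the flux-kontext-only parameters.
payload = {
    "image": "asset_your_image_id",  # placeholder AssetId or Data URL
    "prompt": "Replace the wooden door with a stained-glass door",
    "modelId": "flux-kontext",
    "seed": 42,           # placeholder seed for reproducible output (flux-kontext only)
    "guidanceScale": 5,   # placeholder value for prompt adherence (flux-kontext only)
}
```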
Example Request (Python):
import base64

import requests

api_key = "YOUR_API_KEY"
api_secret = "YOUR_API_SECRET"

url = "https://api.cloud.scenario.com/v1/generate/prompt-editing"
# Scenario uses HTTP Basic authentication with your API key and secret.
auth_token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
headers = {"Authorization": f"Basic {auth_token}"}

payload = {
    "image": "asset_your_image_id",  # Replace with your image AssetId or Data URL
    "prompt": "Change the background to a futuristic cityscape at night",
    "modelId": "gpt-image-1",  # Or "gemini-2.0-flash", "flux-kontext", "runway-gen4-image"
    "numSamples": 1,
    "aspectRatio": "16:9"
}

response = requests.post(url, headers=headers, json=payload)
if response.status_code == 200:
    job_data = response.json()
    job_id = job_data["job"]["jobId"]
    print(f"Image editing job launched successfully! Job ID: {job_id}")
else:
    print(f"Error launching image editing job: {response.status_code} - {response.text}")
Upon successful initiation, the API will return a jobId. You will use this jobId to monitor the progress of your image editing task.
2. Monitor Job Status
Similar to other asynchronous operations in the Scenario API, you need to poll the API to check the status of your image editing job. Make GET requests to the /v1/jobs/{jobId} endpoint until the job reaches a final status (e.g., success, failure, or canceled).
Endpoint:
GET https://api.cloud.scenario.com/v1/jobs/{jobId}
Path Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
jobId | string | The ID of the image editing job. | Yes |
Example Request (Python):
import base64
import time

import requests

# Assuming job_id is obtained from the editing initiation step
# job_id = "your_job_id"
api_key = "YOUR_API_KEY"
api_secret = "YOUR_API_SECRET"

url = f"https://api.cloud.scenario.com/v1/jobs/{job_id}"
auth_token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
headers = {"Authorization": f"Basic {auth_token}"}

while True:
    response = requests.get(url, headers=headers)
    response.raise_for_status()  # Raise an exception for HTTP errors
    data = response.json()
    status = data["job"]["status"]
    print(f"Job status: {status}")
    if status == "success":
        asset_ids = data["job"]["metadata"].get("assetIds", [])
        print(f"Image editing complete. Edited Asset IDs: {asset_ids}")
        break
    elif status in ["failure", "canceled"]:
        raise Exception(f"Image editing job ended with status: {status}")
    time.sleep(3)  # Poll every 3 seconds
3. Retrieve Edited Image
Once the job status is success, the metadata field of the job response will contain the assetIds of the newly generated (edited) images. You can then use these assetIds to retrieve the actual image data, for example by calling the GET /v1/assets/{assetId} endpoint if available, or by constructing a direct download URL if the API provides one.
Example of successful job response (relevant part):
{
  "job": {
    "jobId": "job_editing_example",
    "status": "success",
    "metadata": {
      "assetIds": [
        "asset_edited_image_1",
        "asset_edited_image_2"
      ]
    }
    // ... other job details
  }
}
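To download the edited images, a hedged sketch is shown below. It assumes the GET /v1/assets/{assetId} endpoint returns asset details including a downloadable url field; check the Assets API reference for the exact response shape before relying on it.

```python
import base64

import requests

api_key = "YOUR_API_KEY"
api_secret = "YOUR_API_SECRET"
auth_token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
headers = {"Authorization": f"Basic {auth_token}"}

for asset_id in asset_ids:  # asset_ids collected during the polling step
    # Fetch asset details; the "url" field below is an assumption about the response shape.
    detail = requests.get(
        f"https://api.cloud.scenario.com/v1/assets/{asset_id}", headers=headers
    )
    detail.raise_for_status()
    download_url = detail.json()["asset"]["url"]  # hypothetical field

    # Download the image bytes and save them locally.
    image = requests.get(download_url)
    image.raise_for_status()
    with open(f"{asset_id}.png", "wb") as f:
        f.write(image.content)
```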