🕹️ ControlNet Modalities

ControlNet is an advanced feature that provides fine-grained control over the image generation process. It allows you to guide the AI with specific structural or compositional information extracted from an input image. This is particularly useful when you need to maintain the pose of a character, the depth of a scene, or the edges of an object while generating new content.

Common ControlNet Modalities:

Canny: Guides generation based on the edges detected in the input image.
Pose: Controls the pose of human figures based on a skeleton extracted from the input image.
Depth: Guides the spatial arrangement of objects using depth information from the input image.
Normal Map: Provides information about the surface orientation, useful for maintaining lighting and texture details.

How to Use ControlNet:

To use ControlNet, you provide a control image (the image from which the structural information is extracted, referenced by imageId in the examples below) and specify the modality (e.g., canny, pose) together with a control strength, as in pose:0.5. The model then combines this information with your prompt to generate an image that follows both the textual description and the structural guidance.
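As a minimal sketch of how the modality value could be assembled, the helper below joins a modality name and a strength into the name:strength form used in the examples (pose:0.5). The function name build_modality is hypothetical, and the 0.0 to 1.0 strength range is an assumption rather than a documented limit.

```python
def build_modality(name: str, strength: float) -> str:
    """Return a modality string such as "pose:0.5".

    Assumption: strength is a fraction between 0.0 and 1.0. The source only
    shows 0.5, so adjust the bounds if the API accepts a different range.
    """
    if not name or ":" in name:
        raise ValueError("modality name must be non-empty and contain no ':'")
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength is assumed to lie between 0.0 and 1.0")
    # The "g" format drops trailing zeros, so 0.50 is written as 0.5.
    return f"{name}:{strength:g}"


if __name__ == "__main__":
    print(build_modality("pose", 0.5))
```

The returned string can be placed directly in the "modality" field of the request payload.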

Example: Maintaining a Character Pose

Suppose you have an image of a person in a specific pose, and you want to generate a new image of a different character in the same pose. You would use the original image as the control image with the pose modality.

cURL

# "imageId" is the ID of your input pose image.
# "modality" includes the strength of the control (here 0.5).
curl -X POST \
  -u "YOUR_API_KEY:YOUR_API_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a superhero in a dynamic pose, comic book style",
    "imageId": "yourPoseImageID",
    "modality": "pose:0.5",
    "numSamples": 1,
    "guidance": 3.5,
    "numInferenceSteps": 28,
    "modelId": "flux.1-dev"
}' \
  https://api.scenario.com/v1/generate/controlnet

Python

import requests

api_key = "YOUR_API_KEY"
api_secret = "YOUR_API_SECRET"

url = "https://api.scenario.com/v1/generate/image"
headers = {"Content-Type": "application/json"}

payload = {
    "prompt": "a superhero in a dynamic pose, comic book style",
    "imageId": "yourPoseImageID", # ID of your input pose image
    "modality": "pose:0.5", # includes the strength of the control
    "numSamples": 1,
    "guidance": 3.5,
    "numInferenceSteps": 28,
    "modelId": "flux.1-dev"
}

response = requests.post(url, headers=headers, json=payload, auth=(api_key, api_secret))

# Handle the job's progress here before reading the results.

if response.status_code == 200:
    data = response.json()
    print("Image generation with ControlNet successful!")
    print("Generated Image ID:", data["metadata"]["assetIds"][0])
else:
    print(f"Error: {response.status_code} - {response.text}")
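The job-progress step mentioned in the Python example could be handled with a polling loop along these lines. This is only a sketch: wait_for_job, the fetch_status callable, and the status values "success", "failure", and "canceled" are hypothetical placeholders, since the source does not show the job endpoint or its response format. In practice fetch_status would wrap whatever request returns the current job state.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status() repeatedly until the job finishes or time runs out.

    The status names checked below are assumptions, not documented values.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status()
        status = job.get("status")
        if status == "success":
            return job
        if status in ("failure", "canceled"):
            raise RuntimeError(f"Job ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still {status!r} after {timeout_s} seconds")
        # Wait before asking again to avoid hammering the API.
        sleep(interval_s)
```

Passing the status lookup in as a callable keeps the loop independent of any particular endpoint and makes it easy to exercise with canned responses.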