Academia

This page is auto-generated from model configurations. Last updated: 2026-03-13.

This reference lists all available Academia video generation models and their parameters. Use these parameter names when calling the Generation API.


MM Audio

Model ID: model_mm-audio

Capabilities: video2video

LLM Markdown: https://app.scenario.com/api/models/model_mm-audio/markdown

| Parameter | Type | Required | Default | Min | Max | Allowed Values | Description |
|---|---|---|---|---|---|---|---|
| prompt | string | Yes | - | - | - | - | Text prompt for generated audio |
| video | file | Yes | - | - | - | - | Video file for video-to-audio generation |
| negativePrompt | string | No | - | - | - | - | Negative prompt to avoid certain sounds |
| duration | number | No | 30 | 1 | 30 | - | Output duration in seconds. If this value exceeds the video's length, the video's full duration is used instead. |
| numSteps | number | No | 25 | 4 | 50 | - | Number of steps used to generate the audio |
| cfgStrength | number | No | 4.5 | 1 | 20 | - | Higher values keep the output closer to the prompt |
| seed | number | No | - | -1 | - | - | Random seed. Use -1 or leave blank to randomize the seed |
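
As a minimal sketch of how the documented ranges for MM Audio (duration 1 to 30, numSteps 4 to 50, cfgStrength 1 to 20, seed of -1 or higher) could be checked before a request is sent, the Python helper below builds a parameter dictionary. The function name `build_mm_audio_params` and the treatment of `video` as an opaque string reference are assumptions for illustration; this page does not describe how files are passed or what the request envelope looks like.

```python
# Documented (min, max) ranges for MM Audio numeric parameters.
# None means the table lists no bound on that side.
MM_AUDIO_RANGES = {
    "duration": (1, 30),
    "numSteps": (4, 50),
    "cfgStrength": (1, 20),
    "seed": (-1, None),
}


def build_mm_audio_params(prompt, video, **optional):
    """Return a parameter dict for model_mm-audio, rejecting out-of-range values.

    `video` is treated as an opaque reference (assumption: how files are
    passed to the API is not described on this page).
    """
    if not prompt:
        raise ValueError("prompt is required")
    if not video:
        raise ValueError("video is required")

    params = {"prompt": prompt, "video": video}
    for name, value in optional.items():
        if name == "negativePrompt":
            params[name] = value  # free-form string, no range to check
            continue
        if name not in MM_AUDIO_RANGES:
            raise ValueError(f"unknown parameter: {name}")
        low, high = MM_AUDIO_RANGES[name]
        if (low is not None and value < low) or (high is not None and value > high):
            raise ValueError(f"{name}={value} is outside the documented range")
        params[name] = value
    return params
```

Calling `build_mm_audio_params("rain on a tin roof", "video-ref", duration=10, seed=-1)` returns a dictionary keyed by the parameter names listed above, while `numSteps=100` raises a `ValueError` locally instead of producing a request the API would presumably reject.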

MM Audio 2

MMAudio generates synchronized audio given video and text prompts. It can be combined with video models to get videos with audio.

Model ID: model_mm-audio-2

Capabilities: video2video

LLM Markdown: https://app.scenario.com/api/models/model_mm-audio-2/markdown

| Parameter | Type | Required | Default | Min | Max | Allowed Values | Description |
|---|---|---|---|---|---|---|---|
| prompt | string | Yes | - | - | - | - | Text prompt for generated audio |
| video | file | Yes | - | - | - | - | Video to generate the audio for |
| negativePrompt | string | No | - | - | - | - | Negative prompt to avoid certain sounds |
| duration | number | No | 8 | 1 | 30 | - | Output duration in seconds |
| numSteps | number | No | 25 | 4 | 50 | - | Number of steps used to generate the audio |
| cfgStrength | number | No | 4.5 | 1 | 20 | - | Higher values keep the output closer to the prompt |
| maskAwayClip | boolean | No | false | - | - | - | Mask away certain sounds in the audio |
| seed | number | No | - | 0 | 65535 | - | Random seed for reproducible generation |
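
MM Audio 2 documents `seed` as a value from 0 to 65535 and does not mention a -1 sentinel, so a caller who wants a random but repeatable result may prefer to draw the seed locally and record it. The following is a minimal sketch under that reading; the helper name `mm_audio_2_params`, its keyword arguments, and the opaque `video` reference are assumptions, not part of the API.

```python
import random

SEED_MIN, SEED_MAX = 0, 65535  # documented seed range for model_mm-audio-2


def mm_audio_2_params(prompt, video, seed=None, mask_away_clip=False, rng=random):
    """Build parameters for model_mm-audio-2 with an explicit, recorded seed.

    If no seed is given, one is drawn locally so the same request can be
    replayed later. `video` is an opaque reference (assumption).
    """
    if seed is None:
        seed = rng.randint(SEED_MIN, SEED_MAX)  # inclusive on both ends
    if not SEED_MIN <= seed <= SEED_MAX:
        raise ValueError(f"seed must be between {SEED_MIN} and {SEED_MAX}")
    return {
        "prompt": prompt,
        "video": video,
        "maskAwayClip": mask_away_clip,
        "seed": seed,
    }
```

Storing the returned `seed` alongside the output makes it possible to resubmit the same parameters later; passing `rng=random.Random(0)` makes the locally drawn seed itself deterministic, which can be convenient in tests.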

Video Foreground Extractor - BiRefNet v2

BiRefNet is a model that cleanly separates the main subject from the background in a video, producing clear and detailed cutouts.

Model ID: model_birefnet-v2-video

Capabilities: video2video

LLM Markdown: https://app.scenario.com/api/models/model_birefnet-v2-video/markdown

| Parameter | Type | Required | Default | Min | Max | Allowed Values | Description |
|---|---|---|---|---|---|---|---|
| video | file | Yes | - | - | - | - | Video to remove background from |
| model | string | No | General Use (Light) | - | - | General Use (Light), General Use (Light 2K), General Use (Heavy), Matting, Portrait | Model to use for background removal. 'General Use (Light)' is the original model used in the BiRefNet repository. 'General Use (Light 2K)' is the same original model but trained with 2K images. 'General Use (Heavy)' is slower but more accurate. 'Matting' is trained specifically for matting images. 'Portrait' is trained specifically for portrait images. 'General Use (Light)' is recommended for most use cases. |
| operatingResolution | string | No | 1024x1024 | - | - | 1024x1024, 2048x2048 | The resolution to operate on. Higher resolutions give more accurate output for high-resolution inputs. |
| outputMask | boolean | No | false | - | - | - | Whether to output the mask used to remove the background |
| refineForeground | boolean | No | true | - | - | - | Refine the foreground for better results |
| videoQuality | string | No | high | - | - | low, medium, high, maximum | Video output quality |
| videoWriteMode | string | No | balanced | - | - | fast, balanced, small | Video write mode for encoding |
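
Most BiRefNet v2 options are fixed-choice strings, so a typo such as `"General Use (light)"` is easy to make. As a minimal sketch, the Python helper below checks each option against the allowed values listed for this model before building the parameter dictionary. The helper name `birefnet_params` and the opaque `video` reference are assumptions for illustration only.

```python
# Allowed values copied from the parameter list for model_birefnet-v2-video.
BIREFNET_CHOICES = {
    "model": [
        "General Use (Light)",
        "General Use (Light 2K)",
        "General Use (Heavy)",
        "Matting",
        "Portrait",
    ],
    "operatingResolution": ["1024x1024", "2048x2048"],
    "videoQuality": ["low", "medium", "high", "maximum"],
    "videoWriteMode": ["fast", "balanced", "small"],
}
BIREFNET_FLAGS = {"outputMask", "refineForeground"}  # boolean parameters


def birefnet_params(video, **options):
    """Build parameters for model_birefnet-v2-video, validating each option.

    `video` is an opaque reference (assumption: file passing is not
    described on this page). Omitted options are left to the API defaults.
    """
    if not video:
        raise ValueError("video is required")
    params = {"video": video}
    for name, value in options.items():
        if name in BIREFNET_CHOICES:
            if value not in BIREFNET_CHOICES[name]:
                allowed = ", ".join(BIREFNET_CHOICES[name])
                raise ValueError(f"{name} must be one of: {allowed}")
        elif name in BIREFNET_FLAGS:
            if not isinstance(value, bool):
                raise ValueError(f"{name} must be a boolean")
        else:
            raise ValueError(f"unknown parameter: {name}")
        params[name] = value
    return params
```

For example, `birefnet_params("video-ref", model="Portrait", outputMask=True)` passes validation, while `videoQuality="ultra"` raises a `ValueError` that lists the accepted values.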