curl --request POST \
  --url https://api.powertokens.ai/v1/videos \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "wan2.7-i2v",
  "prompt": "The portrait gradually starts speaking with subtle head motion.",
  "media": [
    {
      "type": "first_frame",
      "url": "https://example.com/assets/portrait.png"
    },
    {
      "type": "driving_audio",
      "url": "https://example.com/assets/voice.mp3"
    }
  ],
  "seconds": "8",
  "size": "720P"
}
'

{
  "id": "<string>",
  "task_id": "<string>",
  "object": "video",
  "model": "<string>",
  "status": "queued",
  "progress": 123,
  "created_at": 123
}

Submit an asynchronous Ali wan2.7-i2v generation task.
The project supports two input styles:
- image for first-frame generation, or images with exactly 2 items for first-frame plus last-frame generation.
- media, which maps to the upstream input.media protocol.

Implemented official media.type values are first_frame, last_frame, driving_audio, and first_clip.
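The relationship between the two input styles can be sketched in Python. This is a minimal illustration, not the project's implementation. The function name normalize_media is hypothetical. The sketch assumes that image and the items of images are URL strings, that images maps to a first_frame entry followed by a last_frame entry, and that exactly one of the three fields is supplied per request.

```python
from typing import Optional


def normalize_media(image: Optional[str] = None,
                    images: Optional[list] = None,
                    media: Optional[list] = None) -> list:
    """Return a media list for whichever input style was supplied.

    Hypothetical helper: the rule that exactly one field may be set
    is an assumption, not something the documentation states.
    """
    supplied = [name for name, value in
                (("image", image), ("images", images), ("media", media))
                if value is not None]
    if len(supplied) != 1:
        raise ValueError(f"expected exactly one of image/images/media, got {supplied}")

    if media is not None:
        # Already in the media style; pass through a shallow copy.
        return list(media)
    if image is not None:
        # Documented mapping: image -> [{type: first_frame, url: ...}]
        return [{"type": "first_frame", "url": image}]
    if len(images) != 2:
        raise ValueError("images must contain exactly 2 items")
    first, last = images
    # Assumed mapping for images: [first_frame, last_frame], in order.
    return [{"type": "first_frame", "url": first},
            {"type": "last_frame", "url": last}]
```

For example, normalize_media(image="https://example.com/assets/portrait.png") yields a single first_frame entry, while passing a one-item images list raises ValueError.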
Supported media combinations in the current project implementation:
- first_frame
- first_frame + driving_audio
- first_frame + last_frame
- first_frame + last_frame + driving_audio
- first_clip
- first_clip + last_frame

prompt is optional for this channel capability. size supports only 720P and 1080P. seconds supports integer strings from 2 to 15.
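A client may want to check these constraints before submitting. The following Python sketch encodes the combinations and limits listed above. The function name validate_request is hypothetical. Treating size and seconds as optional and rejecting duplicate media.type entries are assumptions of this sketch; the server's actual validation may differ.

```python
# Combinations listed in the documentation, as unordered sets of media.type.
ALLOWED_COMBINATIONS = {
    frozenset({"first_frame"}),
    frozenset({"first_frame", "driving_audio"}),
    frozenset({"first_frame", "last_frame"}),
    frozenset({"first_frame", "last_frame", "driving_audio"}),
    frozenset({"first_clip"}),
    frozenset({"first_clip", "last_frame"}),
}
ALLOWED_SIZES = {"720P", "1080P"}


def validate_request(body: dict) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []

    types = [item.get("type") for item in body.get("media", [])]
    if len(types) != len(set(types)):
        # Assumption: each media.type may appear at most once.
        problems.append(f"duplicate media.type entries: {types}")
    elif frozenset(types) not in ALLOWED_COMBINATIONS:
        problems.append(f"unsupported media combination: {types}")

    # Assumption: size and seconds are only checked when present.
    size = body.get("size")
    if size is not None and size not in ALLOWED_SIZES:
        problems.append(f"size must be 720P or 1080P, got {size!r}")

    seconds = body.get("seconds")
    if seconds is not None:
        ok = (isinstance(seconds, str) and seconds.isascii()
              and seconds.isdigit() and 2 <= int(seconds) <= 15)
        if not ok:
            problems.append(f"seconds must be an integer string from 2 to 15, got {seconds!r}")

    return problems
```

The request body from the curl example passes with no problems, while a body that pairs first_clip with driving_audio, or sends seconds as the number 8 instead of the string "8", is reported.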
Authorization (header): Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
model: Ali wan2.7-i2v model. Allowed value: wan2.7-i2v.
image: Compatibility field for single first-frame input. The project maps it to upstream media: [{type: first_frame, url: ...}].
prompt: Optional text prompt. When omitted, generation relies entirely on the provided media inputs.
images: Compatibility field for first-frame plus last-frame input. Must contain exactly 2 items in order: [first_frame, last_frame].
media: Latest multimodal input style for wan2.7-i2v. Use this field for official upstream combinations such as first frame + driving audio or first clip continuation.
seconds: Video duration in string form. Supported values are integer strings from 2 to 15.
size: Output resolution tier for the current project implementation. Allowed values: 720P, 1080P.

Submission successful. Returns the unified OpenAI-style video task object.
id: Public task ID.
task_id: Public task ID alias.
object: "video"
model: Model name.
status: Unified video task status. One of queued, in_progress, completed, failed, unknown.
progress: Task progress percentage.
created_at: Unix timestamp in seconds.
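The task object described above could be read on the client side roughly as follows. This is a sketch under stated assumptions. The names VideoTask and parse_video_task are hypothetical. Treating completed and failed as the only terminal statuses is an assumption; the documentation lists the status values but does not say which are final.

```python
import json
from dataclasses import dataclass, fields

# Status values listed in the documentation.
KNOWN_STATUSES = {"queued", "in_progress", "completed", "failed", "unknown"}


@dataclass
class VideoTask:
    id: str
    task_id: str
    object: str
    model: str
    status: str
    progress: int
    created_at: int

    @property
    def is_finished(self) -> bool:
        # Assumption: completed and failed are the terminal statuses.
        return self.status in {"completed", "failed"}


def parse_video_task(raw: str) -> VideoTask:
    """Parse a JSON response body into a VideoTask, checking documented values."""
    data = json.loads(raw)
    # Pick only the documented fields; a missing one raises KeyError.
    task = VideoTask(**{f.name: data[f.name] for f in fields(VideoTask)})
    if task.object != "video":
        raise ValueError(f"unexpected object type: {task.object!r}")
    if task.status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected status: {task.status!r}")
    return task
```

A freshly submitted task would typically parse with status "queued" and is_finished equal to False; a caller would then need some way to re-check the task later, which this section does not describe.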