POST /kling/v1/videos/image2video
Kling Image-to-Video Task (kling-v3)
curl --request POST \
  --url https://api.powertokens.ai/kling/v1/videos/image2video \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model_name": "kling-v3",
  "image": "https://example.com/frame.png",
  "prompt": "Slow camera push-in",
  "duration": "5"
}
'
{
  "code": 123,
  "message": "<string>",
  "data": {
    "task_id": "<string>",
    "task_status": "submitted",
    "task_info": {
      "external_task_id": "<string>"
    },
    "created_at": 123,
    "updated_at": 123
  },
  "request_id": "<string>"
}
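
The curl call above can also be expressed in Python. The following is a minimal sketch using only the standard library; the helper name build_image2video_request is hypothetical, and the function only builds the request object without sending it, so the token placeholder can stay as-is.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.powertokens.ai/kling/v1/videos/image2video"


def build_image2video_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


payload = {
    "model_name": "kling-v3",
    "image": "https://example.com/frame.png",
    "prompt": "Slow camera push-in",
    "duration": "5",
}
request = build_image2video_request("<token>", payload)
```

To submit the task, pass the request to urllib.request.urlopen and decode the JSON body of the reply.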

Authorizations

Authorization
string
header
required

Pass Authorization: Bearer <token> in the request header.

Body

application/json
model_name
enum<string>
required

Model name

Available options:
kling-v3
image
string

Reference image

  • Supports image Base64 encoding or image URL (must be accessible)

Base64 encoding notes: Image data passed as Base64 must be the bare encoded string. Do not add a prefix such as data:image/png;base64, in front of it.

Correct example:

iVBORw0KGgoAAAANSUhEUgAAAAUA

Incorrect example:

data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA...

  • Supported image formats: .jpg / .jpeg / .png
  • Image file size must not exceed 10MB; image dimensions must be at least 300px; aspect ratio must be between 1:2.5 and 2.5:1
  • At least one of image or image_tail must be provided, both cannot be empty
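
The Base64 rule above is easy to get wrong when the image comes from a browser or another tool that emits data URIs. The sketch below shows one way to produce a bare Base64 string; the helper names to_bare_base64 and encode_image_bytes are hypothetical and not part of the API.

```python
import base64


def encode_image_bytes(raw: bytes) -> str:
    """Encode raw image bytes as a bare Base64 string (no data-URI prefix)."""
    return base64.b64encode(raw).decode("ascii")


def to_bare_base64(value: str) -> str:
    """Strip a leading data-URI prefix, if present, and verify the remainder."""
    if value.startswith("data:") and "," in value:
        # Everything up to and including the first comma is the prefix.
        value = value.split(",", 1)[1]
    # validate=True raises binascii.Error on characters outside the Base64 alphabet.
    base64.b64decode(value, validate=True)
    return value


png_signature = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of any PNG file
bare = encode_image_bytes(png_signature)
cleaned = to_bare_base64("data:image/png;base64," + bare)
```

Here bare and cleaned hold the same string, which is the form the image field expects.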
image_tail
string

Reference image - tail frame control

  • Supports image Base64 encoding or image URL (must be accessible)

Base64 encoding notes: Image data passed as Base64 must be the bare encoded string. Do not add a prefix such as data:image/png;base64, in front of it.

Correct example:

iVBORw0KGgoAAAANSUhEUgAAAAUA

Incorrect example:

data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA...

  • Supported image formats: .jpg / .jpeg / .png
  • Image file size must not exceed 10MB; image dimensions must be at least 300px; aspect ratio must be between 1:2.5 and 2.5:1
  • At least one of image or image_tail must be provided, both cannot be empty
  • image_tail, dynamic_masks/static_mask, and camera_control are mutually exclusive; cannot be used together
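
The two constraints above (at least one of image or image_tail, and mutual exclusivity between image_tail, the mask fields, and camera_control) can be checked on the client before submitting. This is a minimal sketch; check_image_fields is a hypothetical helper, and it treats empty or missing values as "not provided", which is an assumption.

```python
def check_image_fields(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    # At least one of image / image_tail must be present and non-empty.
    if not payload.get("image") and not payload.get("image_tail"):
        problems.append("provide at least one of image or image_tail")

    # image_tail, dynamic_masks/static_mask, and camera_control exclude each other.
    groups = {
        "image_tail": bool(payload.get("image_tail")),
        "dynamic_masks/static_mask": bool(
            payload.get("dynamic_masks") or payload.get("static_mask")
        ),
        "camera_control": bool(payload.get("camera_control")),
    }
    used = [name for name, present in groups.items() if present]
    if len(used) > 1:
        problems.append("mutually exclusive fields used together: " + ", ".join(used))

    return problems


ok = check_image_fields({"image": "https://example.com/frame.png"})
conflict = check_image_fields(
    {"image": "https://example.com/frame.png",
     "image_tail": "https://example.com/last.png",
     "camera_control": {"type": "simple"}}  # placeholder value, structure not specified here
)
```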
multi_shot
boolean
default:false

Whether to generate a multi-shot video. When this parameter is true, the prompt parameter is ignored. When this parameter is false, the shot_type and multi_prompt parameters are ignored.

shot_type
enum<string>

Shot type. Required when multi_shot is true.

Available options:
customize,
intelligence
prompt
string

Positive text prompt

Omni models can achieve various capabilities through prompts with subjects, images, videos, etc.:
  • Use <<<>>> format to specify a subject, image, or video, e.g.: <<<element_1>>>, <<<image_1>>>, <<<video_1>>>
  • Use <<<voice_1>>> to specify a voice, index matching the order in voice_list
  • Up to 2 voices per video generation task; when specifying voices, sound must be 'on'
  • Keep syntax simple, e.g.: A man<<<voice_1>>>says: "Hello"
  • When voice_list is not empty and prompt references a voice ID, billing is based on 'with specified voice'
Maximum string length: 2500
multi_prompt
object[]

Multi-shot information, including prompt, duration, etc. Define each shot's index, prompt, and duration via the index, prompt, and duration parameters.

  • Max 6 shots, min 1 shot.
  • Max length per shot content: 512 characters.
  • Each shot duration must not exceed total task duration and must be >= 1.
  • Sum of all shot durations must equal total task duration.

Required when multi_shot is true and shot_type is customize. Format:

"multi_prompt": [{ "index": int, "prompt": "string", "duration": "5" }, { "index": int, "prompt": "string", "duration": "5" }]
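
The shot rules listed for multi_prompt lend themselves to a client-side check. The sketch below is one possible version; check_multi_prompt is a hypothetical helper, and numbering index from 1 in the example is an assumption, since the starting index is not stated here.

```python
def check_multi_prompt(shots: list, total_duration: str) -> list:
    """Validate shots against the multi_prompt rules; return a list of problems."""
    problems = []
    total = int(total_duration)  # duration values are strings such as "5"

    if not 1 <= len(shots) <= 6:
        problems.append("multi_prompt must contain between 1 and 6 shots")

    durations = []
    for shot in shots:
        seconds = int(shot["duration"])
        durations.append(seconds)
        if len(shot["prompt"]) > 512:
            problems.append(f"shot {shot['index']}: prompt exceeds 512 characters")
        if not 1 <= seconds <= total:
            problems.append(f"shot {shot['index']}: duration must be between 1 and {total}")

    if sum(durations) != total:
        problems.append("shot durations must sum to the total task duration")

    return problems


shots = [
    {"index": 1, "prompt": "Wide shot of a harbor at dawn", "duration": "2"},
    {"index": 2, "prompt": "Close-up of a fishing boat", "duration": "3"},
]
payload = {
    "model_name": "kling-v3",
    "image": "https://example.com/frame.png",
    "multi_shot": True,
    "shot_type": "customize",
    "multi_prompt": shots,
    "duration": "5",
}
problems = check_multi_prompt(payload["multi_prompt"], payload["duration"])
```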

negative_prompt
string

Negative text prompt

  • Max 2500 characters
  • Negative guidance can also be expressed through negative sentences in the positive prompt
Maximum string length: 2500
element_list
object[]

Element reference list, configured based on element IDs in the element library.

Use key:value format:

"element_list":[{ "element_id": long },{ "element_id": long }]
voice_list
object[]

Voice list referenced during video generation.

  • Up to 2 voices per video generation task
  • When voice_list is not empty and prompt references a voice ID, billing is based on 'with specified voice'
  • element_list and voice_list are mutually exclusive, cannot coexist

Use key:value format:

"voice_list":[{ "voice_id": "string" },{ "voice_id": "string" }]
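
The voice rules from the prompt and voice_list fields can be combined into one client-side check. This is a sketch under stated assumptions: check_voices is a hypothetical helper, the voice_id values are placeholders, and <<<voice_1>>> is assumed to refer to the first entry of voice_list.

```python
import re

# Matches voice references such as <<<voice_1>>> inside a prompt.
VOICE_REF = re.compile(r"<<<voice_(\d+)>>>")


def check_voices(payload: dict) -> list:
    """Return a list of problems with voice_list usage; empty means OK."""
    problems = []
    voices = payload.get("voice_list") or []

    if len(voices) > 2:
        problems.append("at most 2 voices per task")
    if voices and payload.get("element_list"):
        problems.append("element_list and voice_list cannot be used together")

    refs = [int(n) for n in VOICE_REF.findall(payload.get("prompt", ""))]
    if refs and payload.get("sound", "off") != "on":
        problems.append("sound must be 'on' when the prompt references a voice")
    for n in refs:
        if not 1 <= n <= len(voices):
            problems.append(f"<<<voice_{n}>>> has no matching entry in voice_list")

    return problems


payload = {
    "model_name": "kling-v3",
    "image": "https://example.com/frame.png",
    "prompt": 'A man<<<voice_1>>>says: "Hello"',
    "voice_list": [{"voice_id": "placeholder-voice-id"}],
    "sound": "on",
}
problems = check_voices(payload)
```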
sound
enum<string>
default:off

Whether to also generate audio when creating the video

Available options:
on,
off
cfg_scale
number
default:0.5

Degree of creative freedom in video generation. Higher values give the model less creative freedom.

Required range: 0 <= x <= 1
mode
enum<string>
default:std

Video generation mode

  • std: Standard mode, cost-effective, output video resolution 720P.
  • pro: Expert mode (high quality), better video quality, output video resolution 1080P.
  • 4k: 4K mode, high quality (same as pro), better video quality, output video resolution 4K.
Available options:
std,
pro,
4k
static_mask
string

Static brush mask area (mask image painted by the user via motion brush).

The "motion brush" feature includes both the dynamic brush (dynamic_masks) and the static brush (static_mask).

  • Supports image Base64 encoding or image URL (must be accessible, same format requirements as the image field)
  • Supported image formats: .jpg / .jpeg / .png
  • Image aspect ratio must match the input image (i.e., the image field), otherwise the task will fail
  • The resolution of static_mask and dynamic_masks.mask must be identical, otherwise the task will fail
dynamic_masks
object[]

Dynamic brush configuration list

  • Can configure multiple groups (up to 6), each containing a "mask area" and "motion trajectories" sequence
camera_control
object

Camera motion control settings (if not specified, the model will intelligently match based on input text/image)

duration
enum<string>
default:5

Video duration in seconds.

Available options:
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15
watermark_info
object

Whether to also generate a watermarked result.

callback_url
string

Callback notification URL for this task. If configured, the server will proactively notify when the task status changes.

external_task_id
string

Custom task ID

Response

Task accepted.

code
integer
required

Kling response code. 0 indicates the task has been accepted.

message
string
required

Kling response message.

data
object
required
request_id
string

Upstream request identifier (if provided).
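
A caller typically checks code and then keeps task_id for later use. The sketch below parses a reply shaped like the sample response; parse_task_response is a hypothetical helper, the sample values are invented for illustration, and treating every non-zero code as a failure is an assumption based on "0 indicates the task has been accepted".

```python
import json


def parse_task_response(raw: str) -> dict:
    """Parse the JSON reply and return the fields a caller usually needs."""
    body = json.loads(raw)
    if body["code"] != 0:
        raise RuntimeError(
            f"task not accepted: code={body['code']} message={body['message']}"
        )
    data = body["data"]
    return {
        "task_id": data["task_id"],
        "task_status": data["task_status"],
        # request_id is optional, so fall back to None when it is absent.
        "request_id": body.get("request_id"),
    }


sample = json.dumps({
    "code": 0,
    "message": "ok",
    "data": {
        "task_id": "task-123",
        "task_status": "submitted",
        "task_info": {"external_task_id": "my-id-1"},
        "created_at": 1700000000000,
        "updated_at": 1700000000000,
    },
})
result = parse_task_response(sample)
```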