doubao-seedance-2.0 Video Generation

curl --request POST \
  --url https://api.apimart.ai/v1/videos/generations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "doubao-seedance-2.0",
    "prompt": "A kitten yawning at the camera",
    "resolution": "720p",
    "size": "16:9",
    "duration": 5,
    "generate_audio": true
  }'

{
  "code": 200,
  "data": [
    {
      "status": "submitted",
      "task_id": "task_01KMCGF6BQGN3X28H3KSR50X5T"
    }
  ]
}

POST /v1/videos/generations

Authentication

Authorization

string

required

All API endpoints require Bearer token authentication.

Get your API key: visit the API Key Management Page.

Add it to the request header:

Authorization: Bearer YOUR_API_KEY
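
As a minimal sketch of the header above, the following Python builds (but does not send) the request using only the standard library. `build_request` is a hypothetical helper name, not part of any SDK; the URL is taken from the curl example at the top of this page.

```python
import json
import urllib.request

API_URL = "https://api.apimart.ai/v1/videos/generations"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request carrying the Bearer token; nothing is sent here."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "YOUR_API_KEY",
    {"model": "doubao-seedance-2.0", "prompt": "A kitten yawning at the camera"},
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a real key.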

Request Parameters

model

string

required

Video generation model name.

Supported models:

doubao-seedance-2.0 - Standard version, supports text-to-video, image-to-video, first/last frame video, reference video, reference audio, and audio-enabled video
doubao-seedance-2.0-fast - Fast version, same features as the standard version with faster generation speed
doubao-seedance-2.0-face - Supports real person uploads, same features as the standard version
doubao-seedance-2.0-fast-face - Supports real person uploads, same features as the fast version
doubao-seedance-2.0-mini - Mini version, same features as the standard version

prompt

string

Video content description. Required for text-to-video; optional for image-to-video or video-reference-to-video. For better results, state the subject, action, camera movement, and style explicitly.

Prompts are limited to 4000 characters; 500 characters or fewer is recommended.
The doubao-seedance-2.0-mini model has no character limit. As a guideline, keep Chinese prompts under 500 characters and English prompts under 1000 words. Overly long prompts dilute the information, so the model may focus on the key points and leave some elements out of the video.

Example: "A kitten yawning at the camera"
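
The length rules above could be checked client-side before submitting. This is only a sketch: `check_prompt` is a hypothetical helper, and it applies the 500-character recommendation to every prompt as a conservative simplification (the separate 1000-word guideline for English prompts is not modeled).

```python
HARD_LIMIT = 4000   # characters, per the note above
RECOMMENDED = 500   # characters, used here as a conservative soft limit
UNLIMITED_MODELS = {"doubao-seedance-2.0-mini"}  # no hard character limit


def check_prompt(model: str, prompt: str) -> list[str]:
    """Return a list of 'error: ...' / 'warning: ...' strings; empty means OK."""
    issues = []
    n = len(prompt)
    if model not in UNLIMITED_MODELS and n > HARD_LIMIT:
        issues.append(f"error: prompt has {n} characters, limit is {HARD_LIMIT}")
    elif n > RECOMMENDED:
        issues.append(
            f"warning: prompt has {n} characters, {RECOMMENDED} or fewer recommended"
        )
    return issues


print(check_prompt("doubao-seedance-2.0", "A kitten yawning at the camera"))
```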

duration

integer

default:"5"

Video duration in seconds.

Supported range: 4 to 15 seconds

Default: 5

size

string

default:"16:9"

Video aspect ratio.

Options:

16:9 - Landscape
9:16 - Portrait
1:1 - Square
4:3 - Traditional ratio
3:4 - Vertical traditional ratio
21:9 - Ultra-wide
adaptive - Adaptive (automatically matches the input image/video)

Default: 16:9

resolution

string

default:"720p"

Video resolution.

Options:

480p - Standard definition
720p - High definition
1080p - Full HD (only supported by doubao-seedance-2.0-face and doubao-seedance-2.0)
4k - Ultra HD (only supported by doubao-seedance-2.0)

Default: 720p
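
The documented ranges for duration, size, and resolution can be captured in a small pre-flight check. The sketch below is an assumption-laden illustration: `check_output_options` is a hypothetical helper, and the per-model resolution table simply restates the notes above.

```python
SIZES = {"16:9", "9:16", "1:1", "4:3", "3:4", "21:9", "adaptive"}

# Models allowed per resolution; None means no model restriction is documented.
RESOLUTION_MODELS = {
    "480p": None,
    "720p": None,
    "1080p": {"doubao-seedance-2.0", "doubao-seedance-2.0-face"},
    "4k": {"doubao-seedance-2.0"},
}


def check_output_options(model, duration=5, size="16:9", resolution="720p"):
    """Return a list of error strings; an empty list means the options look valid."""
    errors = []
    if not isinstance(duration, int) or not 4 <= duration <= 15:
        errors.append("duration must be an integer from 4 to 15 seconds")
    if size not in SIZES:
        errors.append(f"unsupported size: {size}")
    if resolution not in RESOLUTION_MODELS:
        errors.append(f"unsupported resolution: {resolution}")
    else:
        allowed = RESOLUTION_MODELS[resolution]
        if allowed is not None and model not in allowed:
            errors.append(f"{resolution} is not supported by {model}")
    return errors


print(check_output_options("doubao-seedance-2.0-fast", resolution="1080p"))
```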

seed

integer

Random seed for controlling the randomness of generated content

With the same request, different seed values will produce different results
With the same request, the same seed value will produce similar results, but exact consistency is not guaranteed

generate_audio

boolean

default:"false"

Whether to generate audio (audio-enabled video). When set to true, the video includes AI-generated accompanying audio.

Default: false

return_last_frame

boolean

default:"false"

Whether to return the last frame image. When set to true, the task result additionally returns the URL of the video’s last frame image, which can be used for continuous video generation.

Default: false

tools

array<object>

Tool list for enhanced capabilities such as web search.

Example: [{"type": "web_search"}]

Fields of each tool object:

type

string

required

Tool type.

Options:

web_search - Web search, references online information during generation

image_urls

array<string>

Image URL array for image-to-video.

Supports two formats:

Regular image URL: https://example.com/cat.jpg
Asset URL (approved asset): asset://asset_a

Example: ["https://example.com/cat.jpg"] or ["asset://asset_a"]

Asset URL is only supported by doubao-seedance-2.0 and doubao-seedance-2.0-fast models. Other models do not support it.

image_urls and image_with_roles cannot be used simultaneously
Maximum of 9 reference images
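
Because Asset URLs are only accepted by two of the models, a request can be screened before submission. A minimal sketch, assuming Asset URLs are recognizable by the `asset://` prefix shown above; `asset_urls_allowed` is a hypothetical helper name.

```python
ASSET_PREFIX = "asset://"
ASSET_MODELS = {"doubao-seedance-2.0", "doubao-seedance-2.0-fast"}


def asset_urls_allowed(model: str, urls: list[str]) -> bool:
    """Return False when any URL is an Asset URL and the model cannot use it."""
    has_asset = any(u.startswith(ASSET_PREFIX) for u in urls)
    return not has_asset or model in ASSET_MODELS


print(asset_urls_allowed("doubao-seedance-2.0-mini", ["asset://asset_a"]))
```

The same check applies unchanged to `video_urls`, `audio_urls`, and the `url` fields of `image_with_roles`.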

image_with_roles

array

Array of images with roles; use it to specify the first frame and/or last frame

When the url field uses an Asset URL, only doubao-seedance-2.0 and doubao-seedance-2.0-fast models are supported. Other models do not support it.

Fields of each image object:

url

string

required

Image URL.

Supports two formats:

Regular image URL: https://example.com/day.jpg
Asset URL (approved asset): asset://asset_a

Asset URL is only supported by doubao-seedance-2.0 and doubao-seedance-2.0-fast models. Other models do not support it.

role

string

required

Image role.

Options:

first_frame - First frame image, used as the video’s starting frame
last_frame - Last frame image, used as the video’s ending frame
reference_image - Reference portrait image (used with Asset URL)

Example:

[
  {"url": "https://example.com/day.jpg", "role": "first_frame"},
  {"url": "https://example.com/night.jpg", "role": "last_frame"}
]

Asset URL format:

[
  {"url": "asset://asset_a", "role": "reference_image"}
]

image_urls and image_with_roles cannot be used simultaneously
When using first/last frame images, video_urls and audio_urls are not available

video_urls

array<string>

Reference video URL array.

Supports two formats:

Regular video URL: https://example.com/reference.mp4
Asset URL (approved asset): asset://asset_a

Example: ["https://example.com/reference.mp4"] or ["asset://asset_a"]

Asset URL is only supported by doubao-seedance-2.0 and doubao-seedance-2.0-fast models. Other models do not support it.

When using first/last frame images (image_with_roles), reference videos are not available
Maximum of 3 reference videos, 1.8s < total duration < 15.2s
Reference video resolution must be between 480p and 720p
Reference videos must not contain real people

audio_urls

array<string>

Reference audio URL array.

Supports two formats:

Regular audio URL: https://example.com/speech.wav
Asset URL (approved asset): asset://asset_a

Example: ["https://example.com/speech.wav"] or ["asset://asset_a"]

Asset URL is only supported by doubao-seedance-2.0 and doubao-seedance-2.0-fast models. Other models do not support it.

When using first/last frame images (image_with_roles), reference audio is not available
Maximum of 3 reference audio files, total duration must be 15s or less
Reference audio must be used together with reference images or reference videos
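
The combination rules for `image_urls`, `image_with_roles`, `video_urls`, and `audio_urls` can be sketched as one client-side check. `check_media` is a hypothetical helper; it assumes an `image_with_roles` entry counts as a reference image for the audio rule, and it does not inspect the media itself (durations, resolution, and the no-real-people rule are out of scope).

```python
def check_media(payload: dict) -> list[str]:
    """Return error strings for media combinations the rules above rule out."""
    errors = []
    images = payload.get("image_urls") or []
    roles = payload.get("image_with_roles") or []
    videos = payload.get("video_urls") or []
    audios = payload.get("audio_urls") or []

    if images and roles:
        errors.append("image_urls and image_with_roles cannot be used together")
    if len(images) > 9:
        errors.append("at most 9 reference images are allowed")
    if len(videos) > 3:
        errors.append("at most 3 reference videos are allowed")
    if len(audios) > 3:
        errors.append("at most 3 reference audio files are allowed")

    # First/last frame mode excludes reference video and reference audio.
    uses_frames = any(r.get("role") in ("first_frame", "last_frame") for r in roles)
    if uses_frames and (videos or audios):
        errors.append("first/last frame images cannot be combined with video_urls or audio_urls")

    # Reference audio needs at least one reference image or reference video.
    if audios and not (images or roles or videos):
        errors.append("audio_urls requires reference images or reference videos")
    return errors


print(check_media({"audio_urls": ["https://example.com/speech.wav"]}))
```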

Response

code

integer

Response status code, 200 on success

data

array

Response data array

Fields of each array element:

status

string

Task status; the value is submitted when the task is first created

task_id

string

Unique task identifier for querying task status and results
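
Reading the `task_id` out of the response might look like the sketch below. `extract_task_ids` is a hypothetical helper, and treating any `code` other than 200 as a failure is an assumption, since the error response shape is not documented on this page.

```python
import json


def extract_task_ids(body: str) -> list[str]:
    """Parse a submission response body and return the task IDs in `data`."""
    doc = json.loads(body)
    if doc.get("code") != 200:
        # Assumption: a non-200 `code` signals failure.
        raise RuntimeError(f"request failed with code {doc.get('code')}")
    return [item["task_id"] for item in doc.get("data", [])]


sample = (
    '{"code": 200, "data": [{"status": "submitted", '
    '"task_id": "task_01KMCGF6BQGN3X28H3KSR50X5T"}]}'
)
print(extract_task_ids(sample))
```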

Use Cases

Case 1: Text-to-Video

{
  "model": "doubao-seedance-2.0",
  "prompt": "A kitten yawning at the camera",
  "resolution": "720p",
  "size": "16:9",
  "duration": 5,
  "seed": 42,
  "generate_audio": true
}

Case 2: Image-to-Video (First Frame)

{
  "model": "doubao-seedance-2.0",
  "prompt": "The kitten stands up and walks toward the camera",
  "image_urls": ["https://example.com/cat.jpg"],
  "duration": 5
}

Case 3: First/Last Frame Video

{
  "model": "doubao-seedance-2.0",
  "prompt": "Transition from day to night",
  "image_with_roles": [
    {"url": "https://example.com/day.jpg", "role": "first_frame"},
    {"url": "https://example.com/night.jpg", "role": "last_frame"}
  ],
  "duration": 5
}

Case 4: Video-Reference-to-Video

{
  "model": "doubao-seedance-2.0",
  "prompt": "Convert the video style to anime style",
  "video_urls": ["https://example.com/reference.mp4"]
}

Case 5: Reference Video + Reference Audio

{
  "model": "doubao-seedance-2.0",
  "prompt": "A scene of a person speaking",
  "video_urls": ["https://example.com/reference.mp4"],
  "audio_urls": ["https://example.com/speech.wav"],
  "size": "16:9",
  "duration": 11
}

Case 6: Audio-Enabled Video

{
  "model": "doubao-seedance-2.0",
  "prompt": "A man stops a woman and says: \"Remember, you must never point your finger at the moon.\"",
  "generate_audio": true
}

Case 7: Continuous Video Generation (Return Last Frame)

{
  "model": "doubao-seedance-2.0",
  "prompt": "The kitten continues walking toward the camera",
  "image_urls": ["https://example.com/last_frame_from_prev.png"],
  "return_last_frame": true
}
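
Chaining segments means feeding the previous task's last-frame URL into the next request. The sketch below only builds that next payload; `continuation_payload` is a hypothetical helper, and how the last-frame URL is read from the task result is left to the caller because the result field name is not documented on this page.

```python
def continuation_payload(last_frame_url: str, prompt: str,
                         model: str = "doubao-seedance-2.0") -> dict:
    """Build the request body for the next segment of a continuous video."""
    return {
        "model": model,
        "prompt": prompt,
        "image_urls": [last_frame_url],  # previous last frame becomes the input image
        "return_last_frame": True,       # keep the chain going for the segment after this
    }


payload = continuation_payload(
    "https://example.com/last_frame_from_prev.png",
    "The kitten continues walking toward the camera",
)
print(payload)
```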

Case 8: Fast Version Generation

{
  "model": "doubao-seedance-2.0-fast",
  "prompt": "City nightscape timelapse photography",
  "size": "21:9",
  "duration": 8
}

Case 9: Reference Images + Reference Video + Reference Audio (Multi-Modal Video)

Combine reference images, a reference video, and reference audio to generate an immersive first-person advertisement video. Ideal for product promotions, brand ads, and other scenarios that require fusing material from multiple sources.

{
  "model": "doubao-seedance-2.0",
  "prompt": "Use video 1's first-person perspective throughout, and use audio 1 as the background music throughout. First-person POV fruit tea advertisement for seedance brand 'Peace Apple' apple fruit tea limited edition. First frame is image 1: your hand picks a dewy Aksu red apple with a crisp apple collision sound. 2-4s: quick cut, your hand drops apple chunks into a shaker cup, adds ice and tea base, shakes vigorously, ice collision and shaking sounds sync with upbeat drum beats, background voice: 'Fresh-cut, fresh-shaken'. 4-6s: first-person close-up of the finished product, layered fruit tea poured into a clear cup, your hand gently squeezes cream cap spreading on top, sticks a pink label on the cup, camera zooms in on the layered texture of cream cap and fruit tea. 6-8s: first-person handheld cup raise, you lift the fruit tea from image 2 toward the camera (simulating handing it to the viewer), cup label clearly visible, background voice 'Take a sip of freshness', final frame freezes on image 2. Background voice consistently uses a female tone.",
  "image_urls": [
    "https://example.com/tea_pic1.jpg",
    "https://example.com/tea_pic2.jpg"
  ],
  "video_urls": ["https://example.com/tea_video1.mp4"],
  "audio_urls": ["https://example.com/tea_audio1.mp3"],
  "generate_audio": true,
  "size": "16:9",
  "duration": 11
}

Case 10: Image-to-Video with Asset URL

Approved virtual avatar assets can be passed directly as reference images without re-uploading or re-reviewing.

{
  "model": "doubao-seedance-2.0",
  "prompt": "The character walks naturally on a city street under bright sunshine",
  "image_urls": ["asset://asset_a"],
  "duration": 5,
  "resolution": "720p"
}

Case 11: Specify Reference Portrait with Asset URL (image_with_roles)

{
  "model": "doubao-seedance-2.0",
  "prompt": "Using the reference portrait, the character walks elegantly toward the camera",
  "image_with_roles": [
    {
      "url": "asset://asset_a",
      "role": "reference_image"
    }
  ],
  "resolution": "720p",
  "duration": 5
}

Case 12: Fast Version + Asset URL Image-to-Video

{
  "model": "doubao-seedance-2.0-fast",
  "prompt": "The character strolls in a park with a gentle breeze",
  "image_urls": ["asset://asset_a"],
  "duration": 5,
  "resolution": "720p"
}

Case 13: Asset URL Image + Reference Video (Motion Transfer)

Combine an approved portrait asset with a reference video to drive the character to perform specified movements.

{
  "model": "doubao-seedance-2.0",
  "prompt": "The character dances to the rhythm of the reference video with smooth and natural movements",
  "image_urls": ["https://example.com/dance_reference.jpg", "asset://asset_a"],
  "video_urls": ["https://example.com/dance_reference.mp4", "asset://asset_a"],
  "duration": 8,
  "resolution": "720p"
}

Query Task Results

Video generation is an asynchronous task that returns a task_id upon submission. Use the Get Task Status endpoint to query generation progress and results.
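
A polling loop for an asynchronous task could be sketched as follows. The Get Task Status endpoint and its status values are not described on this page, so both are injected by the caller: `fetch_status` and `is_finished` are hypothetical callables, and `wait_for_task` is an illustrative name rather than part of any SDK.

```python
import time
from typing import Callable


def wait_for_task(task_id: str,
                  fetch_status: Callable[[str], dict],
                  is_finished: Callable[[dict], bool],
                  interval: float = 5.0,
                  max_attempts: int = 60) -> dict:
    """Poll fetch_status(task_id) until is_finished(result) is true."""
    for attempt in range(max_attempts):
        result = fetch_status(task_id)
        if is_finished(result):
            return result
        if attempt < max_attempts - 1:
            time.sleep(interval)  # wait before the next poll
    raise TimeoutError(f"task {task_id} not finished after {max_attempts} polls")
```

In practice `fetch_status` would wrap the Get Task Status call, and `is_finished` would test for whatever terminal statuses that endpoint documents.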

Differences from 1.5 Pro Version

| Feature | 1.5 Pro | 2.0 / 2.0 fast |
| --- | --- | --- |
| Resolution | 480p/720p/1080p | 480p/720p/1080p/4k (fast only 480p/720p) |
| Duration range | 4-12s | 5-15s |
| Default duration | 5s | 5s |
| Aspect ratio parameter | `aspect_ratio` | `size` (new `adaptive` option) |
| Audio generation | `audio` parameter | `generate_audio` parameter |
| Reference video | Not supported | Supported via `video_urls` |
| Reference audio | Not supported | Supported via `audio_urls` |
| Image-to-video | `image_urls` / `image_with_roles` | `image_urls` / `image_with_roles` |
| Audio-enabled video | Not supported | Supported via `generate_audio` |
| Continuous video | Not supported | Supported via `return_last_frame` |
| Fast version | Not supported | Supported via `doubao-seedance-2.0-fast` |
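
The two parameter renames in the comparison above suggest a simple payload migration. This is a sketch only: `migrate_payload` is a hypothetical helper, and it assumes the values of `aspect_ratio` and `audio` carry over unchanged to `size` and `generate_audio`, which this page does not confirm.

```python
# Parameter renames from the 1.5 Pro request body to the 2.0 request body.
RENAMED = {"aspect_ratio": "size", "audio": "generate_audio"}


def migrate_payload(old: dict, model: str = "doubao-seedance-2.0") -> dict:
    """Rename 1.5 Pro parameters and switch the model name."""
    new = {RENAMED.get(key, key): value for key, value in old.items()}
    new["model"] = model
    return new


old_payload = {
    "model": "doubao-seedance-1-5-pro",
    "prompt": "A kitten yawning at the camera",
    "aspect_ratio": "16:9",
    "audio": True,
}
print(migrate_payload(old_payload))
```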
