POST /v1/videos/generations
curl --request POST \
  --url https://api.apimart.ai/v1/videos/generations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "wan2.7",
    "prompt": "A coastal road at sunset, slow-motion camera push-in, cinematic feel",
    "resolution": "1080P",
    "duration": 8,
    "size": "16:9"
  }'
{
  "code": 200,
  "data": [
    {
      "status": "submitted",
      "task_id": "task_01J9HA7JPQ9A0Z6JZ3V8M9W6PZ"
    }
  ]
}

Authorization

Authorization
string
required
All API endpoints require Bearer Token authentication.
Get your API Key:
  1. Visit the API Key Management Page to get your API Key
  2. Add it to the request header:
Authorization: Bearer YOUR_API_KEY

Mode Routing

wan2.7 is a unified entry for text-to-video and image-to-video. The backend automatically determines the mode based on the incoming parameters. Both modes are billed identically:
| Condition | Routes To | Mode Description |
| --- | --- | --- |
| Any of image_urls / image_with_roles / video_urls is provided | Image-to-Video | First-frame / first-last frame / video continuation |
| None of the above parameters is provided | Text-to-Video | Generate video purely from text description |
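The routing rule above can be sketched as a small client-side helper. This mirrors the documented behavior for clarity only; the actual mode decision is made by the backend:

```python
def resolve_mode(payload: dict) -> str:
    """Mirror the backend's mode routing: any image or video input
    switches the request to Image-to-Video; otherwise Text-to-Video."""
    image_to_video_keys = ("image_urls", "image_with_roles", "video_urls")
    if any(payload.get(key) for key in image_to_video_keys):
        return "image-to-video"
    return "text-to-video"
```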

Request Parameters

model
string
required
Video generation model name, fixed as wan2.7
prompt
string
Video content description, up to 5000 characters
  • Text-to-Video mode (when no image/video provided): required
  • Image-to-Video mode: optional, but recommended to guide camera movement and actions
Example: "A cat chasing butterflies on the grass, bright sunshine, slow motion"
image_urls
array<string>
Image URL array. Providing it automatically enters Image-to-Video mode
  • 1 image: first-frame to video
  • 2 images: first-last frame to video (1st = first frame, 2nd = last frame)
Use either this or image_with_roles
image_urls conflicts with audio_url; they cannot be provided at the same time
image_with_roles
array<object>
Image array with roles, alternative to image_urls, used to precisely specify the role of each image.
Fields for each object:
  • url (string): image URL (supports http/https)
  • role (string): image role, first_frame / last_frame, default first_frame
Example:
[
  { "url": "https://cdn.example.com/start.jpg", "role": "first_frame" },
  { "url": "https://cdn.example.com/end.jpg", "role": "last_frame" }
]
image_with_roles conflicts with audio_url; they cannot be provided at the same time
video_urls
array<string>
Video URL array. Providing it enters video continuation mode (only the 1st video is used)
video_urls conflicts with audio_url; they cannot be provided at the same time
Video constraints:
  • Format: mp4, mov
  • Duration: 2–10s
  • Resolution: width and height in the range [240, 4096] pixels
  • Aspect ratio: 1:8 – 8:1
  • File size: up to 100MB
negative_prompt
string
Negative prompt describing unwanted content, up to 500 characters
Example: "blurry, distorted, low quality"
resolution
string
default:"1080P"
Video resolution. Options:
  • 720P - Standard
  • 1080P - High definition (default)
duration
integer
default:"5"
Video duration in seconds. Supported range: 2–15. Default: 5
size
string
default:"16:9"
Aspect ratio, only effective in Text-to-Video mode (when no image/video provided). Supported formats:
  • 16:9 - Landscape widescreen (default)
  • 9:16 - Portrait
  • 1:1 - Square
  • 4:3 - Landscape
  • 3:4 - Portrait
This parameter is ignored in Image-to-Video mode; the aspect ratio is determined automatically by the input image
audio_url
string
Custom audio URL
  • Text-to-Video mode: used as background music
  • Image-to-Video mode: used as driving audio, synchronized with on-screen actions
Format: wav / mp3, duration 2-30 seconds, file size ≤ 15MB
audio_url conflicts with video_urls, image_urls, and image_with_roles; they cannot be provided at the same time
prompt_extend
boolean
default:"true"
Whether to enable intelligent prompt rewriting. Significantly improves results for short prompts, but increases processing time. Default: true
watermark
boolean
default:"false"
Whether to add “AI Generated” watermark to the generated video
  • true: add watermark
  • false: no watermark (default)
seed
integer
Seed used to control the randomness of generated content. Value range: integer ≥ 0
  • Identical requests with different seed values (or with seed omitted) generate different results
  • Identical requests with the same seed value generate similar results, though exact consistency is not guaranteed
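The parameter constraints above can be pre-checked on the client before submitting. A minimal validation sketch covering the documented limits (convenience only; the API remains the source of truth, and server-side rules may be stricter):

```python
def validate_request(p: dict) -> list[str]:
    """Pre-flight check of the documented request constraints.
    Returns a list of human-readable error messages (empty = OK)."""
    errors = []
    if p.get("model") != "wan2.7":
        errors.append("model must be 'wan2.7'")
    if len(p.get("prompt", "")) > 5000:
        errors.append("prompt exceeds 5000 characters")
    if len(p.get("negative_prompt", "")) > 500:
        errors.append("negative_prompt exceeds 500 characters")
    if p.get("image_urls") and p.get("image_with_roles"):
        errors.append("use either image_urls or image_with_roles, not both")
    if len(p.get("image_urls") or []) > 2:
        errors.append("image_urls takes at most 2 images (first/last frame)")
    if p.get("audio_url") and p.get("video_urls"):
        errors.append("audio_url cannot be combined with video_urls")
    if not 2 <= p.get("duration", 5) <= 15:
        errors.append("duration must be 2-15 seconds")
    if p.get("resolution", "1080P") not in ("720P", "1080P"):
        errors.append("resolution must be 720P or 1080P")
    if p.get("seed", 0) < 0:
        errors.append("seed must be a non-negative integer")
    return errors
```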

Response

code
integer
Response status code, 200 on success
data
array
Response data array; each element contains the task's status and task_id

Use Cases

Case 1: Text-to-Video (Simplest Request)

{
  "model": "wan2.7",
  "prompt": "A coastal road at sunset, slow-motion camera push-in, cinematic feel"
}

Case 2: Text-to-Video (Full Parameters)

{
  "model": "wan2.7",
  "prompt": "A cat chasing butterflies on the grass, bright sunshine, slow motion",
  "negative_prompt": "blurry, distorted, low quality",
  "resolution": "1080P",
  "duration": 8,
  "size": "16:9",
  "audio_url": "https://cdn.example.com/bgm.mp3",
  "prompt_extend": true,
  "watermark": false,
  "seed": 42
}

Case 3: First-Frame to Video

{
  "model": "wan2.7",
  "prompt": "The character slowly stands up and walks toward the camera",
  "image_urls": ["https://cdn.example.com/person.jpg"],
  "resolution": "1080P",
  "duration": 8
}

Case 4: First-Last Frame to Video

{
  "model": "wan2.7",
  "prompt": "The camera pans slowly from the beach to the mountaintop",
  "image_urls": [
    "https://cdn.example.com/beach.jpg",
    "https://cdn.example.com/mountain.jpg"
  ],
  "resolution": "1080P",
  "duration": 10
}
With 2 images: the 1st is the first frame, the 2nd is the last frame. You can also use image_with_roles for precise specification.
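If you prefer the explicit form, an image_urls list can be translated into the equivalent image_with_roles payload following the positional rule above (a small helper sketch):

```python
def to_image_with_roles(image_urls: list) -> list:
    """Convert a 1- or 2-element image_urls list into the equivalent
    image_with_roles form: 1st image = first_frame, 2nd = last_frame."""
    roles = ("first_frame", "last_frame")
    return [{"url": url, "role": roles[i]} for i, url in enumerate(image_urls)]
```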

Case 5: Video Continuation

{
  "model": "wan2.7",
  "prompt": "Continue walking forward, camera follows",
  "video_urls": ["https://cdn.example.com/clip.mp4"],
  "resolution": "1080P",
  "duration": 8
}

Case 6: Image + Driving Audio

{
  "model": "wan2.7",
  "prompt": "The character moves to the rhythm of the music",
  "image_urls": ["https://cdn.example.com/dancer.jpg"],
  "audio_url": "https://cdn.example.com/beat.mp3",
  "resolution": "1080P",
  "duration": 8
}

Mode Selection Guide

| Requirement | Recommended Approach |
| --- | --- |
| Generate video from text only | Pass only prompt (no image/video) |
| Make an image “come alive” | Pass 1 image to image_urls |
| Control start and end frames | Pass 2 images to image_urls (first + last) |
| Extend an existing video | Pass the video to video_urls |
| Make an image move to music | Pass image + audio_url |
Query Task Results

Video generation is an asynchronous task that returns a task_id upon submission. Use the Get Task Status endpoint to query generation progress and results.
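Since submission only returns a task_id, a client typically polls the status endpoint until the task reaches a terminal state. A generic polling sketch — the fetch_status callable and the terminal status names are assumptions here; consult the Get Task Status documentation for the actual endpoint URL and status values:

```python
import time

def wait_for_task(task_id, fetch_status, interval=5.0, timeout=600.0,
                  done_states=("succeeded", "failed")):
    """Poll fetch_status(task_id) until it reports a terminal status or
    the timeout elapses. fetch_status is any callable that queries the
    Get Task Status endpoint and returns a dict like {"status": ...}."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status(task_id)
        if result.get("status") in done_states:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still pending after {timeout}s")
        time.sleep(interval)
```

Injecting fetch_status keeps the loop independent of any HTTP library and makes it easy to unit-test with a stub.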