aliyun-vidu-video
$
npx mdskill add cinience/alicloud-skills/aliyun-vidu-videoGenerates videos using DashScope Vidu models via async API
- Solves text-to-video, image-to-video, and keyframe-to-video generation tasks
- Depends on DashScope Vidu models and Alibaba Cloud video-synthesis API
- Selects appropriate model based on input type (text, image, or keyframes)
- Delivers video URLs by polling task status and saving results to output directory
SKILL.md
.github/skills/aliyun-vidu-videoView on GitHub ↗
---
name: aliyun-vidu-video
description: Use when generating videos with DashScope Vidu models. Use when implementing text-to-video, image-to-video (first frame), keyframe-to-video (first+last frame), or reference-to-video generation via the video-synthesis async API.
---
# Vidu Video Generation
## Validation
```bash
mkdir -p output/aliyun-vidu-video
python -m py_compile skills/ai/video/aliyun-vidu-video/scripts/generate_vidu_video.py && echo "py_compile_ok" > output/aliyun-vidu-video/validate.txt
```
Pass criteria: command exits 0 and `output/aliyun-vidu-video/validate.txt` is generated.
## Output And Evidence
- Save task IDs, polling responses, and final video URLs to `output/aliyun-vidu-video/`.
- Keep at least one end-to-end run log for troubleshooting.
## Prerequisites
- Set `DASHSCOPE_API_KEY` in your environment (Beijing region key required).
- Region: China Mainland (Beijing) only. Model, Endpoint URL, and API Key must belong to the same region.
- Enable Vidu models in the Alibaba Cloud Model Studio console before first use.
## Critical model names
### Text-to-video
- `vidu/viduq3-pro_text2video`
- `vidu/viduq3-turbo_text2video`
- `vidu/viduq2_text2video`
### Image-to-video (first frame)
- `vidu/viduq3-pro_img2video`
- `vidu/viduq3-turbo_img2video`
- `vidu/viduq2-pro_img2video`
- `vidu/viduq2-turbo_img2video`
### Keyframe-to-video (first+last frame)
- `vidu/viduq3-pro_start-end2video`
- `vidu/viduq3-turbo_start-end2video`
- `vidu/viduq2-pro_start-end2video`
- `vidu/viduq2-turbo_start-end2video`
### Reference-to-video
- `vidu/viduq2_reference2video`
- `vidu/viduq2-pro_reference2video`
## Capabilities
| Capability | Description | Model suffix | Required input |
|---|---|---|---|
| Text-to-video | Generate video from text prompt only | `_text2video` | `prompt` |
| Image-to-video | Generate video from a single image + optional prompt | `_img2video` | `media[image]` |
| Keyframe-to-video | Interpolate video between first and last frame images | `_start-end2video` | `media[image x2]` + `prompt` |
| Reference-to-video | Embed reference subject(s) into prompted scene | `_reference2video` | `media[image 1-7]` + `prompt` |
## API endpoint (async only)
```
POST https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis
```
Required headers:
- `Authorization: Bearer $DASHSCOPE_API_KEY`
- `Content-Type: application/json`
- `X-DashScope-Async: enable`
## Normalized interface
### Request
- `model` (string, required) -- one of the model names listed above
- `input.prompt` (string) -- up to 5000 characters, describes desired video content
- Required for text-to-video, keyframe, and reference modes
- Optional for image-to-video
- `input.media` (array) -- media objects with `type` and `url` fields (not used for text-to-video)
- `type`: `image` or `video`
- `url`: public URL (HTTP/HTTPS)
- `parameters.resolution` (string, optional) -- `540P`, `720P` (default), or `1080P`
- `parameters.size` (string, optional) -- pixel dimensions `width*height` (e.g., `1280*720`). Values depend on resolution tier. For text-to-video and reference-to-video, explicit size values are supported.
- `parameters.duration` (integer, optional) -- video length in seconds
- Q3 models: [1, 16], default 5
- Q2 models: [1, 10], default 5
- `parameters.audio` (boolean, optional) -- generate audio track (Q3 models only, default false)
- `parameters.watermark` (boolean, optional) -- add "AI generated" watermark (default false)
- `parameters.seed` (integer, optional) -- range [0, 2147483647]
### Size values by resolution tier (text-to-video)
| Resolution | Aspect ratio | Size (width*height) |
|---|---|---|
| 540P | 16:9 | 960*528 |
| 540P | 9:16 | 528*960 |
| 540P | 1:1 | 720*720 |
| 540P | 4:3 | 816*608 |
| 540P | 3:4 | 608*816 |
| 720P | 16:9 | 1280*720 |
| 720P | 9:16 | 720*1280 |
| 720P | 1:1 | 960*960 |
| 720P | 4:3 | 1104*816 |
| 720P | 3:4 | 816*1104 |
| 1080P | 16:9 | 1920*1080 |
| 1080P | 9:16 | 1080*1920 |
| 1080P | 1:1 | 1440*1440 |
| 1080P | 4:3 | 1674*1238 |
| 1080P | 3:4 | 1238*1674 |
### Size values by resolution tier (reference-to-video)
| Resolution | Aspect ratio | Size (width*height) |
|---|---|---|
| 540P | 16:9 | 960*540 |
| 540P | 9:16 | 540*960 |
| 540P | 1:1 | 540*540 |
| 540P | 4:3 | 720*540 |
| 540P | 3:4 | 540*720 |
| 720P | 16:9 | 1280*720 |
| 720P | 9:16 | 720*1280 |
| 720P | 1:1 | 720*720 |
| 720P | 4:3 | 960*720 |
| 720P | 3:4 | 720*960 |
| 1080P | 16:9 | 1920*1080 |
| 1080P | 9:16 | 1080*1920 |
| 1080P | 1:1 | 1080*1080 |
| 1080P | 4:3 | 1440*1080 |
| 1080P | 3:4 | 1080*1440 |
### Media input limits
**Images** (type=image):
- Formats: JPG, PNG, WEBP
- Aspect ratio: 1:4 to 4:1
- Max size: 50MB
**Videos** (type=video, reference-to-video only):
- Formats: mp4, avi, mov
- Resolution: min 128x128 pixels
- Aspect ratio: 1:4 to 4:1
- Duration: 1-5s
- Max size: 50MB
### Response (task creation)
- `output.task_id` (string) -- use for polling, valid 24 hours
- `output.task_status` (string) -- PENDING | RUNNING | SUCCEEDED | FAILED | CANCELED | UNKNOWN
- `request_id` (string)
### Response (task result)
- `output.video_url` (string) -- generated video URL (MP4, H.264), valid 24 hours
- `output.orig_prompt` (string) -- original prompt
- `usage.duration` (integer) -- billable video duration in seconds
- `usage.output_video_duration` (integer) -- actual output duration
- `usage.size` (string) -- output resolution
- `usage.fps` (integer) -- frame rate (24)
- `usage.audio` (boolean) -- whether audio was generated
- `usage.SR` (string) -- resolution tier
## Quick start (Python + HTTP)
```python
import os
import json
import time
import requests
API_KEY = os.getenv("DASHSCOPE_API_KEY")
BASE_URL = "https://dashscope.aliyuncs.com/api/v1"
def create_vidu_task(req: dict) -> str:
"""Create a Vidu video generation task and return task_id."""
payload = {
"model": req["model"],
"input": {},
"parameters": {
"resolution": req.get("resolution", "720P"),
"duration": req.get("duration", 5),
},
}
if req.get("prompt"):
payload["input"]["prompt"] = req["prompt"]
if req.get("media"):
payload["input"]["media"] = req["media"]
if req.get("size"):
payload["parameters"]["size"] = req["size"]
if req.get("audio") is not None:
payload["parameters"]["audio"] = req["audio"]
if req.get("watermark") is not None:
payload["parameters"]["watermark"] = req["watermark"]
if req.get("seed") is not None:
payload["parameters"]["seed"] = req["seed"]
resp = requests.post(
f"{BASE_URL}/services/aigc/video-generation/video-synthesis",
headers={
"Authorization": f"Bearer {API_KEY}",
"Content-Type": "application/json",
"X-DashScope-Async": "enable",
},
json=payload,
)
resp.raise_for_status()
data = resp.json()
return data["output"]["task_id"]
def poll_task(task_id: str, interval: int = 15) -> dict:
"""Poll until task completes. Returns final response."""
while True:
resp = requests.get(
f"{BASE_URL}/tasks/{task_id}",
headers={"Authorization": f"Bearer {API_KEY}"},
)
resp.raise_for_status()
data = resp.json()
status = data["output"]["task_status"]
if status in ("SUCCEEDED", "FAILED", "CANCELED"):
return data
time.sleep(interval)
```
## Mode-specific examples
```python
# Text-to-video
task_id = create_vidu_task({
"model": "vidu/viduq3-turbo_text2video",
"prompt": "A cat running under moonlight",
"resolution": "540P",
"size": "960*528",
"duration": 5,
})
# Image-to-video (first frame)
task_id = create_vidu_task({
"model": "vidu/viduq3-pro_img2video",
"prompt": "Camera slowly pans upward",
"media": [{"type": "image", "url": "https://example.com/image.jpg"}],
"resolution": "720P",
"duration": 5,
})
# Keyframe-to-video (first + last frame)
task_id = create_vidu_task({
"model": "vidu/viduq3-turbo_start-end2video",
"prompt": "A cat jumps from windowsill to sofa",
"media": [
{"type": "image", "url": "https://example.com/first.png"},
{"type": "image", "url": "https://example.com/last.png"},
],
"resolution": "540P",
"duration": 5,
})
# Reference-to-video
task_id = create_vidu_task({
"model": "vidu/viduq2_reference2video",
"prompt": "Man playing guitar in a cafe",
"media": [
{"type": "image", "url": "https://example.com/ref1.jpg"},
{"type": "image", "url": "https://example.com/ref2.jpg"},
],
"resolution": "720P",
"size": "1280*720",
"duration": 5,
})
```
## Error handling
| Error | Likely cause | Action |
|---|---|---|
| 401/403 | Missing or invalid `DASHSCOPE_API_KEY` | Check env var; ensure Beijing region key |
| 400 `InvalidParameter` | Unsupported resolution/size combo, bad duration, missing media | Validate parameters against size tables |
| "does not support synchronous calls" | Missing `X-DashScope-Async: enable` header | Add required header |
| 429 | Rate limit or quota | Retry with backoff |
| Cross-region error | Model and API Key from different regions | Ensure all are Beijing region |
## Output location
- Default output: `output/aliyun-vidu-video/videos/`
- Override base dir with `OUTPUT_DIR`.
## Anti-patterns
- Do not use model names not listed in "Critical model names" above.
- Do not call this API synchronously -- async header is required.
- Do not omit `size` when using reference-to-video -- it is required for that mode.
- Do not pass `audio=true` with Q2 models -- only Q3 models support audio generation.
- Video URLs expire after 24 hours; download and persist immediately.
- For image-to-video, supply exactly 1 image. For keyframe, supply exactly 2 images (first, then last).
- For reference-to-video with `viduq2_reference2video`, only images are accepted (1-7). For `viduq2-pro_reference2video`, images (1-4) plus optional videos (1-2) are accepted.
- Keyframe mode: first and last frame pixel count ratio must be between 0.8 and 1.25.
## Workflow
1) Confirm user intent: text-to-video, image-to-video, keyframe, or reference-to-video.
2) Select the appropriate model name based on capability and quality tier (Q3 pro/turbo or Q2).
3) Prepare input: prompt and/or media array with correct types and valid public URLs.
4) Set resolution, size, and duration parameters.
5) Create async task and poll for results (15s interval recommended).
6) Download and save generated video before URL expiration (24 hours).
## References
- See `references/api_reference.md` for full HTTP API details.
- See `references/sources.md` for source links.
More from cinience/alicloud-skills
- aliyun-adb-mysqlUse when managing Alibaba Cloud AnalyticDB for MySQL (ADB) via OpenAPI/SDK, including the user needs AnalyticDB resource lifecycle and configuration operations, status checks, or troubleshooting ADB API and cluster workflow issues.
- aliyun-adb-mysql-testSmoke test for aliyun-adb-mysql. Validate minimal authentication, API reachability, and one read-only query path.
- aliyun-aicontent-generateUse when managing Alibaba Cloud AIContent (AiContent) via OpenAPI/SDK, including the user needs AI content generation or content workflow operations in Alibaba Cloud, including listing assets, creating/updating generation configurations, checking task status, or troubleshooting failed content jobs.
- aliyun-aicontent-generate-testSmoke test for aliyun-aicontent-generate. Validate minimal authentication, API reachability, and one read-only query path.
- aliyun-aimiaobi-generateUse when managing Alibaba Cloud Quan Miao (AiMiaoBi) via OpenAPI/SDK, including the user asks for Alibaba Cloud MiaoBi content operations, including listing resources, creating/updating configurations, querying runtime status, and diagnosing API or workflow failures.
- aliyun-aimiaobi-generate-testSmoke test for aliyun-aimiaobi-generate. Validate minimal authentication, API reachability, and one read-only query path.
- aliyun-airec-manageUse when managing Alibaba Cloud AIRec (Airec) via OpenAPI/SDK, including the user needs recommendation-engine resource operations in Alibaba Cloud, including list/create/update flows, status inspection, and troubleshooting AIRec configuration or runtime issues.
- aliyun-airec-manage-testSmoke test for aliyun-airec-manage. Validate minimal authentication, API reachability, and one read-only query path.
- aliyun-alb-manageUse when managing and troubleshoot Alibaba Cloud ALB (Application Load Balancer), including the user asks to inspect, create, change, or debug ALB instances, listeners, server groups, rules, certificates, ACLs, security policies, or health checks in Alibaba Cloud.
- aliyun-alb-manage-testSmoke test for Alibaba Cloud ALB skill. Validates SDK auth, script compilation, list instances, and health check flows.