aliyun-qwen-asr
$ npx mdskill add cinience/alicloud-skills/aliyun-qwen-asr
Transcribes non-realtime audio using Alibaba Cloud Qwen ASR models
- Converts recorded audio files into text with timestamps
- Uses Alibaba Cloud Model Studio Qwen ASR APIs
- Selects appropriate model based on audio length and task type
- Stores transcripts and API responses in specified output directory
---
name: aliyun-qwen-asr
description: Use when transcribing non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`). Use when converting recorded audio files to text, generating transcripts with timestamps, or documenting DashScope/OpenAI-compatible ASR request and response fields.
version: 1.0.0
---
Category: provider
# Model Studio Qwen ASR (Non-Realtime)
## Validation
```bash
mkdir -p output/aliyun-qwen-asr
python -m py_compile skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py && echo "py_compile_ok" > output/aliyun-qwen-asr/validate.txt
```
Pass criteria: command exits 0 and `output/aliyun-qwen-asr/validate.txt` is generated.
## Output And Evidence
- Store transcripts and API responses under `output/aliyun-qwen-asr/`.
- Keep one command log or sample response per run.
Use Qwen ASR for recorded audio transcription (non-realtime), including short audio sync calls and long audio async jobs.
## Critical model names
Use one of these exact model strings:
- `qwen3-asr-flash`
- `qwen3-asr-flash-2026-02-10`
- `qwen-audio-asr`
- `qwen3-asr-flash-filetrans`
- `qwen3-asr-flash-filetrans-2025-11-17`
Selection guidance:
- Use `qwen3-asr-flash`, `qwen3-asr-flash-2026-02-10`, or `qwen-audio-asr` for short/normal recordings (sync).
- Use `qwen3-asr-flash-filetrans` or `qwen3-asr-flash-filetrans-2025-11-17` for long-file transcription (async task workflow).
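The selection guidance above can be sketched as a small lookup. This is a minimal illustration, not part of the bundled script: the model strings come from the list above, while `pick_model` and `is_async_model` are hypothetical helper names.

```python
# Model strings are copied from the list above; the helpers are hypothetical.
SYNC_MODELS = {"qwen3-asr-flash", "qwen3-asr-flash-2026-02-10", "qwen-audio-asr"}
ASYNC_MODELS = {"qwen3-asr-flash-filetrans", "qwen3-asr-flash-filetrans-2025-11-17"}


def pick_model(long_file: bool) -> str:
    """Return the undated default model for the given workload."""
    return "qwen3-asr-flash-filetrans" if long_file else "qwen3-asr-flash"


def is_async_model(model: str) -> bool:
    """Validate the model string and report whether it needs the async task workflow."""
    if model in ASYNC_MODELS:
        return True
    if model in SYNC_MODELS:
        return False
    raise ValueError(f"unknown Qwen ASR model: {model!r}")
```

Rejecting unknown strings early avoids sending a misspelled model name to the API.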
## Prerequisites
- No SDK dependencies are required (the script uses the Python stdlib only); a virtual environment is optional:
```bash
python3 -m venv .venv
. .venv/bin/activate
```
- Set `DASHSCOPE_API_KEY` in environment, or add `dashscope_api_key` to `~/.alibabacloud/credentials`.
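One way the credential lookup above might work, as a sketch only. It assumes `~/.alibabacloud/credentials` is an INI file with section headers and that `dashscope_api_key` may sit in any section; the bundled script may resolve it differently. `resolve_api_key` is a hypothetical name.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(credentials_path: Optional[Path] = None) -> str:
    """Return the API key from the environment, falling back to the credentials file."""
    key = os.environ.get("DASHSCOPE_API_KEY", "").strip()
    if key:
        return key

    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    # interpolation=None keeps '%' characters in keys from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is skipped without raising
    # Assumption: the key may live in any INI section; take the first non-empty match.
    for section in parser.sections():
        value = parser.get(section, "dashscope_api_key", fallback="").strip()
        if value:
            return value
    raise RuntimeError("DASHSCOPE_API_KEY is not set and no dashscope_api_key was found")
```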
## Normalized interface (asr.transcribe)
### Request
- `audio` (string, required): public URL or local file path.
- `model` (string, optional): default `qwen3-asr-flash`.
- `language_hints` (array<string>, optional): e.g. `zh`, `en`.
- `sample_rate` (number, optional)
- `vocabulary_id` (string, optional)
- `disfluency_removal_enabled` (bool, optional)
- `timestamp_granularities` (array<string>, optional): e.g. `sentence`.
- `async` (bool, optional): default false for sync models, true for `qwen3-asr-flash-filetrans`.
### Response
- `text` (string): normalized transcript text.
- `task_id` (string, optional): present for async submission.
- `status` (string): `SUCCEEDED` when the transcript is available; for async submissions, the task status returned at submission time.
- `raw` (object): original API response.
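A minimal sketch of mapping raw API responses onto the normalized response fields above. The field paths are assumptions: `choices[0].message.content` follows the usual OpenAI-compatible chat completion shape, and `output.task_id` / `output.task_status` follow the usual DashScope async task shape. Check them against `references/api_reference.md` before relying on them. Both function names are hypothetical.

```python
from typing import Any, Dict


def normalize_sync(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map an OpenAI-compatible chat completion to the normalized shape."""
    choices = raw.get("choices") or []
    text = ""
    if choices:
        # Assumed path: choices[0].message.content holds the transcript.
        text = (choices[0].get("message") or {}).get("content") or ""
    return {"text": text, "status": "SUCCEEDED", "raw": raw}


def normalize_async_submission(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map an async submission response to the normalized shape (no text yet)."""
    output = raw.get("output") or {}
    return {
        "text": "",
        "task_id": output.get("task_id"),  # assumed field name
        "status": output.get("task_status", "UNKNOWN"),  # assumed field name
        "raw": raw,
    }
```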
## Quick start (official HTTP API)
Sync transcription (OpenAI-compatible protocol):
```bash
curl -sS --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "qwen3-asr-flash",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_audio",
            "input_audio": {
              "data": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
            }
          }
        ]
      }
    ],
    "stream": false,
    "asr_options": {
      "enable_itn": false
    }
  }'
```
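The same sync request can be built with the Python standard library. This is a sketch under assumptions, not the bundled script's implementation: the payload mirrors the curl example above, `audio_to_data_field` and `build_sync_request` are hypothetical names, and the MIME type of the data URI is guessed from the file extension.

```python
import base64
import json
import mimetypes
import urllib.request
from pathlib import Path

ENDPOINT = "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions"


def audio_to_data_field(audio: str) -> str:
    """Pass URLs through unchanged; encode a local file as a base64 data URI."""
    if audio.startswith(("http://", "https://")):
        return audio
    path = Path(audio)
    # Assumption: the API accepts the MIME type guessed from the extension.
    mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_sync_request(audio: str, api_key: str,
                       model: str = "qwen3-asr-flash") -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [{
                "type": "input_audio",
                "input_audio": {"data": audio_to_data_field(audio)},
            }],
        }],
        "stream": False,
        "asr_options": {"enable_itn": False},
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request is then a matter of `urllib.request.urlopen(request)` and decoding the JSON body.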
Async long-file transcription (DashScope protocol):
```bash
curl -sS --location 'https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header 'X-DashScope-Async: enable' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "qwen3-asr-flash-filetrans",
    "input": {
      "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
    }
  }'
```
Poll task result:
```bash
curl -sS --location "https://dashscope.aliyuncs.com/api/v1/tasks/<task_id>" \
--header "Authorization: Bearer $DASHSCOPE_API_KEY"
```
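The poll request above is typically repeated until the task finishes. A minimal sketch of a bounded polling loop follows. The `fetch` callable stands in for the HTTP call, and the `output.task_status` path and the terminal statuses other than `SUCCEEDED` are assumptions to verify against the API reference. `poll_task` is a hypothetical name.

```python
import time
from typing import Any, Callable, Dict

# Only SUCCEEDED is named in this document; the other values are assumed.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELED"}


def poll_task(fetch: Callable[[], Dict[str, Any]],
              interval_s: float = 5.0,
              max_attempts: int = 60,
              sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call fetch() until the task reaches a terminal status or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        raw = fetch()
        status = (raw.get("output") or {}).get("task_status", "UNKNOWN")
        if status in TERMINAL_STATUSES:
            return raw
        if attempt < max_attempts:
            sleep(interval_s)  # wait before the next poll
    raise TimeoutError(f"task not finished after {max_attempts} polls")
```

Injecting `fetch` and `sleep` keeps the loop testable without network access or real delays.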
## Local helper script
Use the bundled script for URL/local-file input and optional async polling:
```bash
python skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py \
--audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
--model qwen3-asr-flash \
--language-hints zh,en \
--print-response
```
Long-file mode:
```bash
python skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py \
--audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
--model qwen3-asr-flash-filetrans \
--async \
--wait
```
## Operational guidance
- For local files, pass the audio as a data URI in `input_audio.data` when no direct URL is available.
- Keep `language_hints` minimal to reduce recognition ambiguity.
- For async tasks, poll at a 5-20 s interval and cap the number of retries.
- Save normalized outputs under `output/aliyun-qwen-asr/transcripts/`.
## Output location
- Default output: `output/aliyun-qwen-asr/transcripts/`
- Override base dir with `OUTPUT_DIR`.
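A sketch of how the output location might be resolved and used. It assumes `OUTPUT_DIR` replaces the leading `output` directory and that the `aliyun-qwen-asr/transcripts` suffix is kept; the bundled script may interpret the variable differently. `transcripts_dir` and `save_transcript` are hypothetical names.

```python
import json
import os
from pathlib import Path
from typing import Any, Dict


def transcripts_dir() -> Path:
    """Resolve the transcript directory, honoring OUTPUT_DIR when it is set."""
    # Assumption: OUTPUT_DIR replaces only the leading `output` component.
    base = os.environ.get("OUTPUT_DIR", "").strip() or "output"
    return Path(base) / "aliyun-qwen-asr" / "transcripts"


def save_transcript(name: str, result: Dict[str, Any]) -> Path:
    """Write one normalized result as pretty-printed UTF-8 JSON and return its path."""
    out_dir = transcripts_dir()
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{name}.json"
    path.write_text(json.dumps(result, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```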
## Workflow
1) Confirm user intent, region, identifiers, and whether the operation is read-only or mutating.
2) Run one minimal read-only query first to verify connectivity and permissions.
3) Execute the target operation with explicit parameters and bounded scope.
4) Verify results and save output/evidence files.
## References
- `references/api_reference.md`
- `references/sources.md`
- Realtime synthesis is provided by `skills/ai/audio/aliyun-qwen-tts-realtime/`.