media-accessibility

$npx mdskill add Community-Access/accessibility-agents/media-accessibility

Generate accessible media captions and transcripts for WCAG compliance.

  • Ensures video and audio content meets WCAG 1.2.x standards.
  • Supports WebVTT, SRT, and TTML caption file formats.
  • Maps accessibility criteria to specific media requirements.
  • Delivers synchronized captions and audio descriptions.

SKILL.md

.github/skills/media-accessibilityView on GitHub ↗
---
name: media-accessibility
description: Video, audio, and streaming media accessibility reference. WebVTT/SRT/TTML caption formats, audio description patterns, accessible media player ARIA, WCAG 1.2.x criteria mapping, caption quality guidelines, and live captioning integration.
---

# Media Accessibility Skill

Reference data for video, audio, and streaming media accessibility. Used by `media-accessibility` agent and any web agent encountering `<video>` or `<audio>` elements.

---

## WCAG 1.2 Time-Based Media Criteria

| SC | Level | Requirement | Test |
|----|-------|-------------|------|
| 1.2.1 | A | Audio-only/video-only: provide text transcript or audio track | Transcript exists alongside media |
| 1.2.2 | A | Captions for prerecorded audio in video | Synchronized captions present |
| 1.2.3 | A | Audio description OR text alternative for prerecorded video | Description track or full transcript |
| 1.2.4 | AA | Captions for live audio in video | Real-time captioning active |
| 1.2.5 | AA | Audio description for prerecorded video | Description track present |
| 1.2.6 | AAA | Sign language for prerecorded audio | Sign language interpretation video |
| 1.2.7 | AAA | Extended audio description | Paused video for long descriptions |
| 1.2.8 | AAA | Media alternative for prerecorded video | Full text alternative document |
| 1.2.9 | AAA | Audio-only (live): text alternative | Real-time text transcript |

## Caption File Formats

### WebVTT (Web Video Text Tracks)

```text
WEBVTT

00:00:01.000 --> 00:00:04.000
Welcome to the accessibility course.

00:00:04.500 --> 00:00:08.000
Today we'll cover caption best practices.

00:00:08.500 --> 00:00:12.000
<v Speaker 2>Let's start with file formats.
```

**Key rules:**

- File must start with `WEBVTT` header
- Timestamps: `HH:MM:SS.mmm --> HH:MM:SS.mmm`
- Speaker identification: `<v Speaker Name>Text`
- Maximum 2 lines per caption, 32 characters per line
- Minimum display time: 1 second
- Maximum display time: 7 seconds

### SRT (SubRip Text)

```text
1
00:00:01,000 --> 00:00:04,000
Welcome to the accessibility course.

2
00:00:04,500 --> 00:00:08,000
Today we'll cover caption best practices.
```

**Key rules:**

- Sequential numbering starting at 1
- Timestamps use comma for milliseconds (not period)
- Blank line between entries
- No styling support (unlike WebVTT)

### TTML (Timed Text Markup Language)

- XML-based, used in broadcast and streaming
- Supports styling, positioning, and region layout
- IMSC (Internet Media Subtitles and Captions) is the web profile

## Caption Quality Guidelines

| Metric | Requirement |
|--------|------------|
| Accuracy | 99%+ word accuracy |
| Synchronization | Within 1 second of audio |
| Speaker identification | Required when 2+ speakers |
| Sound effects | Describe in brackets: `[applause]`, `[phone rings]` |
| Music | Describe: `[upbeat music]`, `[dramatic orchestral score]` |
| Caption rate | Maximum 200 words per minute (3 words/second) |
| Line length | Maximum 32 characters per line |
| Lines per caption | Maximum 2 lines |
| Placement | Bottom center default; move for on-screen text |

## Audio Description

Audio description narrates visual information during pauses in dialogue.

**Requirements:**

- Describe: actions, scene changes, on-screen text, facial expressions relevant to plot
- Don't describe: obvious audio cues, subjective interpretations
- Timing: fit into natural pauses; for extended descriptions (1.2.7), video pauses automatically
- Voice: distinct from program audio, clear and neutral

**HTML implementation:**

```html
<video controls>
  <source src="video.mp4" type="video/mp4">
  <track kind="captions" src="captions-en.vtt" srclang="en" label="English" default>
  <track kind="descriptions" src="descriptions-en.vtt" srclang="en" label="English Audio Descriptions">
</video>
```

## Accessible Media Player ARIA Patterns

### Minimum Controls

Every media player must have keyboard-accessible controls for:

- Play/Pause (`role="button"`, `aria-label="Play"` / `aria-label="Pause"`)
- Volume (`role="slider"`, `aria-label="Volume"`, `aria-valuemin`, `aria-valuemax`, `aria-valuenow`)
- Seek/Progress (`role="slider"`, `aria-label="Seek"`, `aria-valuetext="2 minutes 30 seconds"`)
- Captions toggle (`role="button"`, `aria-pressed`, `aria-label="Toggle captions"`)
- Fullscreen (`role="button"`, `aria-label="Enter fullscreen"` / `aria-label="Exit fullscreen"`)

### Live Region Announcements

```html
<div aria-live="polite" class="sr-only" id="player-status">
  <!-- Announce: "Playing", "Paused", "Video ended", "Captions on", "Captions off" -->
</div>
```

### Keyboard Shortcuts (Media Player Convention)

| Key | Action |
|-----|--------|
| Space / Enter | Play/Pause |
| Left Arrow | Rewind 5 seconds |
| Right Arrow | Forward 5 seconds |
| Up Arrow | Volume up |
| Down Arrow | Volume down |
| M | Mute/Unmute |
| C | Toggle captions |
| F | Toggle fullscreen |

## Transcript Best Practices

- Provide a full-text transcript adjacent to or linked from the media
- Include speaker identification, timestamps (optional but helpful), and non-speech audio
- Transcripts benefit deaf users, search engines, and users who prefer reading
- Interactive transcripts (click a line to jump to that point) are excellent UX

## Live Captioning

- WCAG 1.2.4 (AA) requires captions for live audio in synchronized media
- Options: human captioner (CART), AI-powered auto-captioning, hybrid
- AI auto-captions alone rarely meet the 99% accuracy requirement — human review or correction is recommended
- WebSocket-based caption delivery for web: push text to a `<div aria-live="polite">` container

## Common Violations

| Issue | WCAG SC | Severity |
|-------|---------|----------|
| No captions on prerecorded video | 1.2.2 | Critical |
| Auto-generated captions without review | 1.2.2 | Serious |
| No audio description for visual-only content | 1.2.5 | Serious |
| Inaccessible media player controls | 2.1.1, 4.1.2 | Critical |
| No keyboard access to play/pause | 2.1.1 | Critical |
| Autoplay without user control | 1.4.2 | Serious |
| Volume slider not keyboard-operable | 2.1.1 | Serious |
| No transcript for audio-only content | 1.2.1 | Critical |
| Captions poorly synchronized (>2s drift) | 1.2.2 | Moderate |
| Missing speaker identification in captions | 1.2.2 | Moderate |

More from Community-Access/accessibility-agents

SkillDescription
Accessibility LeadAccessibility team lead and orchestrator. Use on EVERY task that involves web UI code, HTML, JSX, CSS, React components, web pages, or any user-facing web content. This agent coordinates the accessibility specialist team and ensures no accessibility requirement is missed. Runs the final review before any UI code is considered complete. Applies to any web framework or vanilla HTML/CSS/JS.
Accessibility Regression DetectorDetects accessibility regressions by comparing audit results across commits/branches. Tracks score trends and validates previous fixes.
Accessibility Statement GeneratorGenerates conformance/accessibility statements following W3C or EU model templates. Maps audit results to conformance claims, known limitations, and contact information.
Accessibility Tool BuilderExpert in building accessibility scanning tools, rule engines, document parsers, report generators, and audit automation. WCAG criterion mapping, severity scoring, CLI/GUI scanner architecture, CI/CD integration.
Accessibility TrackerTrack accessibility improvements across VS Code and any configured repos -- get summaries, deep dives, workspace reports, WCAG cross-references, and proactive alerts on a11y changes.
accessibility-rulesCross-format document accessibility rule reference with WCAG 2.2 mapping. Use when looking up accessibility rules for Word (DOCX-*), Excel (XLSX-*), PowerPoint (PPTX-*), or PDF (PDFUA.*, PDFBP.*, PDFQ.*) documents, or when mapping findings to WCAG success criteria for compliance reporting.
Actions ManagerGitHub Actions command center -- view workflow runs, read logs, re-run failed jobs, manage workflows, and debug CI failures entirely from the editor. Bypasses the deeply nested, visually-dependent Actions UI that is largely inaccessible to screen readers.
Alt Text & HeadingsAlternative text and heading structure specialist for web applications. Use when building or reviewing any page with images, icons, SVGs, videos, figures, charts, or heading hierarchies. Covers meaningful vs decorative images, complex image descriptions, heading levels, document outline, and landmark structure. Can analyze images visually, compare existing alt text against image content, and interactively suggest appropriate alternatives. Applies to any web framework or vanilla HTML/CSS/JS.
Analytics & InsightsYour GitHub analytics command center -- team velocity, review turnaround, issue resolution metrics, contribution activity, bottleneck detection, and code churn analysis with dual markdown + HTML reports.
ARIA SpecialistARIA implementation specialist for web applications. Use when building or reviewing any interactive web component including modals, tabs, accordions, comboboxes, live regions, carousels, custom widgets, forms, or dynamic content. Also use when reviewing ARIA usage for correctness. Applies to any web framework or vanilla HTML/CSS/JS.