media-accessibility
$
npx mdskill add Community-Access/accessibility-agents/media-accessibilityGenerate accessible media captions and transcripts for WCAG compliance.
- Ensures video and audio content meets WCAG 1.2.x standards.
- Supports WebVTT, SRT, and TTML caption file formats.
- Maps accessibility criteria to specific media requirements.
- Delivers synchronized captions and audio descriptions.
SKILL.md
.github/skills/media-accessibilityView on GitHub ↗
--- name: media-accessibility description: Video, audio, and streaming media accessibility reference. WebVTT/SRT/TTML caption formats, audio description patterns, accessible media player ARIA, WCAG 1.2.x criteria mapping, caption quality guidelines, and live captioning integration. --- # Media Accessibility Skill Reference data for video, audio, and streaming media accessibility. Used by `media-accessibility` agent and any web agent encountering `<video>` or `<audio>` elements. --- ## WCAG 1.2 Time-Based Media Criteria | SC | Level | Requirement | Test | |----|-------|-------------|------| | 1.2.1 | A | Audio-only/video-only: provide text transcript or audio track | Transcript exists alongside media | | 1.2.2 | A | Captions for prerecorded audio in video | Synchronized captions present | | 1.2.3 | A | Audio description OR text alternative for prerecorded video | Description track or full transcript | | 1.2.4 | AA | Captions for live audio in video | Real-time captioning active | | 1.2.5 | AA | Audio description for prerecorded video | Description track present | | 1.2.6 | AAA | Sign language for prerecorded audio | Sign language interpretation video | | 1.2.7 | AAA | Extended audio description | Paused video for long descriptions | | 1.2.8 | AAA | Media alternative for prerecorded video | Full text alternative document | | 1.2.9 | AAA | Audio-only (live): text alternative | Real-time text transcript | ## Caption File Formats ### WebVTT (Web Video Text Tracks) ```text WEBVTT 00:00:01.000 --> 00:00:04.000 Welcome to the accessibility course. 00:00:04.500 --> 00:00:08.000 Today we'll cover caption best practices. 00:00:08.500 --> 00:00:12.000 <v Speaker 2>Let's start with file formats. ``` **Key rules:** - File must start with `WEBVTT` header - Timestamps: `HH:MM:SS.mmm --> HH:MM:SS.mmm` - Speaker identification: `<v Speaker Name>Text` - Maximum 2 lines per caption, 32 characters per line - Minimum display time: 1 second - Maximum display time: 7 seconds ### SRT (SubRip Text) ```text 1 00:00:01,000 --> 00:00:04,000 Welcome to the accessibility course. 2 00:00:04,500 --> 00:00:08,000 Today we'll cover caption best practices. ``` **Key rules:** - Sequential numbering starting at 1 - Timestamps use comma for milliseconds (not period) - Blank line between entries - No styling support (unlike WebVTT) ### TTML (Timed Text Markup Language) - XML-based, used in broadcast and streaming - Supports styling, positioning, and region layout - IMSC (Internet Media Subtitles and Captions) is the web profile ## Caption Quality Guidelines | Metric | Requirement | |--------|------------| | Accuracy | 99%+ word accuracy | | Synchronization | Within 1 second of audio | | Speaker identification | Required when 2+ speakers | | Sound effects | Describe in brackets: `[applause]`, `[phone rings]` | | Music | Describe: `[upbeat music]`, `[dramatic orchestral score]` | | Caption rate | Maximum 200 words per minute (3 words/second) | | Line length | Maximum 32 characters per line | | Lines per caption | Maximum 2 lines | | Placement | Bottom center default; move for on-screen text | ## Audio Description Audio description narrates visual information during pauses in dialogue. **Requirements:** - Describe: actions, scene changes, on-screen text, facial expressions relevant to plot - Don't describe: obvious audio cues, subjective interpretations - Timing: fit into natural pauses; for extended descriptions (1.2.7), video pauses automatically - Voice: distinct from program audio, clear and neutral **HTML implementation:** ```html <video controls> <source src="video.mp4" type="video/mp4"> <track kind="captions" src="captions-en.vtt" srclang="en" label="English" default> <track kind="descriptions" src="descriptions-en.vtt" srclang="en" label="English Audio Descriptions"> </video> ``` ## Accessible Media Player ARIA Patterns ### Minimum Controls Every media player must have keyboard-accessible controls for: - Play/Pause (`role="button"`, `aria-label="Play"` / `aria-label="Pause"`) - Volume (`role="slider"`, `aria-label="Volume"`, `aria-valuemin`, `aria-valuemax`, `aria-valuenow`) - Seek/Progress (`role="slider"`, `aria-label="Seek"`, `aria-valuetext="2 minutes 30 seconds"`) - Captions toggle (`role="button"`, `aria-pressed`, `aria-label="Toggle captions"`) - Fullscreen (`role="button"`, `aria-label="Enter fullscreen"` / `aria-label="Exit fullscreen"`) ### Live Region Announcements ```html <div aria-live="polite" class="sr-only" id="player-status"> <!-- Announce: "Playing", "Paused", "Video ended", "Captions on", "Captions off" --> </div> ``` ### Keyboard Shortcuts (Media Player Convention) | Key | Action | |-----|--------| | Space / Enter | Play/Pause | | Left Arrow | Rewind 5 seconds | | Right Arrow | Forward 5 seconds | | Up Arrow | Volume up | | Down Arrow | Volume down | | M | Mute/Unmute | | C | Toggle captions | | F | Toggle fullscreen | ## Transcript Best Practices - Provide a full-text transcript adjacent to or linked from the media - Include speaker identification, timestamps (optional but helpful), and non-speech audio - Transcripts benefit deaf users, search engines, and users who prefer reading - Interactive transcripts (click a line to jump to that point) are excellent UX ## Live Captioning - WCAG 1.2.4 (AA) requires captions for live audio in synchronized media - Options: human captioner (CART), AI-powered auto-captioning, hybrid - AI auto-captions alone rarely meet the 99% accuracy requirement — human review or correction is recommended - WebSocket-based caption delivery for web: push text to a `<div aria-live="polite">` container ## Common Violations | Issue | WCAG SC | Severity | |-------|---------|----------| | No captions on prerecorded video | 1.2.2 | Critical | | Auto-generated captions without review | 1.2.2 | Serious | | No audio description for visual-only content | 1.2.5 | Serious | | Inaccessible media player controls | 2.1.1, 4.1.2 | Critical | | No keyboard access to play/pause | 2.1.1 | Critical | | Autoplay without user control | 1.4.2 | Serious | | Volume slider not keyboard-operable | 2.1.1 | Serious | | No transcript for audio-only content | 1.2.1 | Critical | | Captions poorly synchronized (>2s drift) | 1.2.2 | Moderate | | Missing speaker identification in captions | 1.2.2 | Moderate |