video-ingest
$ npx mdskill add joelhooks/joelclaw/video-ingest
Downloads, transcribes, and summarizes videos from URLs via a durable Inngest pipeline.
- Handles video ingestion tasks like saving YouTube videos or processing multiple URLs in batch.
- Integrates with Inngest for workflow orchestration, yt-dlp for downloads, and mlx-whisper for transcription.
- Triggers automatically based on user requests to grab, transcribe, or ingest video content.
- Delivers results by creating vault notes with enriched summaries after processing.
SKILL.md
.github/skills/video-ingest
---
name: video-ingest
description: "Download, transcribe, and summarize videos via the Inngest pipeline. Use when the user asks to grab/download/transcribe/ingest a video, save a YouTube video, or process any video URL. Also handles batch ingest of multiple URLs. This skill triggers the durable Inngest workflow — do NOT run yt-dlp, mlx-whisper, or scp manually."
---
# Video Ingest — via Inngest Pipeline
Videos are ingested through the Inngest event bus. **Do not run yt-dlp, mlx-whisper, scp, or create vault notes manually.** The pipeline handles everything: download → NAS transfer → transcription → vault note → summary enrichment.
## Quick Start
```bash
joelclaw send pipeline/video.download -d '{"url":"URL_HERE"}'
```
That's it. The event chain handles the rest.
Alternative (raw curl):
```bash
curl -s -X POST "http://localhost:8288/e/37aa349b89692d657d276a40e0e47a15" \
-H "Content-Type: application/json" \
-d '{"name":"pipeline/video.download","data":{"url":"URL_HERE"}}'
```
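For scripted use, the raw curl call above can be sketched in Python with only the standard library. This is a minimal sketch, not the project's own client: `build_event` and `send_event` are hypothetical names, and the endpoint (including the event key) must be copied from your own setup. The URL shape `http://localhost:8288/e/<event-key>` is taken from the curl example.

```python
import json
import urllib.request


def build_event(url: str) -> dict:
    """Build the event body used by the curl example (name plus data.url)."""
    return {"name": "pipeline/video.download", "data": {"url": url}}


def send_event(event: dict, endpoint: str) -> int:
    """POST the event as JSON and return the HTTP status code.

    `endpoint` is the full event URL, e.g. http://localhost:8288/e/<event-key>.
    """
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Building the payload has no side effects; send_event is only called when you choose to.
event = build_event("URL_HERE")
print(json.dumps(event))
```

Where the `joelclaw` CLI is available it remains the simpler route; a script like this only makes sense when the CLI cannot be used.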
## Pipeline Flow
```
pipeline/video.download — you send this
↓
video-download function — yt-dlp → /tmp → NAS transfer
↓ emits
pipeline/video.downloaded — logged by system-logger
pipeline/transcript.process — auto-triggered
↓
transcript-process function — mlx-whisper (M4 Pro, ~5min/hr of video)
↓ emits
pipeline/transcript.processed — logged
content/summarize — auto-triggered
↓
content-summarize function — pi enrichment → vault note with summary
↓ emits
content/summarized — logged, done
```
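The same chain can be written down as a small lookup table, which makes it easy to see which functions run for a given entry event. This is an illustrative model transcribed from the diagram above, not code from the worker; the `PIPELINE` table and `trace` helper are hypothetical names.

```python
# Event -> (function that handles it, events that function emits),
# transcribed from the flow diagram above.
PIPELINE = {
    "pipeline/video.download": (
        "video-download",
        ["pipeline/video.downloaded", "pipeline/transcript.process"],
    ),
    "pipeline/transcript.process": (
        "transcript-process",
        ["pipeline/transcript.processed", "content/summarize"],
    ),
    "content/summarize": ("content-summarize", ["content/summarized"]),
}


def trace(event: str) -> list[str]:
    """Return the functions that run, in order, after `event` is sent."""
    functions = []
    pending = [event]
    while pending:
        current = pending.pop(0)
        if current in PIPELINE:  # events without a handler here are only logged
            function, emitted = PIPELINE[current]
            functions.append(function)
            pending.extend(emitted)
    return functions


print(trace("pipeline/video.download"))
print(trace("content/summarize"))
```

Entering the chain at a later event, such as `pipeline/transcript.process` or `content/summarize`, skips the earlier stages.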
## Before Sending: Health Check
Always verify the pipeline is healthy before sending events:
```bash
# Inngest server
curl -s http://localhost:8288/health
# Worker (should show functions including video-download)
curl -s http://localhost:3111/ | python3 -c "
import json,sys
d=json.load(sys.stdin)
fns = [f.get('id','?') for f in d.get('functions',[])]
names = ', '.join(fns)
print(f'Worker OK: {len(fns)} functions: {names}')
"
# Docker container running
docker ps --filter ancestor=inngest/inngest --format "table {{.Status}}\t{{.Ports}}"
```
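If the worker check needs to live in a script rather than a one-liner, the parsing step can be sketched as below. It assumes the same response shape the inline check relies on (a top-level `functions` list whose items carry an `id`). The sample body is made up for illustration, and matching on the substring `video-download` is an assumption about how the function id is spelled.

```python
import json


def function_ids(worker_response: str) -> list[str]:
    """Extract function ids from the worker's JSON body."""
    payload = json.loads(worker_response)
    return [fn.get("id", "?") for fn in payload.get("functions", [])]


def has_video_download(ids: list[str]) -> bool:
    """True if any registered function id mentions video-download."""
    return any("video-download" in fn_id for fn_id in ids)


# Hypothetical response body, for illustration only.
sample = '{"functions": [{"id": "video-download"}, {"id": "transcript-process"}]}'
ids = function_ids(sample)
print(f"Worker OK: {len(ids)} functions: {', '.join(ids)}")
print("video-download registered:", has_video_download(ids))
```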
If the worker is down:
```bash
kubectl -n joelclaw rollout restart deployment/system-bus-worker
kubectl -n joelclaw rollout status deployment/system-bus-worker --timeout=180s
joelclaw refresh
```
## Monitoring a Run
### Watch progress in real-time
```bash
# Worker logs — shows step execution + failures
kubectl logs -n joelclaw deploy/system-bus-worker -f
# Docker logs — shows event dispatch
docker logs -f $(docker ps -q --filter ancestor=inngest/inngest) 2>&1 | grep -v DEBUG
```
### Check if events fired
```bash
# Look for the video's events in Docker logs
docker logs $(docker ps -q --filter ancestor=inngest/inngest) 2>&1 | grep -i "video\|transcript\|summarize" | tail -20
```
### Dashboard
Open http://localhost:8288 in a browser to see functions, events, and runs with per-step traces.
### Verify completion
```bash
# Check if vault note was created
ls -la ~/Vault/Resources/videos/*SLUG*
# Check system log for pipeline entries
tail -10 ~/Vault/system/system-log.jsonl | grep -i video
```
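The `tail | grep -i` check above can be mirrored in Python when the result feeds into another script. This sketch treats each log entry as an opaque line of text, because the schema of `system-log.jsonl` is not shown here; the sample lines are invented for illustration.

```python
from collections import deque


def recent_matches(lines, needle="video", last=10):
    """Keep the last `last` lines, then return those containing `needle`, ignoring case."""
    tail = deque(lines, maxlen=last)
    return [line for line in tail if needle.lower() in line.lower()]


# Hypothetical log lines; the real entries may look different.
sample = [
    '{"action": "download", "tool": "video-download"}',
    '{"action": "note", "tool": "other"}',
    '{"action": "summarize", "detail": "Video summary written"}',
]
for line in recent_matches(sample):
    print(line)
```

To run it against the real log, pass an open file handle for the system log file in place of `sample`.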
## Batch Ingest
Send multiple events. Inngest queues and processes them with concurrency control:
```bash
joelclaw send pipeline/video.download -d '{"url":"https://youtube.com/watch?v=XXXX"}'
joelclaw send pipeline/video.download -d '{"url":"https://youtube.com/watch?v=YYYY"}'
joelclaw send pipeline/video.download -d '{"url":"https://youtube.com/watch?v=ZZZZ"}'
```
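For longer URL lists, the command lines above can be generated rather than typed. The sketch below only builds the strings and does not execute anything; `batch_commands` is a hypothetical helper, and it uses no `joelclaw` flags beyond the `send <event> -d <json>` form shown above.

```python
import json
import shlex


def batch_commands(urls):
    """Return one shell-quoted `joelclaw send` command line per URL."""
    return [
        shlex.join(
            ["joelclaw", "send", "pipeline/video.download", "-d", json.dumps({"url": url})]
        )
        for url in urls
    ]


commands = batch_commands(
    [
        "https://youtube.com/watch?v=XXXX",
        "https://youtube.com/watch?v=YYYY",
    ]
)
for command in commands:
    print(command)
```

Using `json.dumps` with `shlex.join` keeps URLs that contain quotes or ampersands from breaking the shell command.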
## Manual Transcript (Non-YouTube)
For audio files already on disk, or raw text from Granola/Fathom:
```bash
# From audio file
joelclaw send pipeline/transcript.process -d '{"source":"manual","audioPath":"/path/to/audio.mp4","title":"Recording Title","slug":"recording-title"}'
# From raw text (Granola, Fathom, etc.)
joelclaw send pipeline/transcript.process -d '{"source":"granola","text":"transcript text...","title":"Meeting Title","slug":"meeting-title"}'
```
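The two payload shapes above differ only in whether they carry `audioPath` or `text`. A small builder can keep them consistent. Treat this as a sketch under assumptions: the `slugify` rule (lowercase, hyphen-separated) is inferred from the examples, and rejecting payloads that carry both or neither field is a choice made here, not documented pipeline behavior.

```python
import json
import re


def slugify(title: str) -> str:
    """Lowercase the title and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def transcript_data(source, title, audio_path=None, text=None):
    """Build the `data` object for a pipeline/transcript.process event."""
    if (audio_path is None) == (text is None):
        raise ValueError("provide exactly one of audio_path or text")
    data = {"source": source, "title": title, "slug": slugify(title)}
    if audio_path is not None:
        data["audioPath"] = audio_path
    else:
        data["text"] = text
    return data


print(json.dumps(transcript_data("manual", "Recording Title", audio_path="/path/to/audio.mp4")))
print(json.dumps(transcript_data("granola", "Meeting Title", text="transcript text...")))
```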
## Re-run Summary Only
If the vault note exists but needs a better summary:
```bash
joelclaw send content/summarize -d '{"vaultPath":"/Users/joel/Vault/Resources/videos/SLUG.md"}'
```
## Options
| Field | Default | Description |
|-------|---------|-------------|
| `url` | required | YouTube or video URL |
| `maxQuality` | `"1080"` | Max video resolution: `"720"`, `"1080"`, `"4k"` |
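
A caller can check these fields before sending the event. The sketch below assumes the three values listed in the table are the only accepted ones, which the table does not state, so relax the check if the pipeline accepts others. `download_data` is a hypothetical helper name.

```python
ALLOWED_QUALITIES = ("720", "1080", "4k")  # values listed in the Options table


def download_data(url: str, max_quality: str = "1080") -> dict:
    """Build the `data` object for pipeline/video.download with basic validation."""
    if not url:
        raise ValueError("url is required")
    if max_quality not in ALLOWED_QUALITIES:
        raise ValueError(f"maxQuality must be one of {ALLOWED_QUALITIES}, got {max_quality!r}")
    return {"url": url, "maxQuality": max_quality}


print(download_data("https://youtube.com/watch?v=XXXX"))
print(download_data("https://youtube.com/watch?v=XXXX", max_quality="720"))
```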
## Where Things End Up
| What | Location |
|------|----------|
| Video + metadata | NAS: `/volume1/home/joel/video/YYYY/SLUG/` |
| Vault note | `~/Vault/Resources/videos/SLUG.md` |
| Daily note link | `~/Vault/Daily/YYYY-MM-DD.md` under `## Videos` |
| System log entry | `~/Vault/system/system-log.jsonl` |
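
Given a slug and a date, the locations in the table can be computed up front, for example to check for outputs after a run. This is a sketch built only from the path templates above. It assumes `YYYY` is the year of the ingest date and that the daily note uses that same date, neither of which is confirmed here.

```python
from datetime import date


def expected_locations(slug: str, day: date) -> dict:
    """Fill in the path templates from the table for one video."""
    return {
        "nas": f"/volume1/home/joel/video/{day.year}/{slug}/",
        "vault_note": f"~/Vault/Resources/videos/{slug}.md",
        "daily_note": f"~/Vault/Daily/{day.isoformat()}.md",
        "system_log": "~/Vault/system/system-log.jsonl",
    }


for name, path in expected_locations("recording-title", date(2025, 1, 2)).items():
    print(f"{name}: {path}")
```

The `~` prefix is left unexpanded, so call `os.path.expanduser` on a path before checking whether it exists.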
## Troubleshooting
If events are accepted (200 OK) but nothing happens:
1. **Check Docker→worker connectivity** (the most common issue):
   ```bash
   docker logs $(docker ps -q --filter ancestor=inngest/inngest) 2>&1 | grep -E "ERROR|Unable" | tail -5
   ```
   If you see "Unable to reach SDK URL" → see the inngest skill's serveHost gotcha.
2. **Check worker is actually running**:
   ```bash
   kubectl get deploy -n joelclaw system-bus-worker
   kubectl get pods -n joelclaw -l app=system-bus-worker
   ```
3. **Check worker errors for the specific function**:
   ```bash
   kubectl logs -n joelclaw deploy/system-bus-worker --tail=80
   ```
4. **Use inngest-debug skill** for deep inspection of specific run IDs via GraphQL.
## What NOT to Do
- ❌ Don't run `yt-dlp` directly — the pipeline handles download + NAS transfer
- ❌ Don't run `mlx_whisper` directly — the pipeline handles transcription
- ❌ Don't `scp` to NAS manually — the pipeline handles transfer
- ❌ Don't create vault notes manually — the pipeline creates them with proper frontmatter
- ❌ Don't use codex/background tasks for video processing — Inngest is durable and has retries