dump-channel
$
npx mdskill add terrylica/cc-skills/dump-channelArchive a complete Telegram channel/group/chat to NDJSON + downloaded media files.
SKILL.md
.github/skills/dump-channelView on GitHub ↗
---
name: dump-channel
description: user wants to archive, dump, or back up an entire Telegram channel or chat history to NDJSON with all media files downloaded. Full history.
allowed-tools: Bash, Read, Grep, Glob
---
# Dump Telegram Channel History
Archive a complete Telegram channel/group/chat to NDJSON + downloaded media files.
> **Self-Evolving Skill**: This skill improves through use. If instructions are wrong, parameters drifted, or a workaround was needed — fix this file immediately, don't defer. Only update for real, reproducible issues.
## Preflight
1. Session must exist: `~/.local/share/telethon/<profile>.session`
- If missing, run `/tlg:setup` first
2. User must be subscribed to (or a member of) the target channel/chat
## Usage
```bash
/usr/bin/env bash << 'EOF'
SCRIPT="${CLAUDE_PLUGIN_ROOT:-$HOME/.claude/plugins/marketplaces/cc-skills/plugins/tlg}/scripts/tg-cli.py"
# Full dump: NDJSON + all media (photos, videos, documents)
uv run --python 3.14 "$SCRIPT" dump @ChannelName ./output/ChannelName
# NDJSON only (skip media downloads — much faster)
uv run --python 3.14 "$SCRIPT" dump @ChannelName ./output/ChannelName --no-media
# Dump by numeric chat ID
uv run --python 3.14 "$SCRIPT" dump -1001234567890 ./output/MyChannel
# Use a different profile
uv run --python 3.14 "$SCRIPT" -p missterryli dump @ChannelName ./output/ChannelName
EOF
```
## Parameters
| Parameter | Type | Description |
| ------------ | ---------- | ---------------------------------------------------------- |
| chat | string/int | Channel username (@name) or numeric chat ID |
| output | path | Output directory (messages.ndjson + media/ created inside) |
| `--no-media` | flag | Skip media downloads, produce NDJSON only |
## Output Structure
```
output/ChannelName/
├── messages.ndjson ← one JSON object per line, chronological (oldest first)
└── media/
├── 6.jpg ← named by message ID for cross-referencing
├── 12.png
├── 45.mp4
└── ...
```
## NDJSON Record Schema
Each line is a JSON object with these fields:
| Field | Type | Description |
| ----------------- | ----------- | ----------------------------------------------- |
| `id` | int | Telegram message ID |
| `date` | string | ISO 8601 timestamp with timezone |
| `text` | string/null | Full message text (no truncation) |
| `has_media` | bool | Whether message contains media |
| `media_type` | string/null | Telethon class name (MessageMediaPhoto, etc.) |
| `media_file` | string/null | Filename in media/ dir (e.g., "6.jpg") |
| `views` | int/null | View count (channels only) |
| `forwards` | int/null | Forward count |
| `reply_to_msg_id` | int/null | Parent message ID if reply |
| `grouped_id` | int/null | Album group ID (shared across album messages) |
| `edit_date` | string/null | ISO 8601 timestamp of last edit |
| `sender.id` | int | Sender's Telegram user/channel ID |
| `sender.name` | string | Display name (channel title or user first name) |
| `sender.username` | string/null | @username if set |
## Resume Support
Re-running the same command skips already-downloaded media files (checks `dest.exists()`). The NDJSON is fully rewritten each run. This makes it safe to resume interrupted downloads.
## Querying the Output
```bash
# jq: find all GOLD BUY signals with chart screenshots
jq 'select(.text != null and (.text | test("GOLD.*BUY")) and .media_file != null)' messages.ndjson
# DuckDB: aggregate by date
duckdb -c "SELECT date::DATE as day, count(*) FROM read_ndjson('messages.ndjson') GROUP BY day ORDER BY day"
# Python/Polars
import polars as pl
df = pl.read_ndjson("messages.ndjson")
```
## Performance Notes
- ~3000 messages + 1700 media files takes ~3-5 minutes
- Telegram may briefly disconnect mid-download (`Server closed the connection`) — Telethon auto-reconnects
- For very large channels (10k+ messages), expect 10-15 minutes with media
## Recommended Storage Pattern
For git-tracked projects, gitignore the media folder:
```gitignore
# data/telegram/.gitignore
*/media/
```
This keeps the NDJSON (metadata) in version control while keeping large media files local-only.
## Anti-Patterns
- **Don't dump channels you're not subscribed to** — Telethon needs access via your account
- **Don't run multiple dumps concurrently on the same profile** — session file contention
## Post-Execution Reflection
After this skill completes, check before closing:
1. **Did the command succeed?** — If not, fix the instruction or error table that caused the failure.
2. **Did parameters or output change?** — If tg-cli.py's interface drifted, update Usage examples and Parameters table to match.
3. **Was a workaround needed?** — If you had to improvise (different flags, extra steps), update this SKILL.md so the next invocation doesn't need the same workaround.
Only update if the issue is real and reproducible — not speculative.
More from terrylica/cc-skills
- academic-pdf-to-gfmConvert academic PDF papers to GitHub-renderable GFM markdown with math equations. TRIGGERS - PDF, GitHub markdown, math
- adaptive-wfo-epochAdaptive epoch selection for Walk-Forward Optimization. TRIGGERS - WFO epoch, epoch selection, WFE optimization, overfitting epochs.
- adr-code-traceabilityAdd ADR references to code for traceability. TRIGGERS - ADR traceability, code reference, document decision in code.
- adr-graph-easy-architectASCII architecture diagrams for ADRs via graph-easy. TRIGGERS - ADR diagram, architecture diagram, ASCII diagram.
- agent-reach>
- agentic-process-monitorMonitor background processes from Claude Code using sentinel files, heartbeat liveness, and subagent polling. Best practices and.
- alpha-forge-preshipAlpha Forge quality gates for PR review - RNG determinism, URL validation, parameter validation, manifest sync.
- article-extractorExtract MQL5 articles and documentation. TRIGGERS - MQL5 articles, MetaTrader docs, mql5.com resources.
- ascii-diagram-validatorValidate ASCII diagram alignment in markdown. TRIGGERS - diagram alignment, ASCII art, box-drawing diagrams.
- asciinema-analyzerSemantic analysis of asciinema recordings. TRIGGERS - analyze cast, keyword extraction, find patterns in recordings.