mapping-codebases

$ npx mdskill add oaustegard/claude-skills/mapping-codebases

Map unfamiliar codebases instantly using AST analysis.

  • Creates navigable maps showing functions, classes, and line numbers.
  • Depends on tree-sitter parsers for 11 programming languages.
  • Triggers on requests such as "map this codebase" or "explore repo".
  • Outputs _MAP.md files per directory for quick structure review.

---
name: mapping-codebases
description: Generate navigable code maps for unfamiliar codebases. Extracts exports/imports via AST (tree-sitter) to create _MAP.md files per directory showing classes, functions, methods with signatures and line numbers. Use when exploring repositories, understanding project structure, analyzing unfamiliar code, or before modifications. Triggers on "map this codebase", "explore repo", "understand structure", "what does this project contain", or when starting work on an unfamiliar repository.
metadata:
  version: 0.8.0
---

# Mapping Codebases

Generate `_MAP.md` files providing hierarchical code structure views. Maps show function signatures, class methods, imports, and line numbers—enabling API understanding without reading full source files.

## Installation

```bash
uv venv /home/claude/.venv
uv pip install tree-sitter-language-pack --python /home/claude/.venv/bin/python
```

**Bundled parsers**: The skill includes pre-built parsers for all 11 supported languages (Python, JavaScript, TypeScript, TSX, Go, Rust, Ruby, Java, C, HTML, Markdown) in `parsers/`. These are used automatically when the tree-sitter-language-pack runtime download fails (e.g., in proxied environments). No network access needed for supported languages.

## Generate Maps

```bash
/home/claude/.venv/bin/python /mnt/skills/user/mapping-codebases/scripts/codemap.py /path/to/repo --skip tests,.github,locale
```

Common skip patterns: `tests,.github,.husky,locale,migrations,__snapshots__,coverage,target,docs`
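
When the map step is driven from a script rather than typed by hand, the command can be assembled programmatically. The following is a minimal sketch, not part of the skill: `build_codemap_command` is a hypothetical helper, and the two paths are copied from the commands in this document, so adjust them for other environments.

```python
import shlex

# Paths copied from the commands in this document (assumed environment).
PYTHON = "/home/claude/.venv/bin/python"
CODEMAP = "/mnt/skills/user/mapping-codebases/scripts/codemap.py"


def build_codemap_command(repo, skip=()):
    """Return the argv list for one codemap.py run (hypothetical helper)."""
    cmd = [PYTHON, CODEMAP, str(repo)]
    if skip:
        # --skip takes one comma-separated value, as in the example above.
        cmd += ["--skip", ",".join(skip)]
    return cmd


if __name__ == "__main__":
    cmd = build_codemap_command("/path/to/repo", ["tests", ".github", "locale"])
    print(shlex.join(cmd))
    # To execute it: subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell quoting problems when a repository path contains spaces.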

## Navigate Via Maps

After generating maps, use them for navigation—read `_MAP.md` files, not source files directly.

**Workflow:**
1. Read root `_MAP.md` for high-level structure
2. Follow subdirectory links to drill into relevant areas
3. Use function signatures and class methods to understand APIs
4. Read full source only when implementation details are needed

**Maps reveal without reading source:**
- All public functions with signatures: `def record_attempt(subtask_id, session, success, approach, error=None)` :200
- Class structure with methods: `RecoveryManager` → `classify_failure()`, `is_circular_fix()`, `rollback_to_commit()`
- Import relationships showing module dependencies
- Line numbers for direct navigation

**Anti-pattern:** Reading files directory-by-directory. Use maps to find what you need, then read only the specific file/lines required.
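
The map-first workflow can also be scripted. The sketch below is an illustration under assumptions rather than something the skill ships: `find_in_maps` is a hypothetical helper that searches every generated `_MAP.md` under a repository for a symbol name, so that only the matching source file needs to be opened afterwards.

```python
from pathlib import Path


def find_in_maps(repo_root, needle):
    """Yield (map_path, line_number, text) for each map line containing needle."""
    for map_file in sorted(Path(repo_root).rglob("_MAP.md")):
        lines = map_file.read_text(encoding="utf-8").splitlines()
        for number, text in enumerate(lines, start=1):
            if needle in text:
                yield map_file, number, text.strip()


if __name__ == "__main__":
    for path, number, text in find_in_maps("/path/to/repo", "RecoveryManager"):
        print(f"{path}:{number}: {text}")
```

Each hit names the directory map that mentions the symbol; the line range in the matched entry then points at the part of the source file worth reading.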

## Map Contents

Each `_MAP.md` includes:
- Directory statistics (file/subdirectory counts)
- Subdirectory links for hierarchical navigation
- Per-file: imports preview, symbol hierarchy with (C)lass/(m)ethod/(f)unction markers
- Function signatures (Python, TypeScript, Go, Rust, Ruby, C)
- Line ranges (`:42-85` format) showing symbol start and end lines
- Doc comments — first-line summaries from docstrings, JSDoc, Doxygen, `///`, `#` comments
- Constants and defines (`pub const`, `#define`, `export const`, Go `const`)
- Enum variants as children of enum symbols (C)
- Markdown files: h1/h2 heading ToC
- Other files section (JSON, YAML, configs)

Example:
```markdown
# services/
*Files: 4 | Subdirectories: 0*

## Files

### recovery.py
> Imports: `json, subprocess, dataclasses, datetime, enum`...
- **FailureType** (C) :24-38
- **RecoveryManager** (C) :43-510 — Manages error recovery strategies.
  - **__init__** (m) `(self, spec_dir: Path, project_dir: Path)` :55-72
  - **classify_failure** (m) `(self, error: str, subtask_id: str)` :137-198 — Classify an error into a FailureType.
  - **record_attempt** (m) `(subtask_id, session, success, approach, error=None)` :200-240
  - **is_circular_fix** (m) `(self, subtask_id: str, current_approach: str)` :242-280
  - **get_recovery_hints** (m) `(self, subtask_id: str)` :495-510
```
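
Symbol entries like the ones above are regular enough to parse with a regular expression. The following is a rough sketch that assumes every entry follows the layout shown in the example (bold name, one-letter kind marker, optional backticked signature, line or line range, optional summary after a dash). `parse_symbol` and `SYMBOL_RE` are hypothetical names, and the real generator may emit variations this pattern does not cover.

```python
import re

# Layout assumed from the example: - **name** (K) `signature` :start-end DASH summary
# The dash before the summary is U+2014, written as an escape here.
SYMBOL_RE = re.compile(
    r"^(?P<indent>\s*)- \*\*(?P<name>.+?)\*\* \((?P<kind>[A-Za-z])\)"
    r"(?: `(?P<signature>[^`]*)`)?"
    r" :(?P<start>\d+)(?:-(?P<end>\d+))?"
    r"(?: \u2014 (?P<summary>.*))?$"
)


def parse_symbol(line):
    """Parse one symbol entry into a dict, or return None if it does not match."""
    match = SYMBOL_RE.match(line)
    if match is None:
        return None
    start = int(match["start"])
    # A bare ":200" has no end; treat it as a single-line range.
    end = int(match["end"]) if match["end"] else start
    return {
        "name": match["name"],
        "kind": match["kind"],
        "signature": match["signature"],
        "start": start,
        "end": end,
        "summary": match["summary"],
        "depth": len(match["indent"]) // 2,  # two spaces per nesting level
    }
```

With the parsed `start` and `end`, a caller can read just those lines of the source file instead of the whole module.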

## Commands

```bash
# Generate maps
/home/claude/.venv/bin/python /mnt/skills/user/mapping-codebases/scripts/codemap.py /path/to/repo

# Skip directories
/home/claude/.venv/bin/python /mnt/skills/user/mapping-codebases/scripts/codemap.py /path/to/repo --skip tests,.github

# Clean maps
/home/claude/.venv/bin/python /mnt/skills/user/mapping-codebases/scripts/codemap.py /path/to/repo --clean

# Dry run
/home/claude/.venv/bin/python /mnt/skills/user/mapping-codebases/scripts/codemap.py /path/to/repo -n

# Verbose (debug output)
/home/claude/.venv/bin/python /mnt/skills/user/mapping-codebases/scripts/codemap.py /path/to/repo -v
```

Default skips: `.git`, `node_modules`, `__pycache__`, `.venv`, `venv`, `dist`, `build`, `.next`
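
To make the effect of skipping concrete, here is a small sketch of how default and user-supplied skips could be combined during a directory walk. It assumes skips match directory names exactly; how `codemap.py` actually matches them is not described here. `iter_directories` is a hypothetical helper.

```python
import os

# Default skips as listed in this document.
DEFAULT_SKIPS = {
    ".git", "node_modules", "__pycache__", ".venv", "venv", "dist", "build", ".next",
}


def iter_directories(root, extra_skips=""):
    """Yield each directory under root, pruning default and extra skip names."""
    skips = DEFAULT_SKIPS | {name for name in extra_skips.split(",") if name}
    for current, dirnames, _filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(d for d in dirnames if d not in skips)
        yield current


if __name__ == "__main__":
    for directory in iter_directories("/path/to/repo", "tests,.github"):
        print(directory)
```

Pruning `dirnames` in place is what keeps large trees such as `node_modules` from being traversed at all, rather than being filtered after the fact.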

## Supported Languages

Python, JavaScript, TypeScript, TSX, Go, Rust, Ruby, Java, C, HTML, Markdown.

## Limitations

- Doc comments extracted as first-line summaries (not full descriptions)
- Signatures: Python (full), TypeScript/Go/Rust/Ruby/C (params + return types), Java (not extracted)
- Markdown: h1/h2 headings only
- Private symbols (`_prefix`) excluded from top-level exports
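
Two of these limitations, first-line summaries and the exclusion of `_prefix` symbols, can be illustrated for Python sources. The sketch below uses the standard-library `ast` module, not tree-sitter, so it only approximates the behavior described and is not the skill's implementation. `public_summaries` is a hypothetical name.

```python
import ast


def public_summaries(source):
    """Map public top-level function and class names to their first docstring line."""
    result = {}
    for node in ast.parse(source).body:
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            continue
        if node.name.startswith("_"):
            continue  # mirrors the private-symbol exclusion described above
        doc = ast.get_docstring(node)
        # Keep only the first line; the rest of the docstring is dropped.
        result[node.name] = doc.splitlines()[0] if doc else None
    return result
```

Anything past the first docstring line, and any symbol whose name starts with an underscore, is absent from the result, which is the trade-off the limitations describe.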
