scan
$ npx mdskill add wshobson/agents/scan
Scans the codebase to generate and update project documentation for agent-driven workflows
- Solves the problem of maintaining up-to-date project and agent documentation
- Uses understand-anything and context-mode plugins when available, otherwise native tools
- Runs full scan initially, then smart delta scans to detect architectural drift
- Updates AGENTS.md only with human confirmation when changes are detected
SKILL.md (`.github/skills/scan`)
---
name: scan
description: Scans the codebase to generate project-doc.md and AGENTS.md. Use when bootstrapping a new agent-driven repo, refreshing project documentation after architectural changes, or running a delta scan to detect drift. Runs a full scan on first use and a smart delta scan on subsequent runs. Uses understand-anything + context-mode when available, falls back to native tools otherwise. Only updates AGENTS.md on detected architectural changes with human confirmation.
---
# Codebase Scanner
You are a technical analyst. Your job is to scan the project codebase and produce accurate, project-specific documentation used by all downstream agents.
## Step 1: Check Optional Plugin Dependencies
Check whether the two optional enhancement plugins are available:
```
understand-anything → /plugin list | grep understand-anything
context-mode → /plugin list | grep context-mode
```
These plugins are **optional**. They improve scan quality but are not required:
- **understand-anything** (Lum1104/Understand-Anything) — provides deeper semantic code analysis
- **context-mode** (mksglu/context-mode) — routes large outputs through a sandbox to protect the context window
If both are present, use them in Steps 3–4 as described below. If `understand-anything` is missing, use the **native fallback** approach: run `find`, `grep`, `cat`, and `git` commands directly. In either case, route large outputs through `ctx_execute` / `ctx_execute_file` if context-mode is available; otherwise summarise inline.
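The decision above can be sketched as a small function. This is only an illustration: it assumes the text printed by `/plugin list` has been captured as a string, and the names `ScanStrategy` and `choose_strategy` are hypothetical, not part of either plugin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanStrategy:
    analyser: str        # "understand-anything" or "native"
    output_routing: str  # "context-mode" or "inline-summary"


def choose_strategy(plugin_list_output: str) -> ScanStrategy:
    """Pick the analysis and output-routing approach from plugin-list text."""
    # Assumption: a plugin counts as installed if its name appears in the listing.
    installed = plugin_list_output.lower()
    analyser = "understand-anything" if "understand-anything" in installed else "native"
    routing = "context-mode" if "context-mode" in installed else "inline-summary"
    return ScanStrategy(analyser, routing)
```

The two choices are independent, so a repo with only context-mode installed still gets sandboxed output while analysis falls back to native tools.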
> **Note:** To install the optional plugins manually:
> ```
> /plugin marketplace add Lum1104/Understand-Anything && /plugin install understand-anything
> /plugin marketplace add mksglu/context-mode && /plugin install context-mode@context-mode
> ```
## Step 2: Determine Scan Mode
Check if `.claude/pipeline/project-doc.md` exists.
- **Does not exist** → FULL SCAN (first run)
- **Exists** → DELTA SCAN
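As a minimal sketch, the mode check reduces to a file-existence test. The function name `determine_scan_mode` is hypothetical; the path is the one given above.

```python
from pathlib import Path

PROJECT_DOC = Path(".claude/pipeline/project-doc.md")


def determine_scan_mode(repo_root: Path) -> str:
    """Return "FULL" on the first run and "DELTA" once project-doc.md exists."""
    return "DELTA" if (repo_root / PROJECT_DOC).is_file() else "FULL"
```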
## Step 3A: Full Scan
If `understand-anything` is available, use it to analyse the entire codebase; otherwise use the native fallback from Step 1. If **context-mode** is available (verified in Step 1), route ALL output through its tools (`ctx_batch_execute` / `ctx_execute_file`) and never dump raw file contents into the main context window. If context-mode is not available, summarise each file's findings inline and avoid printing raw file contents.
Produce `.claude/pipeline/project-doc.md` using the following structure (based on the architecture-blueprint-generator pattern):
```md
# Project Documentation
> Generated: [timestamp] | Mode: FULL
## Tech Stack
- Runtime: [e.g. Node.js 20, Python 3.11]
- Language: [e.g. TypeScript, Python]
- Framework: [e.g. Next.js 14 App Router, FastAPI]
- Database: [e.g. PostgreSQL via Prisma]
- Styling: [e.g. Tailwind CSS]
- State Management: [e.g. Zustand, Redux]
## Dependencies
[Key libraries with versions, grouped by: core / dev / testing]
## Architecture Pattern
[e.g. Feature-based, Layered MVC, Clean Architecture]
[Describe how the project is structured and why]
## Folder Structure
[Top-level directory map with purpose of each folder]
## Code Style Conventions
[Naming patterns, file naming, import ordering, export patterns]
[Inferred from actual code — not guessed]
## Modularity Practices
[How concerns are separated, shared module locations, service patterns]
## Data Architecture
[Entity relationships, data access patterns, ORM usage]
## Cross-Cutting Concerns
[Auth/authz approach, error handling patterns, logging, validation]
## Service Communication
[REST / GraphQL / event-driven — document what actually exists]
## Test Coverage
- Overall coverage: [X%]
- Testing framework: [e.g. Jest, Vitest, Pytest]
- Key untested areas: [list]
- Test patterns used: [unit / integration / e2e]
## Entry Points
[Main files, key config files, environment setup]
## Changed Files
[Only present in delta scans — list of files re-scanned]
## Last Scanned
[ISO timestamp]
```
After writing `project-doc.md`, proceed to **Step 4A** to generate `AGENTS.md`.
## Step 3B: Delta Scan
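1. Run `git diff HEAD~1 --name-only` to get changed files (this compares the working tree against the previous commit only, so changes from earlier commits are not listed)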
2. If no changed files, report "No changes detected — project-doc.md is current" and exit
3. Use `understand-anything` (or the native fallback from Step 1 if it is unavailable) to re-analyse only the changed files; route output through `ctx_execute_file` if context-mode is available, otherwise summarise inline
4. Patch only the affected sections of `.claude/pipeline/project-doc.md`
5. Update the `Last Scanned` and `Changed Files` fields
6. Proceed to **Step 4B** (architectural change detection)
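Steps 1 and 2 of the delta scan might look like the sketch below. It assumes `git` is on the PATH and that the repository has at least two commits (otherwise `HEAD~1` does not resolve and `subprocess.run` raises `CalledProcessError`). The function names are hypothetical.

```python
import subprocess


def parse_changed_files(diff_output: str) -> list[str]:
    """Turn `git diff --name-only` output into a clean list of paths."""
    return [line.strip() for line in diff_output.splitlines() if line.strip()]


def changed_files(repo_root: str = ".") -> list[str]:
    """Run the diff from step 1 and return the changed paths."""
    result = subprocess.run(
        ["git", "diff", "HEAD~1", "--name-only"],
        cwd=repo_root,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. when HEAD~1 does not exist
    )
    return parse_changed_files(result.stdout)
```

An empty list from `changed_files()` corresponds to the early exit in step 2.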
## Step 4A: Generate AGENTS.md (First Run Only)
Write `AGENTS.md` to the repo root. This is NOT a copy of `project-doc.md` — it is rewritten as agent instructions, tailored to this specific project. Every agent reads this file first.
Structure:
```md
# AGENTS.md — [Project Name]
> Auto-generated by the dev pipeline scanner. Do not edit manually.
> Last updated: [timestamp]
> ⚠️ To update this file, architectural changes must be detected by the scanner and confirmed by a human.
## How to Read This File
Every agent in this pipeline reads this file before doing any work.
It defines the rules, patterns, and guardrails specific to this project.
## Stack Context
[One-line summary: e.g. "Next.js 14 App Router + Prisma + PostgreSQL + Tailwind + Vitest"]
## Code Style Rules
[Written as DO/DON'T instructions inferred from actual codebase patterns]
Example:
- DO use named exports. Default exports are not used in this project.
- DON'T add business logic to API route handlers — delegate to /lib/services/
- DO use [naming convention] for [file type]
## Architecture Guardrails
[Rules derived from the actual architecture — not generic advice]
Example:
- This project uses the Repository pattern. Never query the DB directly from components.
- All API responses must go through the [ResponseWrapper] utility.
## Testing Requirements
[Coverage stat + specific rules for this project]
Example:
- Current coverage: 67%. All new code must include unit tests.
- QA agent: flag any feature with <80% coverage on new code.
- Integration tests use [real DB / mock DB] — do not change this.
## Modularity Conventions
[Specific rules about where code goes]
Example:
- Shared UI components → /components/ui
- Business logic → /lib/services/[domain]/
- Types → /types/[domain].ts
## Security Rules (All Agents)
- Never hardcode secrets, tokens, or credentials
- Use environment variables for all sensitive config
- Flag any auth-adjacent code changes immediately
## Agent-Specific Instructions
### Orchestrator
[Project-specific questions to always ask — e.g. "Does this touch the payment flow?"]
### Architect
[Known complexity areas, performance constraints, patterns to prefer]
[e.g. "This project has a known N+1 issue in /lib/services/orders — avoid adding more eager loading"]
### Developer
[Specific libraries to use, anti-patterns banned in this codebase]
[e.g. "Use dayjs — moment is banned", "Use React Query for all data fetching — no raw fetch()"]
### PR Reviewer
[What counts as 🔴 Critical vs 🟡 Should Fix in this project]
[e.g. "Any change to /lib/auth/ is automatically 🔴 Critical — requires human approval"]
### QA Agent
[Known edge cases for this domain, critical user paths to always test]
[e.g. "Always test empty state, loading state, and error state for every UI feature"]
```
If the project uses the MERN stack (MongoDB + Express + React + Node.js, detected from `package.json` dependencies), append a `### MERN Stack Notes` section to AGENTS.md covering:
- Use Mongoose middleware over raw queries
- Handle async errors in Express with a central error handler
- Avoid storing JWTs in localStorage (use httpOnly cookies)
- Never expose Mongoose error objects directly in API responses
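One way to approximate the MERN detection is to look for marker packages in a single `package.json`. This is a sketch under assumptions: the marker package names below are a hypothetical heuristic, and a monorepo with separate client and server manifests would need their dependencies merged first.

```python
import json

# Assumed marker packages per stack layer; Node.js is implied by package.json itself.
MERN_MARKERS = {
    "MongoDB": {"mongoose", "mongodb"},
    "Express": {"express"},
    "React": {"react"},
}


def is_mern(package_json_text: str) -> bool:
    """Return True if every MERN layer has at least one marker dependency."""
    manifest = json.loads(package_json_text)
    deps = set(manifest.get("dependencies", {})) | set(manifest.get("devDependencies", {}))
    return all(deps & names for names in MERN_MARKERS.values())
```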
## Step 4B: Architectural Change Detection (Delta Runs Only)
After patching `project-doc.md`, compare the new version against the previous. Check for:
- New framework or major library added
- New architectural directory pattern created (e.g. new `/lib/hooks/`, `/services/`)
- Major dependency swap (e.g. axios → fetch, moment → dayjs)
- New auth or session handling pattern
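Some of these checks can be approximated by diffing two snapshots of the project. The sketch below is a rough heuristic, not the scanner's actual logic: `Snapshot` and `detect_changes` are hypothetical names, it treats any simultaneous add and remove as a possible swap, and it does not attempt to detect new auth or session patterns.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    dependencies: set[str] = field(default_factory=set)
    directories: set[str] = field(default_factory=set)


def detect_changes(before: Snapshot, after: Snapshot) -> list[str]:
    """List candidate architectural changes between two snapshots."""
    findings = []
    added = sorted(after.dependencies - before.dependencies)
    removed = sorted(before.dependencies - after.dependencies)
    if added and removed:
        findings.append(f"Possible dependency swap: {', '.join(removed)} -> {', '.join(added)}")
    elif added:
        findings.append(f"New dependency: {', '.join(added)}")
    for directory in sorted(after.directories - before.directories):
        findings.append(f"New directory: {directory}")
    return findings
```

A non-empty result is only a prompt for human review; whether a new dependency counts as architectural is still a judgment call.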
If any are detected, show:
```
⚠️ Architectural change detected in delta scan:
[List specific changes found]
AGENTS.md may need updating. Review and confirm:
[y] Update AGENTS.md — patch affected sections only
[n] Skip — this is not an architectural change
```
Only on `[y]` confirmation: patch the relevant sections of `AGENTS.md`. Never rewrite the full file.
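Patching one section without rewriting the file could be done along these lines. This is a naive sketch: `patch_section` is a hypothetical helper, it only handles `## ` level headings, and it would be confused by a line starting with `## ` inside a fenced code block.

```python
import re


def patch_section(markdown: str, heading: str, new_body: str) -> str:
    """Replace the body under one `## heading`, leaving other sections untouched."""
    pattern = re.compile(
        rf"(^## {re.escape(heading)}[ \t]*\n)(.*?)(?=^## |\Z)",
        flags=re.MULTILINE | re.DOTALL,
    )
    if not pattern.search(markdown):
        raise KeyError(f"section not found: {heading}")
    # A function replacement avoids backslash interpretation in new_body.
    return pattern.sub(
        lambda m: m.group(1) + new_body.rstrip("\n") + "\n", markdown, count=1
    )
```

Nested `### ` subsections stay inside their parent's body, so patching `Agent-Specific Instructions` would replace all of its per-agent subsections at once.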
## Step 5: Report
Print a summary:
```
✅ Scan complete ([FULL/DELTA])
project-doc.md → updated
AGENTS.md → [generated / patched / unchanged]
Changed files → [N files re-scanned / N/A for full scan]
Coverage → [X%]
```
Set `checkpoints.scan` to `"completed"` in `state.json`.
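A minimal sketch of the checkpoint update follows. The location of `state.json` is not specified here, so the path is passed in by the caller; `mark_scan_completed` is a hypothetical name, and creating the file when it is missing is an assumption.

```python
import json
from pathlib import Path


def mark_scan_completed(state_path: Path) -> dict:
    """Set checkpoints.scan to "completed", preserving other fields."""
    # Assumption: a missing state file is treated as an empty state.
    state = json.loads(state_path.read_text()) if state_path.exists() else {}
    state.setdefault("checkpoints", {})["scan"] = "completed"
    state_path.write_text(json.dumps(state, indent=2) + "\n")
    return state
```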