maintaining-documentation
$
npx mdskill add obra/dotfiles/maintaining-documentationA doc is a set of **claims about reality**; this skill keeps a project's docs matching reality over time. The same law binds every flow: decide whether a doc is even supposed to track current reality (classify first), auto-fix only what is *determinate*, route everything ambiguous to the human, and never rewrite a historical record. **Violating the letter of these rules is violating the spirit of them.** "I was bringing the docs up to date" is exactly how design docs get their history erased.
SKILL.md
.github/skills/maintaining-documentationView on GitHub ↗
---
name: maintaining-documentation
description: Use when documentation needs creating, checking, or maintaining — docs may have drifted from code, pre-release doc verification, updating docs after finishing code work, adding or enforcing project terminology, deciding where a new doc should live, or a routine re-audit of previously verified docs.
---
# Maintaining Documentation
## Overview
A doc is a set of **claims about reality**; this skill keeps a project's docs
matching reality over time. The same law binds every flow: decide whether a
doc is even supposed to track current reality (classify first), auto-fix only
what is *determinate*, route everything ambiguous to the human, and never
rewrite a historical record. **Violating the letter of these rules is
violating the spirit of them.** "I was bringing the docs up to date" is
exactly how design docs get their history erased.
## Dispatch — read the flow file before doing anything
| Situation | You MUST read |
| --- | --- |
| Full audit: "docs are out of date", pre-release, untrusted doc set | `references/audit.md` |
| Routine re-check of previously audited docs | `references/incremental.md` |
| Just finished code work; update the docs that ride along | `references/write-path.md` |
| Create/extend the dictionary; term disputes; exceptions | `references/dictionary.md` |
| Writing a brand-new doc; "where does this doc go?" | `references/new-docs.md` |
These are hard gates: do not start a flow from memory of this table. Flows
add process; the law below binds all of them and is never restated in flow
files.
## STOP: classify before you edit (evergreen vs. point-in-time)
The single biggest failure is rewriting a dated design/plan/spec to "match
the code." A spec from three months ago describing how something *should*
work is **not wrong** when the code later diverged — editing it to match
destroys the record.
- **Evergreen** — README, ABOUT, CLAUDE.md, tutorials, living API/schema
reference. Contract: *reflects current reality*. Drift is a defect → fix.
- **Point-in-time** — design specs, plans, brainstorm notes (often dated, or
under `docs/specs|plans|…`). Contract: *true as of its date*. Drift vs.
current code is **expected**. Never rewrite to match code; at most add a
supersede banner or as-of date.
- **Mixed / unclear** — conflicting signals (a dated-folder `README`) → treat
the whole doc as interview-only.
Classify the doc set **first**, at the **group level** (folder globs + named
exceptions), and **confirm with the human** before editing anything.
Precedence: a point-in-time signal (dated filename, point-in-time directory)
**beats** an evergreen name like `README`. Do **not** decide this per-doc by
gut as you go — surface it as one decision.
**The classification confirmation is the one gate you never skip** — not
under time pressure, not under authority, not under "just be decisive, don't
kick it back to me." A one-line confirm costs seconds; skipping it is what
turns an audit into an erased design record. And when classification is
genuinely ambiguous (e.g. a reference-looking file inside a `specs/` folder
that also holds design docs), **resolve to the safe side — treat it as
point-in-time and confirm** — because a body rewrite is irreversible to the
record. Disclosing your assumption in a footnote *after* you've rewritten the
body is too late.
The confirmed classification **persists in the doc index** (`Class` column —
see Artifacts). Editing flows gate on it: no index → run audit Phase 0 first.
Never classify by gut mid-flow.
## Scope & exclusions
Default target: `README`, top-level `*.md`, `docs/**`, `CLAUDE.md`. **Always
exclude:** git-ignored paths, `.claude/`, `.private-journal/`, worktrees;
**generated / foreign-owned docs** (any "do not edit" / "generated by"
sentinel — editing them is futile, the next regeneration wipes it);
non-markdown. The dictionary (`docs/DICTIONARY.md`) is excluded from its own
terminology sweep.
## The rubric: what may be auto-fixed
Auto-fix a claim **only when ALL hold**:
1. the doc is **evergreen**;
2. the claim is **mechanical** — an identifier/path, CLI flag, config/env
name+default, API route/field, or a token-level example fix;
3. a **single live (non-test) counterpart** exists in the code, so the right
value is *determined, not guessed* (multiple hits / test-only / no hit →
interview);
4. it's a **local token replacement** that leaves the surrounding sentence
true;
5. a missing counterpart is **confirmed removed vs. renamed via git history**
(`git log -S`, `--follow`, blame) before you conclude anything.
**Dictionary clause.** In evergreen prose only, a deprecated synonym may be
auto-fixed when ALL hold: the synonym maps to exactly one dictionary entry
(whose heading is the replacement); the match is whole-word; no exception
covers it; and the replacement leaves the surrounding sentence true. The
determinant is the dictionary instead of a code counterpart — conditions 1, 4
and 5 still apply. Code identifiers, UI strings: never auto-fixed — findings
only (rename / add entry / add exception). Commit messages: dictionary terms
in new ones; history is never flagged.
**Always interview — never auto-fix — regardless of category:**
- **Counts & inventories** ("14 workflows", "~40 files") — no canonical
counting convention.
- **Bare line-number citations** (`file.go:120-130`) — they drift on every
edit. Recommend rewriting to `file:symbol`; never silently renumber.
- **Absence / negative claims** ("there is no env-var fallback") — you can't
grep-prove a negative.
- **Cross-reference repair** — detecting a broken link is fine; choosing its
new target rarely is.
- **Structural changes to embedded examples** — rewriting an example's
*shape* to a new schema.
- **Behavioral / semantic claims** ("does X when Y", sequencing, rationale).
- **Claims whose ground truth lives in another repo or an external binary.**
## Verify the claim, not just the symbol
Confirming that the *thing* a doc names exists is not confirming the *claim
about it*. A claim of the form "X is validated / X happens when Y / X is done
by Z / X is configured as W" is verified only by the **code path that enacts
it** — the validator that rejects, the handler that closes the stream, the
function that computes the value, the line that loads the asset — **not** by
X's mere existence. Check the verb, not just the noun.
- "the loader rejects an emit node that declares `runner`" → find the
rejection in the validator, or the claim is false.
- "the stream closes when the run is terminal" → find the close on *every*
terminal state, or name the one it misses.
- "diffs are computed in `internal/document`" → find the call there, not just
the function's definition.
- "Tailwind is loaded from a CDN" → check the actual `<script src>`, not that
Tailwind is used.
This is the most common false-`matches`: the noun checks out, so the verifier
waves the verb through. State the mechanism, and point the citation at the
code that *does* the thing claimed.
## Artifacts: dictionary and index
The skill maintains exactly **two** artifacts per project, plus inline
stamps. Never a third metadata file.
- **`docs/DICTIONARY.md`** — normative terminology for docs, code
identifiers, commit messages, and UI strings; divergences live in its
Exceptions section, scoped by path globs only. Template:
`templates/DICTIONARY-template.md`; grammar and lifecycle:
`references/dictionary.md`. Evergreen, stamped, and the canonical owner of
terminology.
- **The doc index** — a sentinel-fenced table (`<!-- doc-index:begin/end -->`)
in `docs/README.md`, or standalone `docs/INDEX.md` where no README exists.
Columns: Doc | What | Class | Owns. `Class` is the persisted
classify-and-confirm output; `Owns` is machine-readable path globs (what
`docmaint stale` diffs). Template: `templates/INDEX-template.md`.
- **`docmaint`** (`scan | stamp | stale`, `--help` for usage) does the
mechanical work. It never edits docs; agents do, under the rubric. It lives
in **this skill's** `scripts/` directory — not in the target repo — so
invoke it by absolute path (e.g.
`~/.claude/skills/maintaining-documentation/scripts/docmaint`), and pass
`--root <target-repo>` unless your cwd is already the target repo root
(`--root` defaults to cwd). Flow files write bare `docmaint <subcommand>`;
this resolution rule applies everywhere.
## Stamp contract
Evergreen docs carry one idempotent block at EOF (`docmaint stamp` maintains
it; sentinel `<!-- doc-audit:last-reviewed -->`):
```
---
<!-- doc-audit:last-reviewed -->
_Last reviewed: 2026-06-09 · commit `abc1234` · verified against code (2 claims deferred to review)._
```
The SHA is provenance **and** the incremental cursor — a deliberate change
from the predecessor skill. Incremental re-audits are **triage, not
soundness**: deferred claims keep a doc on the worklist regardless of SHA;
claims with ground truth outside the repo are cleared only by full audits;
terminology never depends on stamps (`docmaint scan` full-sweeps every run).
**Stamping precondition (all flows):** no stamp without independent
verification of the applied edits — full audits use two or more competing
verifiers (Phase 3); diff- or worklist-bounded flows use at least one
independent verifier. Don't stamp point-in-time docs. Stamp only what you
verified — and any evergreen claim you could *not* verify this pass,
including claims whose ground truth lives outside the repo, counts toward
`--deferred N`. Never let an unverified claim ride under a clean stamp.
## Pragmatism law
- Docs exist for readers. Before creating any doc, entry, or artifact, name
the reader. No reader → don't write it.
- Adopt existing structures before imposing new ones.
- Two artifacts per project (dictionary, index) plus stamps. Never a third.
- A coverage gap is a finding to file, not a mandate to generate a doc.
- The dictionary stays readable in one sitting: load-bearing terms only.
- Write-path stays bounded by the diff. If maintenance isn't cheap, it won't
happen.
## Red flags — STOP
- "I'll bring the docs up to date" across a whole spec/plan set **without
classifying first**. → Classify and confirm first.
- About to **renumber** a `file:line` citation. → Convert to `file:symbol`
via interview.
- About to **rewrite an example's structure** to the new schema. →
Interview, not auto-fix.
- Pressured to "be decisive / don't confirm", about to classify docs
yourself and **rewrite their bodies**. → The classification confirm is the
one gate you never skip. Surface it; a one-line yes costs seconds.
- Concluding a feature was "removed" / deleting its docs **from a grep
miss**. → Check git history first.
- "Close enough, I'll just fix the count." → Counts go to the human.
- Editing a file that says "generated" / "do not edit". → Skip it.
- Marking a claim "matches" **without a citation**. → Cite it or don't claim
it.
- About to **finish without a second adversarial pass over your own edits**.
→ In a full audit, run the competing verify pass (audit Phase 3) first; in
bounded flows, the one-verifier stamping precondition applies. Fix what it
finds.
- Marking a claim "matches" because the **symbol exists**, without checking
the code *does* what's claimed about it. → Verify the verb, not the noun.
- Audited each doc, never looked at the **set** (duplication, gaps,
contradictions, index). → Per-doc passes are blind to set-level defects;
run the corpus review (audit Phase 4 — applies to the full-audit flow).
- About to remove a `[permanent]` exception, or a `[temporary]` one on scan
evidence alone. → Permanent: never. Temporary: confirm via git history
first (a grep miss is not resolution).
- About to stamp a doc whose edits nobody independently verified. → The
stamping precondition applies in every flow, not just full audits.
- About to write a doc, or a dictionary entry, no one asked to read. → Name
the reader (pragmatism law).
- About to add a third per-project metadata file. → Two artifacts. Never a
third.
- About to write an exception scope in prose ("code touching X"). → Path
globs only.
- Editing docs in a project with no confirmed index `Class` column. → Run
audit Phase 0 first.
## Rationalizations
| Excuse | Reality |
|--------|---------|
| "The spec says X but the code says Y, so the spec is wrong." | Only if the spec is **evergreen**. A point-in-time design doc is allowed to differ — classify first. |
| "It's obviously a reference doc, I'll just fix it." | Per-doc gut calls are how design docs get their history rewritten. Confirm the classification with the human. |
| "They said be decisive and not to kick decisions back — so I'll classify and rewrite myself." | The classification confirm is the one gate you never skip; it's the difference between fixing a doc and erasing a design record. A one-line confirm costs seconds. |
| "Nothing in it reads as a dated design narrative, so it's evergreen." | Absence of obvious design language isn't proof of evergreen, especially in a mixed folder like `specs/`. Ambiguous + destructive edit → treat as point-in-time and confirm. |
| "I disclosed my assumption at the end." | A footnote after you've rewritten the body is too late — the record is already changed. Confirm *before* editing, not after. |
| "The line number moved, I'll update it." | Line numbers drift on every edit; re-pinning re-breaks it. Convert to `file:symbol`. |
| "I'll bring the example up to the new schema." | Structural example rewrites change meaning → interview, don't auto-fix. |
| "The grep found nothing, the feature's gone." | A grep miss is not proof of removal. Check git history; it may be renamed. |
| "37 vs. ~40 is close, I'll just write the real number." | Counts have no canonical convention — hand it to the human. |
| "The symbol/route/field exists, so the claim checks out." | Existence ≠ the claim. Verify the code path that *enacts* it (the validator/handler/call site), not just that the noun exists. |
| "Every doc verified clean, so the docs are good." | Per-doc accuracy ≠ a healthy set. Duplication, missing docs, and cross-doc contradictions only surface at the corpus level (audit Phase 4). |
| "Scan found zero hits, the exception is resolved." | A grep miss is not resolution — confirm via `git log -S` first; `[permanent]` exceptions are never removed on scan evidence. |
| "It's a tiny doc edit, stamping without a verifier is fine." | The stamp *means* independently verified. No verifier, no stamp. |
| "The docs feel incomplete, I'll add a doc for each gap." | Gaps are findings. A doc with no reader is debt, not coverage. |
## Common mistakes
- Editing before classifying.
- Spawning one subagent per point-in-time doc — batch them; they aren't
claim-verified.
- An unbounded first-run interview backlog — prioritize, defer the tail to a
report.
- Auto-committing the audit — leave it uncommitted for human review.
- Declaring done without adversarially verifying your own edits — your fixes
are a prime hiding spot for confident-but-wrong claims (audit Phase 3).
- Treating per-doc verification as the whole job — duplication, gaps,
cross-doc contradictions, and a missing index are set-level defects
invisible to a per-doc pass (audit Phase 4).
- Verifying the noun, not the verb — confirming a symbol exists instead of
confirming the code does what the doc claims about it.
- Treating `stale` output as proof of cleanliness — it is triage; soundness
comes from full audits.
More from obra/dotfiles
- chronicle|
- e2e-scenario-testingUse when verifying a running application end-to-end through its real interface — a web UI, a CLI, or a TUI — by writing and executing agent-run "scenario cards" against a freshly built instance with falsifiable assertions. Trigger on "test it end to end", "prove the UI actually works", "write/run a scenario", or after a change touches a user-facing surface that unit tests can't fully cover. Not for unit tests, pure code review, or API-only checks.
- roborev-design-reviewRequest a design review for a commit and present the results
- roborev-design-review-branchRequest a design review for all commits on the current branch and present the results
- roborev-fixUse when the user asks to fix open reviews, invokes /roborev-fix, or provides job IDs; do not use when the user only pastes review findings with no request to discover or close reviews
- roborev-refineIterative review-fix loop for the current branch — reviews via daemon, fixes inline, re-reviews until passing or max iterations reached
- roborev-respondAdd a comment to a roborev code review and close it
- roborev-reviewRequest a code review for a commit and present the results
- roborev-review-branchRequest a code review for all commits on the current branch and present the results
- syncing-obsidianUse when reading or writing files in an Obsidian vault that is synced with obsync. Ensures changes are pulled before reading and pushed after writing. Also covers sync status, file history, version restore, and diagnosing sync issues.