ai-postmortem
$ npx mdskill add arcasilesgroup/ai-engineering/ai-postmortem
Generate structured incident reports using the DERP framework.
- Creates blameless documentation for production outages and near-misses.
- Integrates with alert systems and recent commit history for context.
- Advances through detection, escalation, recovery, and prevention phases.
- Outputs formatted markdown files to the engineering postmortems directory.
SKILL.md
.github/skills/ai-postmortem
---
name: ai-postmortem
description: Documents production incidents, outages, degradations, and near-misses using the DERP format (Detection, Escalation, Recovery, Prevention) with targeted interview questions per phase. Trigger for 'we had an incident', 'write up the outage', 'something went wrong in prod', 'postmortem', 'near-miss analysis', 'incident report'. Not for customer support investigations; use /ai-support instead. Not for internal dev bugs; use /ai-debug instead.
effort: mid
argument-hint: "start|continue [id]|find [query]|generate"
mode: agent
model_tier: sonnet
mirror_family: copilot-skills
generated_by: ai-eng sync
canonical_source: .claude/skills/ai-postmortem/SKILL.md
edit_policy: generated-do-not-edit
---
# Postmortem
## Purpose
Runs a structured incident postmortem using the DERP model. Guides the user through the Detection, Escalation, Recovery, and Prevention phases with targeted questions, and produces a blameless, actionable postmortem document.
## Trigger
- Command: `/ai-postmortem start|continue|find|generate`
- Context: incident occurred, production failure, outage resolution, near-miss analysis.
## Workflow
Four modes support the DERP loop:
1. `start` — scaffold a new postmortem under `.ai-engineering/postmortems/PM-YYYY-NNN.md` from the template; pose Detection-phase questions.
2. `continue <id>` — resume an in-progress postmortem; advance through Escalation → Recovery → Prevention.
3. `generate` — synthesize from existing session context (recent commits, alerts, logs).
4. `find [query]` — search past postmortems for similar patterns.
## Modes
### start -- New postmortem
1. **Assign ID** -- generate `PM-YYYY-NNN` (sequential within year).
2. **Set status** -- `draft`.
3. **Scaffold** -- create `.ai-engineering/postmortems/{id}.md` with DERP template.
4. **Interview -- Detection**:
   - When was the incident first detected?
   - How was it detected? (monitoring, user report, manual discovery)
   - What was the time between incident start and detection?
   - What monitoring should have caught it earlier?
5. **Interview -- Escalation**:
   - Who was notified and when?
   - Was the escalation path appropriate?
   - Were the right people involved at the right time?
6. **Interview -- Recovery**:
   - What actions were taken to restore service?
   - What was the total downtime/impact duration?
   - Was there a rollback? What was the rollback procedure?
7. **Interview -- Prevention**:
   - Root cause (use 5-Whys if needed)
   - What changes prevent recurrence?
   - Action items with owners and deadlines
Ask ONE section at a time. Wait for answers before proceeding to the next DERP phase. If the user provides answers for multiple DERP phases at once (e.g., pastes an incident timeline), extract the relevant information into each section rather than re-asking. Only interview for sections that remain incomplete.
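The **Assign ID** step only specifies the format `PM-YYYY-NNN` and that numbering is sequential within the year. A minimal sketch of one way to compute it, assuming the existing files are named `{id}.md`; the helper name `next_postmortem_id` is hypothetical and not part of the skill:

```python
import re
from datetime import date

# Assumption: postmortem files are named exactly PM-YYYY-NNN.md.
ID_PATTERN = re.compile(r"^PM-(\d{4})-(\d{3})\.md$")

def next_postmortem_id(existing_files, year=None):
    """Return the next PM-YYYY-NNN ID, counting sequentially within the year."""
    if year is None:
        year = date.today().year
    highest = 0
    for name in existing_files:
        match = ID_PATTERN.match(name)
        # Only IDs from the requested year take part in the sequence.
        if match and int(match.group(1)) == year:
            highest = max(highest, int(match.group(2)))
    number = highest + 1
    return f"PM-{year}-{number:03d}"

print(next_postmortem_id(["PM-2025-007.md", "PM-2026-001.md", "PM-2026-002.md"], year=2026))
# prints PM-2026-003
```

Taking the highest existing number rather than the file count keeps IDs unique even if an earlier postmortem was deleted.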
### continue <id> -- Resume postmortem
1. **Load** -- read `.ai-engineering/postmortems/{id}.md`.
2. **Find gap** -- identify the first incomplete DERP section.
3. **Resume interview** -- continue from the incomplete section.
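The **Find gap** step could be sketched as follows. The skill does not define what "incomplete" means, so this sketch assumes a section is incomplete when it is empty, still contains a `{placeholder}` from the template, or carries a `[NEEDS INPUT]` marker. The function name and the sample document are hypothetical:

```python
import re

DERP_SECTIONS = ["Detection", "Escalation", "Recovery", "Prevention"]

def first_incomplete_section(markdown_text):
    """Return the first DERP section without real content, or None if all are filled."""
    # Splitting on level-2 headings yields [preamble, heading, body, heading, body, ...].
    parts = re.split(r"^## (.+)$", markdown_text, flags=re.MULTILINE)
    bodies = {
        parts[i].strip(): parts[i + 1].strip()
        for i in range(1, len(parts) - 1, 2)
    }
    for section in DERP_SECTIONS:
        body = bodies.get(section, "")
        # Assumption: empty text, a leftover {placeholder}, or [NEEDS INPUT] means a gap.
        if not body or re.search(r"\{[^}]*\}", body) or "[NEEDS INPUT]" in body:
            return section
    return None

doc = """# PM-2026-001: Checkout outage
## Detection
Paged by the latency alert at 09:12.
## Escalation
{Notification chain and response timeline}
## Recovery
## Prevention
"""
print(first_incomplete_section(doc))  # prints Escalation
```

The brace check can misfire on a section that legitimately contains braces (for example a pasted JSON log), so a real implementation would likely need a stricter marker.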
### find [query] -- Search postmortems
1. **Search** -- scan `.ai-engineering/postmortems/*.md` for matching content.
2. **List** -- show ID, title, date, status, and root cause summary.
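A rough sketch of the search and list steps, assuming a plain case-insensitive substring match (the skill does not say how matching works). It takes a dictionary of ID to markdown text so it stays self-contained; the function name is hypothetical, and it reports only ID, title, and status rather than the full list of fields above:

```python
import re

# Matches the "**Status**: ..." line from the document template.
STATUS_LINE = re.compile(r"^\*\*Status\*\*: (.+)$", re.MULTILINE)

def find_postmortems(documents, query):
    """Return (id, title, status) for each postmortem whose text contains the query."""
    needle = query.lower()
    hits = []
    for doc_id, text in sorted(documents.items()):
        if needle not in text.lower():
            continue
        lines = text.splitlines()
        # The template's first line is "# {id}: {title}".
        title = lines[0].partition(": ")[2] if lines else ""
        status = STATUS_LINE.search(text)
        hits.append((
            doc_id,
            title or "(untitled)",
            status.group(1).strip() if status else "unknown",
        ))
    return hits

docs = {
    "PM-2026-001": "# PM-2026-001: Checkout outage\n**Status**: draft\n## Detection\nDatabase pool exhausted.\n",
    "PM-2026-002": "# PM-2026-002: Stale cache\n**Status**: complete\n## Detection\nUser report.\n",
}
print(find_postmortems(docs, "database"))
# prints [('PM-2026-001', 'Checkout outage', 'draft')]
```

In practice the dictionary could be built by reading each file matched by `.ai-engineering/postmortems/*.md` and keying it by file stem.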
### generate -- Create from existing notes
1. **Collect** -- gather incident-related commits, PRs, Slack threads (if provided by the user as pasted text), and notes from context.
2. **Draft** -- populate DERP sections from available data, mark gaps as `[NEEDS INPUT]`.
3. **Review** -- present draft for user validation before saving.
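The **Draft** step can be illustrated with a small renderer that fills each DERP section from whatever was collected and marks the rest as `[NEEDS INPUT]`. This is a sketch only: the `findings` dictionary and the function name are hypothetical, and the output uses a trimmed version of the document template:

```python
DERP_SECTIONS = ["Detection", "Escalation", "Recovery", "Prevention"]

def draft_postmortem(doc_id, title, findings):
    """Render a DERP draft, marking every section without data as [NEEDS INPUT]."""
    lines = [f"# {doc_id}: {title}", "**Status**: draft"]
    for section in DERP_SECTIONS:
        body = findings.get(section, "").strip()
        lines.append(f"## {section}")
        # A missing or blank finding becomes an explicit gap marker.
        lines.append(body or "[NEEDS INPUT]")
    return "\n".join(lines) + "\n"

draft = draft_postmortem(
    "PM-2026-001",
    "Checkout outage",
    {"Detection": "Latency alert fired."},
)
print(draft)
```

Returning the text instead of writing a file matches the **Review** step: the draft is shown to the user before anything is saved.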
## Document Template
```markdown
# {id}: {title}
**Date**: YYYY-MM-DD
**Status**: draft | in-review | complete
**Severity**: SEV-1 | SEV-2 | SEV-3
**Duration**: {total impact time}
## Detection
{How and when the incident was discovered}
## Escalation
{Notification chain and response timeline}
## Recovery
{Steps taken to restore service}
## Prevention
### Root Cause
{5-Whys analysis}
### Action Items
| # | Action | Owner | Deadline | Status |
|---|--------|-------|----------|--------|
## Timeline
| Time | Event |
|------|-------|
```
## Status Progression
`draft` -> `in-review` (all DERP sections complete) -> `complete` (action items assigned)
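The progression can be read as a small decision rule. The sketch below assumes that "action items assigned" means at least one action item exists and every item has an owner; that interpretation, the function name, and the dictionary shape are assumptions rather than part of the skill:

```python
def postmortem_status(sections_complete, action_items):
    """Derive the status from the progression draft -> in-review -> complete."""
    if not sections_complete:
        return "draft"
    # Assumption: "assigned" means every action item has a non-empty owner.
    if action_items and all(item.get("owner") for item in action_items):
        return "complete"
    return "in-review"

print(postmortem_status(False, []))                                        # prints draft
print(postmortem_status(True, [{"action": "Add alert", "owner": ""}]))     # prints in-review
print(postmortem_status(True, [{"action": "Add alert", "owner": "dana"}])) # prints complete
```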
## Quick Reference
```
/ai-postmortem start # begin new postmortem
/ai-postmortem continue PM-2026-001 # resume in-progress postmortem
/ai-postmortem find database # search past postmortems
/ai-postmortem generate # generate from existing context
```
Storage: `.ai-engineering/postmortems/{id}.md` (ID format `PM-YYYY-NNN`).
## Examples
### Example 1 — start a new postmortem after an outage
User: "we had an outage in checkout this morning, write it up"
```
/ai-postmortem start
```
Scaffolds `PM-2026-001.md`, asks the Detection-phase interview questions (when and how the incident was detected, time to detection, monitoring gaps), and advances to Escalation once they are answered.
### Example 2 — generate from session context
User: "synthesize a near-miss postmortem from this debug session"
```
/ai-postmortem generate
```
Reads the recent debug session, alerts, and commits; drafts the DERP sections and marks gaps as `[NEEDS INPUT]` where data is incomplete.
## Integration
Called by: user directly. Reads: alerts, runbooks, `/ai-debug` outputs, recent commits. Writes: `.ai-engineering/postmortems/PM-YYYY-NNN.md`. See also: `/ai-debug` (root cause), `/ai-support` (customer-facing), `/ai-learn` (post-incident lessons).
$ARGUMENTS