ai-debug
$
npx mdskill add arcasilesgroup/ai-engineering/ai-debugSystematic debugging skill. Four phases, always in order. NEVER fix symptoms -- always find and fix the root cause. After 2 failed fix attempts, escalate to the user.
SKILL.md
.github/skills/ai-debugView on GitHub ↗
---
name: ai-debug
description: "Diagnoses broken behavior systematically with a 4-phase root-cause loop: test failures, runtime errors, crashes, regressions. Never patches symptoms. Trigger for 'it is not working', 'something broke', 'this used to work', 'I am getting an error', 'CI is failing', 'why is X happening'. Not for adding tests; use /ai-test instead. Not for security findings; use /ai-security instead."
effort: mid
argument-hint: "[error description or file:line]"
mode: agent
model_tier: sonnet
mirror_family: copilot-skills
generated_by: ai-eng sync
canonical_source: .claude/skills/ai-debug/SKILL.md
edit_policy: generated-do-not-edit
---
# Debug
## Purpose
Systematic debugging skill. Four phases, always in order. NEVER fix symptoms -- always find and fix the root cause. After 2 failed fix attempts, escalate to the user.
## When to Use
- Test failures (expected vs actual mismatch)
- Runtime errors (exceptions, crashes, hangs)
- Regressions (worked before, broken now)
- Unexpected behavior (no error, but wrong result)
## Process
Step 0 (load contexts): read `.ai-engineering/manifest.yml` `providers.stacks`; load `.ai-engineering/overrides/<stack>/conventions.md` for each stack and `.ai-engineering/overrides/_shared/conventions.md`; load `.ai-engineering/team/*.md` for team conventions.
### Phase 1: Symptom Analysis (WHAT, WHEN, WHERE)
Gather facts before forming hypotheses:
1. **WHAT**: exact error message, stack trace, log output
2. **WHEN**: always? intermittent? after a specific change? under load?
3. **WHERE**: which file, function, line? which test? which environment?
4. **SINCE WHEN**: `git log --oneline -20` -- what changed recently?
Output: symptom report with all facts classified as KNOWN or SUSPECTED.
### Phase 2: Reproduction (MINIMAL REPRO)
Make the bug reproducible with the smallest possible case:
1. Run the failing test or reproduce the error
2. If not reproducible: document exact conditions and STOP (cannot debug what cannot be reproduced)
3. Strip to minimal repro: remove unrelated code, simplify inputs, isolate the component
4. Confirm: the minimal repro fails consistently
Output: exact command to reproduce the failure.
### Phase 3: Root Cause (WHY)
Apply the 5 Whys to move from symptom to cause:
1. **Why** does it fail? -> [immediate cause]
2. **Why** does that happen? -> [deeper cause]
3. **Why** does that happen? -> [root cause]
(Continue until you reach a cause you can fix directly)
**Techniques** (use as appropriate):
- **Binary search**: comment out code, add assertions to narrow the location
- **Git bisect**: `git bisect start HEAD <known-good>` to find the breaking commit
- **Print tracing**: add targeted print/log statements at decision points
- **Diff analysis**: `git diff <known-good>..HEAD -- <file>` to see what changed
- **Assumption check**: list every assumption the code makes, verify each one
**Classification**: identify the root cause category:
- Logic error (wrong condition, off-by-one, missing case)
- State corruption (mutation, shared state, race condition)
- Contract violation (caller sends wrong type, missing field)
- Environment (missing dependency, wrong version, config)
- Data (unexpected input, encoding, edge case)
Output: root cause statement (1-2 sentences, specific and testable).
### Phase 4: Solution Design (FIX + REGRESSION TEST)
1. **Design the fix**: minimal change that addresses the root cause; one logical change only. If the fix is large, the root cause analysis may be wrong — revisit Phase 3.
2. **Write regression test**: a test that fails without the fix and passes with it
3. **Apply the fix**
4. **Verify**: regression test passes AND all existing tests pass
5. **Check for siblings**: does the same bug pattern exist elsewhere? (`grep` for similar code)
## Escalation Protocol
| Attempt | Action |
|---------|--------|
| 1st fix fails | Try a different approach (not the same thing again) |
| 2nd fix fails | STOP. Escalate to user with: symptom, repro, root cause analysis, 2 approaches tried |
Never retry the same approach. Never loop silently.
## 5 Whys Example
```
Symptom: test_parse_config_handles_empty fails with KeyError
Why 1: config["database"] raises KeyError
Why 2: parse_config returns empty dict when file is empty
Why 3: the YAML parser returns None for empty files, not empty dict
Root cause: missing None -> {} coercion after yaml.safe_load()
Fix: add `config = yaml.safe_load(f) or {}` instead of `config = yaml.safe_load(f)`
```
## Common Mistakes
- Fixing the symptom (e.g., add a try/except) instead of the root cause.
- Not writing a regression test for the fix.
- Changing multiple things at once (change one thing, verify, repeat).
## Stack-specific guidance
spec-133 D-133-10 consolidates stack-specific debug guidance into the
`.ai-engineering/overrides/<stack>/debug.md` files. When debugging a
build / compilation failure, load `overrides/<stack>/debug.md` for the
relevant stack (python, typescript, rust, go, java, kotlin, csharp,
swift, flutter, react-native, php, ruby). Greenfield mode (stacks=[]):
follow the generic procedure above and hint
"add a project file and run `ai-eng doctor --fix`".
## Examples
### Example 1 — failing test with unclear root cause
User: "test_user_signup is failing with 'invalid email format' but the email looks valid"
```
/ai-debug test_user_signup
```
Phase 1 reproduce, Phase 2 hypothesize (regex anchors? trailing whitespace? unicode?), Phase 3 instrument, Phase 4 confirm + add regression test before patching.
### Example 2 — CI failure on a fresh branch
User: "CI is failing only on this branch — what changed?"
```
/ai-debug "CI failing on feat/new-auth"
```
Walks the diff vs `main`, isolates the suspect change, reproduces locally, identifies root cause without symptom-patching.
## Integration
Called by: `/ai-build`, `/ai-build` (test fail), user directly. Calls: test runners (reproduction), `/ai-test` (regression test). Transitions to: `/ai-build` (fix), `/ai-commit` (verified). See also: `/ai-test`, `/ai-postmortem`, `/ai-resolve-conflicts`.
$ARGUMENTS
More from arcasilesgroup/ai-engineering
- ai-adviseProactive governance advisor — checks standards, decisions, and quality trends during development. Always advisory, NEVER blocks. Three modes: `advise` (post-edit), `gate` (pre-dispatch), `drift` (on-demand decision audit). Trigger for 'governance check', 'advise on this change', 'check for drift', 'is this aligned with active decisions', 'shift-left advisory'. Not for blocking gates — use /ai-verify. Not for narrative code review — use /ai-review.
- ai-analyze-permissionsUse when Claude Code keeps asking to approve commands you have already approved, when settings.local.json has grown large, or when you want to consolidate permission grants into wildcard patterns. Trigger for 'too many permission prompts', 'clean up permissions', 'audit my settings', 'consolidate allow rules'. Claude Code only — not available in GitHub Copilot, Antigravity, or Codex.
- ai-animationDesigns motion, transitions, and micro-interactions for UI components: spring animations, gestures, easing, staggers — taste-driven detail compounding. Trigger for 'animate this', 'add transitions', 'micro-interactions for', 'gesture design', 'swipe to dismiss', 'easing for this', 'stagger the'. Not for design systems; use /ai-design instead. Not for visual art; use /ai-visual instead. Not for testing animation code; use /ai-test instead.
- ai-autopilotDelivers large multi-concern specs and backlog runs autonomously: decomposes specs into sub-specs (or normalizes work items into a backlog DAG), deep-plans with parallel agents, builds a dependency DAG, implements in waves, runs a single final quality loop with one bounded quality-remediation pass (verify+guard+review on full changeset), delivers via PR. Trigger for 'implement spec-NNN end to end', 'autopilot this', 'autonomous delivery', 'decompose and ship', 'run the backlog', 'execute these GitHub issues', 'process the sprint backlog'. Invocation is the approval gate. Not for small or single-concern tasks; use /ai-build instead. Not for ambiguous requirements; use /ai-brainstorm first.
- ai-boardOperates the project board (GitHub Projects v2 or Azure DevOps): discovers configuration after install (fields, state mappings, process templates) and syncs work-item state at lifecycle transitions. Trigger for 'set up the board', 'configure our ADO board', 'discover board fields', 'move this issue to in-review', 'update the board', 'mark as in progress', 'sync the work item state'. Two subcommands: `discover` (post-install configuration write) and `sync` (lifecycle state transitions). Auto-invoked via `sync` by /ai-brainstorm, /ai-build, and /ai-pr; fail-open. Not for backlog execution; use /ai-autopilot --backlog instead.
- ai-brainstormForces rigorous design interrogation BEFORE any code: explores approaches, surfaces ambiguity, gathers evidence, produces an approved spec that becomes the contract for /ai-plan. Trigger for 'lets add X', 'how should we handle Y', 'whats the best approach', 'I am thinking about', 'what should we build for'. Not for existing approved specs; use /ai-plan instead. Not for execution; use /ai-build instead.
- ai-branch-cleanupCleans branches safely: switches to the default branch, prunes merged and squash-merged branches, syncs to remote, sweeps stale specs, rotates `.ai-engineering/runtime/` per retention policy. Trigger for 'tidy up', 'tidy branches', 'sync to main', 'delete old branches', 'start fresh', 'rotate runtime'. Auto-invoked by /ai-pr after merge. Not for committing changes; use /ai-commit instead. Not for code-level dead-code removal; use /ai-simplify instead.
- ai-buildCanonical implementation gateway: reads approved plan.md, resolves stack from manifest, deterministic-routes each task to its adapter, dispatches the build agent in an isolated worktree, runs TDD self-validation per task, then a single final quality loop with one bounded quality-remediation pass on the full changeset before /ai-pr. Trigger for 'go', 'start building', 'execute the plan', 'implement it', 'lets do this', 'build the plan', 'resume', 'continue'. Not without an approved plan; run /ai-plan first. Not for multi-concern specs needing decomposition; use /ai-autopilot instead. Not for a single function or subcomponent; use /ai-code.
- ai-codeWrites production code that satisfies stack-context standards on the first pass: interface-first design, backward-compatibility checks, lightweight self-review. Trigger for 'implement this', 'write the code for', 'add X to Y', 'build this function', 'make this work'. Not for tests; use /ai-test instead. Not for debugging; use /ai-debug instead. Not for refactoring; use /ai-simplify instead. Not for executing an approved plan end-to-end; use /ai-build (the gateway).
- ai-commitRuns the governed commit pipeline: auto-branches from protected, stages selectively, formats and lints, scans for secrets, gates docs, composes a conventional message, pushes. Trigger for 'commit my changes', 'save my work', 'push this to remote', 'stage these files', 'ship it'. Not for opening a PR; use /ai-pr instead. Not for branch hygiene; use /ai-branch-cleanup instead.