ai-autopilot
$
npx mdskill add arcasilesgroup/ai-engineering/ai-autopilotAutonomous execution of large approved specs via a 6-phase pipeline. Decomposes a spec into N focused sub-specs, deep-plans each with parallel agents, orchestrates a dependency-aware execution DAG, implements in waves, runs a single final verify+guard+review pass with one bounded quality-remediation pass on the full changeset, and delivers the full changeset via PR with a transparency report. Execution is not complete until sub-spec work converges into a final delivery PR against protected main. One invocation, zero interruptions, full disclosure.
SKILL.md
.github/skills/ai-autopilotView on GitHub ↗
--- name: ai-autopilot description: "Delivers large multi-concern specs and backlog runs autonomously: decomposes specs into sub-specs (or normalizes work items into a backlog DAG), deep-plans with parallel agents, builds a dependency DAG, implements in waves, runs a single final quality loop with one bounded quality-remediation pass (verify+guard+review on full changeset), delivers via PR. Trigger for 'implement spec-NNN end to end', 'autopilot this', 'autonomous delivery', 'decompose and ship', 'run the backlog', 'execute these GitHub issues', 'process the sprint backlog'. Invocation is the approval gate. Not for small or single-concern tasks; use /ai-build instead. Not for ambiguous requirements; use /ai-brainstorm first." effort: high argument-hint: "'implement spec-NNN'|--backlog --source <github|ado|local>|--resume|--no-watch" mode: agent tags: [orchestration, autonomous, multi-spec, backlog, pipeline, execution, dag, transparency] model_tier: opus mirror_family: copilot-skills generated_by: ai-eng sync canonical_source: .claude/skills/ai-autopilot/SKILL.md edit_policy: generated-do-not-edit --- # Autopilot v2 ## Purpose Autonomous execution of large approved specs via a 6-phase pipeline. Decomposes a spec into N focused sub-specs, deep-plans each with parallel agents, orchestrates a dependency-aware execution DAG, implements in waves, runs a single final verify+guard+review pass with one bounded quality-remediation pass on the full changeset, and delivers the full changeset via PR with a transparency report. Execution is not complete until sub-spec work converges into a final delivery PR against protected main. One invocation, zero interruptions, full disclosure. ## When to Use - Spec has >= 3 independent concerns or touches >= 10 files - After `/ai-brainstorm` approval (spec.md exists and is not a placeholder) - Backlog runs with `--backlog --source <github|ado|local>` to absorb GitHub issues, Azure Boards items, or markdown task lists into the DAG (D-127-12; replaces the legacy `/ai-run` skill) - When manual `/ai-build` would overflow context within a single session - Multi-concern work that benefits from parallel intelligence gathering and DAG-driven execution ## When NOT to Use - < 3 concerns -- use `/ai-build` directly (autopilot overhead not justified) - Draft or unapproved spec -- run `/ai-brainstorm` first - Need human review between phases -- use `/ai-build` with manual checkpoints - Cross-repo changes -- coordinate manually - Data migrations with destructive DDL -- require explicit user approval per step ## Process **Step 0 — Validate**: confirm `.ai-engineering/specs/spec.md` is not a placeholder (else STOP and report `/ai-brainstorm`). On `--resume`, read `.ai-engineering/runtime/autopilot/manifest.md` and re-enter at the Resume Protocol. Load stack contexts (manifest `providers.stacks` + `.ai-engineering/overrides/<stack>/conventions.md`) and pass paths (not content) to subagents. plan.md is not required — Phase 2 agents generate their own. `ai-eng host probe` is diagnostic/advisory only; `ok_to_dispatch` is not a standard-flow execution gate and cannot block `/ai-autopilot`. **Step 1 — DECOMPOSE** (`handlers/phase-decompose.md`): extract N independent concerns; abort if N<3 (recommend `/ai-build`); write sub-spec dirs and the execution manifest. **Step 2 — DEEP PLAN** (`handlers/phase-deep-plan.md`): dispatch explore+plan agents in parallel; each enriches `sub-NNN/spec.md` (Exploration) + `plan.md` (checkbox tasks with exports/imports). Failed agents retry once → mark `plan-failed`. **Step 3 — ORCHESTRATE** (`handlers/phase-orchestrate.md`): build the file-overlap matrix and import-chain graph from Phase-2 evidence (never from spec text alone); construct the wave-assigned DAG; merge unresolvable conflicts. **Step 4 — IMPLEMENT** (`handlers/phase-implement.md`): per-wave kernel from `.github/skills/_shared/execution-kernel.md`. Dispatch build agents per sub-spec in parallel within a wave; each task self-validates via TDD; wave-end guard advisory remains for governance. Collect Self-Reports + per-wave commits. Cascade-block dependents of failed sub-specs. **Step 5 — QUALITY LOOP** (`handlers/phase-quality.md`): read ai-verify / ai-review / ai-governance SKILL.md once at loop entry; dispatch verify+guard+review in parallel on the full changeset; consolidate findings (unified severity). Clean → Phase 6. Blocker/critical/high findings enter Phase 5b only if the one bounded quality-remediation pass is unused and the fixes are finding-scoped. Remaining blocker/critical/high findings after final reassessment → STOP + escalate to user. **Phase 5b — BOUNDED REMEDIATION** (`handlers/phase-quality.md`): persist `quality_remediation.max_attempts: 1` in the autopilot manifest, map each finding to `sub-NNN`, `integration`, or `shared`, run cross-platform focal reproducers, and then return to Step 5 final reassessment. No re-decompose, no re-plan, no second remediation pass. **Step 6 — DELIVER** (`handlers/phase-deliver.md`): build the Integrity Report; follow `/ai-pr` SKILL.md; cleanup runtime dir; clear spec.md + plan.md; verify cleanup. `--resume` handles mid-pipeline re-entry. ## Flags | Flag | Behavior | | ------------ | ------------------------------------------------------------------------------------------------------------------------------------- | | `--resume` | Read `.ai-engineering/runtime/autopilot/manifest.md`, determine pipeline state, re-enter at the correct phase/wave. Never re-executes completed phases. | | `--no-watch` | Create PR without the watch-and-fix loop. Useful for draft delivery or when CI is managed externally. | | `--backlog --source <github|ado|local>` | Backlog mode (D-127-12, absorbs the legacy `/ai-run` skill). Normalizes work items from the named source into the run model: `github` (Issues + GitHub Projects v2), `ado` (Azure Boards work items), `local` (a markdown task list path). The 6-phase pipeline still runs; Phase 1 DECOMPOSE is replaced by intake + per-item planning, Phase 4 IMPLEMENT dispatches one bounded build per item. | Thin orchestrator: phases READ other skills' SKILL.md and EMBED instructions into subagent prompts (no inline implementation). When those skills improve, autopilot inherits the improvement. ## Dispatch threshold Dispatch the `ai-autopilot` agent when the work matches the "When to Use" criteria above (≥3 concerns, ≥10 files, post-`/ai-brainstorm` approval, or any backlog with `--backlog`). For smaller scope, hand off to `/ai-build` directly. The agent file (`.github/agents/autopilot.agent.md`) is the orchestrator handle; the procedural contract lives in this SKILL.md. ## Governance DEC-023: invocation is the single approval gate; internal gates (sub-spec validation, DAG verification, quality convergence) are automatic and cannot be bypassed. The consolidation path is mandatory: sub-spec branch/worktree → wave or integration commits → final PR → protected main. State transitions live on disk in `.ai-engineering/runtime/autopilot/manifest.md`, never in agent memory. ## Examples See `references/examples.md` for the three canonical autopilot invocations (end-to-end delivery, `--resume` after interruption, `--backlog --source github` lane). Failure-recovery rows + telemetry event taxonomy live in the same reference file. Common mistakes checklist (do not run on draft specs, never cross repos, never carry context across sub-specs, never hand-edit mirrors) is documented there. ## Integration Called by: user directly post-`/ai-brainstorm` approval (or with `--backlog` for backlog runs). Reads: `_shared/execution-kernel.md`, `ai-verify/SKILL.md`, `ai-review/SKILL.md`, `ai-governance/SKILL.md`, `ai-pr/SKILL.md`, `ai-commit/SKILL.md`. Delegates to: `ai-explore`, `ai-build`, `ai-verify`, `ai-advise`, `ai-review` agents. Transitions to: `/ai-branch-cleanup`. See also: `/ai-build` (smaller scope), `/ai-board sync` (lifecycle transitions for backlog mode), `references/examples.md`. $ARGUMENTS
More from arcasilesgroup/ai-engineering
- ai-adviseProactive governance advisor — checks standards, decisions, and quality trends during development. Always advisory, NEVER blocks. Three modes: `advise` (post-edit), `gate` (pre-dispatch), `drift` (on-demand decision audit). Trigger for 'governance check', 'advise on this change', 'check for drift', 'is this aligned with active decisions', 'shift-left advisory'. Not for blocking gates — use /ai-verify. Not for narrative code review — use /ai-review.
- ai-analyze-permissionsUse when Claude Code keeps asking to approve commands you have already approved, when settings.local.json has grown large, or when you want to consolidate permission grants into wildcard patterns. Trigger for 'too many permission prompts', 'clean up permissions', 'audit my settings', 'consolidate allow rules'. Claude Code only — not available in GitHub Copilot, Antigravity, or Codex.
- ai-animationDesigns motion, transitions, and micro-interactions for UI components: spring animations, gestures, easing, staggers — taste-driven detail compounding. Trigger for 'animate this', 'add transitions', 'micro-interactions for', 'gesture design', 'swipe to dismiss', 'easing for this', 'stagger the'. Not for design systems; use /ai-design instead. Not for visual art; use /ai-visual instead. Not for testing animation code; use /ai-test instead.
- ai-boardOperates the project board (GitHub Projects v2 or Azure DevOps): discovers configuration after install (fields, state mappings, process templates) and syncs work-item state at lifecycle transitions. Trigger for 'set up the board', 'configure our ADO board', 'discover board fields', 'move this issue to in-review', 'update the board', 'mark as in progress', 'sync the work item state'. Two subcommands: `discover` (post-install configuration write) and `sync` (lifecycle state transitions). Auto-invoked via `sync` by /ai-brainstorm, /ai-build, and /ai-pr; fail-open. Not for backlog execution; use /ai-autopilot --backlog instead.
- ai-brainstormForces rigorous design interrogation BEFORE any code: explores approaches, surfaces ambiguity, gathers evidence, produces an approved spec that becomes the contract for /ai-plan. Trigger for 'lets add X', 'how should we handle Y', 'whats the best approach', 'I am thinking about', 'what should we build for'. Not for existing approved specs; use /ai-plan instead. Not for execution; use /ai-build instead.
- ai-branch-cleanupCleans branches safely: switches to the default branch, prunes merged and squash-merged branches, syncs to remote, sweeps stale specs, rotates `.ai-engineering/runtime/` per retention policy. Trigger for 'tidy up', 'tidy branches', 'sync to main', 'delete old branches', 'start fresh', 'rotate runtime'. Auto-invoked by /ai-pr after merge. Not for committing changes; use /ai-commit instead. Not for code-level dead-code removal; use /ai-simplify instead.
- ai-buildCanonical implementation gateway: reads approved plan.md, resolves stack from manifest, deterministic-routes each task to its adapter, dispatches the build agent in an isolated worktree, runs TDD self-validation per task, then a single final quality loop with one bounded quality-remediation pass on the full changeset before /ai-pr. Trigger for 'go', 'start building', 'execute the plan', 'implement it', 'lets do this', 'build the plan', 'resume', 'continue'. Not without an approved plan; run /ai-plan first. Not for multi-concern specs needing decomposition; use /ai-autopilot instead. Not for a single function or subcomponent; use /ai-code.
- ai-codeWrites production code that satisfies stack-context standards on the first pass: interface-first design, backward-compatibility checks, lightweight self-review. Trigger for 'implement this', 'write the code for', 'add X to Y', 'build this function', 'make this work'. Not for tests; use /ai-test instead. Not for debugging; use /ai-debug instead. Not for refactoring; use /ai-simplify instead. Not for executing an approved plan end-to-end; use /ai-build (the gateway).
- ai-commitRuns the governed commit pipeline: auto-branches from protected, stages selectively, formats and lints, scans for secrets, gates docs, composes a conventional message, pushes. Trigger for 'commit my changes', 'save my work', 'push this to remote', 'stage these files', 'ship it'. Not for opening a PR; use /ai-pr instead. Not for branch hygiene; use /ai-branch-cleanup instead.
- ai-constitutionInterviews the operator to produce a project-identity CONSTITUTION.md (Mission / Stakeholders / Vocabulary / Prohibitions / Compliance gates / Anti-goals / Boundaries / Escalation / Language / Lifecycle phase). Trigger for 'set up the constitution', 'define project identity', 'who is this project for', 'what does this project never do', 'amend the constitution'. Not for AI-behaviour rules — those live in CANONICAL.md / AGENTS.md. Not for spec governance; use /ai-governance instead.