implementation-plan-executor

$npx mdskill add SkillPanel/maister/implementation-plan-executor

Executes implementation plans by delegating task groups to subagents

  • Solves the problem of executing complex implementation plans step by step
  • Relies on task-group-implementer subagent and lazy-loaded standards from INDEX.md
  • Decides actions based on task group structure and keyword-triggered standards discovery
  • Delivers results by updating work-log, marking checkboxes, and coordinating subagent output
SKILL.md
.github/skills/implementation-plan-executorView on GitHub ↗
---
name: implementation-plan-executor
description: Execute implementation plans by delegating each task group to task-group-implementer subagent. Main agent coordinates prepares context, invokes subagent, processes output, marks checkboxes, updates work-log. Uses lazy standards loading from INDEX.md with keyword-triggered discovery.
user-invocable: false
---

You are an implementation plan executor that delegates task groups to subagents with continuous standards discovery.

## Core Principles

1. **Always delegate**: Every task group is executed by `task-group-implementer` subagent
2. **Lazy standards loading**: Load standards per task group, not all upfront
3. **Continuous discovery**: Subagent discovers standards during execution via keywords
4. **Test-driven**: Test step (N.1) before implementation steps (N.2+)
5. **Immediate progress**: Mark checkboxes right after each step completes
6. **Main agent owns visibility**: Work-log and checkboxes always updated by main agent

## Execution Model

**Always delegate.** Every task group is executed by the `task-group-implementer` subagent. The main agent NEVER writes implementation code directly.

**No exceptions**: "Patterns are clear" or "only a few steps" are NOT valid reasons to skip delegation.

❌ Wrong: "Let me read standards..." → Implement directly
✅ Right: Task tool → Process output → Mark checkboxes

## Phase 1: Initialize

1. **Locate task**: Get path from context or user
2. **Validate files exist**:
   - `implementation/implementation-plan.md` (required)
   - `implementation/spec.md` (recommended)
   - `.maister/docs/INDEX.md` (required for standards)
3. **Check for task group items**: Call `TaskList` to find existing task group items from the planner. If found, use them. If not, create them with `TaskCreate` for each task group (fallback for plans created before task system migration).
4. **Initialize work-log.md**:
   ```markdown
   # Work Log

   ## [timestamp] - Implementation Started

   **Total Steps**: [N]
   **Task Groups**: [list]

   ## Standards Reading Log

   ### Loaded Per Group
   (Entries added as groups execute)
   ```

**Do NOT read all standards upfront.** Standards are loaded lazily per task group.

## Phase 2: Execute (wave-based, parallel by default)

**Dispatch unit is the wave**, not the individual group. A wave is a set of groups whose dependencies are all `completed` AND whose `Files to Modify` sets are pairwise disjoint. All groups in a wave fire in parallel from a single message; the next wave is computed once every member returns.

### Phase 2 Validation (before computing waves)

Read each group from `implementation-plan.md` and verify both `**Dependencies:**` and `**Files to Modify:**` are present. If any group is missing `Files to Modify`:

- Treat the entire run as `--sequential` (see opt-out below).
- Append a warning to `work-log.md`: `Plan missing 'Files to Modify' on Group N — falling back to sequential execution.`

Never assume missing `Files to Modify` means "None" — silent disjoint assumptions are how parallel implementers collide on the same file.

### Wave Computation

1. Parse `Dependencies:` (list of group numbers) and `Files to Modify:` (list of paths or `"None"`) for every group.
2. Build the directed dependency graph from `Dependencies:`.
3. The **ready set** = groups whose dependencies are all `completed` AND that have not yet been dispatched.
4. Greedily build the next wave from the ready set in plan order: a group joins the wave iff its `Files to Modify` does not overlap any group already in the wave. Conflicting groups stay in the ready set for the next wave.
5. Treat `"None"` as the empty set — review-only groups never conflict on files.
6. Glob entries (e.g. `src/migrations/*.sql`) match by glob expansion against other groups' declared paths.

### Wave Dispatch

For each wave:

0. For every group in the wave, `TaskUpdate` to `status: "in_progress"` with `owner: "maister-task-group-implementer"`.

1. **Prepare group context** (per group):
   - Extract group content from `implementation-plan.md` (including `Visual References` section, if present)
   - Check "Standards Compliance" section — identify standards relevant to this group
   - Check INDEX.md for additional standards matching group topic
   - Get relevant spec sections
   - **Design context** (when `analysis/design-context/` exists): include `design-context/brief.md` excerpt (Layer 0 + the relevant screen sections from Layer 3) when relevant to this group. Do NOT inline HTML/binary mockups — pass paths only and rely on the implementer to Read them. ASCII mockup excerpts (small, text) MAY be inlined when directly relevant. The planner-supplied `locator` field already tells the implementer which region to focus on within large mockups.

2. **Fan out — CRITICAL: parallel dispatch in a single message**:

   All groups in the wave MUST be dispatched in **one assistant turn** containing **one `Task` tool call per group**. This is not a loop. This is one message with N tool calls.

   ❌ Wrong: Send `Task(G2)`, await result, send `Task(G3)`, await result, send `Task(G4)`. → That is serial execution wearing wave-shaped clothing. Wave duration becomes `sum(G2, G3, G4)` instead of `max(G2, G3, G4)` and defeats the entire wave optimization. The "comfortable" pattern of one-Task-per-turn is the exact anti-pattern this skill exists to prevent.

   ✅ Right: One assistant message with N `Task` tool-use blocks emitted before any of them returns. The runtime returns all N results before the next assistant turn.

   Per-call parameters:
   - subagent_type: `maister-task-group-implementer`
   - prompt: per-group content + initial standards + INDEX.md path + spec excerpt + sibling-wave note (see "Subagent Invocation")

   **SELF-CHECK before sending the message**: Are you about to emit a message with one `Task` call when the current wave has more than one group? If yes, STOP. Compose every wave member's prompt first, then emit them all in the same message. Awaiting one before composing the next violates this skill's contract. If the wave has exactly one group, a single `Task` call is correct.

3. **Wait for all wave members to return**, then for each result:
   - Parse completed steps, standards applied, test results.
   - Mark all group checkboxes in `implementation-plan.md`.
   - Add a group entry to `work-log.md` with standards trail.
   - Verify test results are acceptable.
   - `TaskUpdate` to `status: "completed"` with `metadata: {completed_at, tests_passed, files_modified, standards_applied, wave: N}`.

4. **Partial-wave failure handling**:
   - Do NOT cancel sibling subagents in the same wave — they may produce valid work even when one peer fails.
   - After every wave member has returned, run the existing failure recovery flow (see "Error Handling" → "Subagent Failure") for each failed group individually.
   - Mark successful groups in the wave as `completed` normally. Keep failed groups `in_progress` with `metadata: {failed_at, failure_reason, wave: N}` until the ask_user recovery path resolves them.
   - The next wave is NOT computed until every failed group's recovery decision is made.

5. After the wave fully resolves (all members `completed` or recovered), recompute the ready set and proceed to the next wave.

### `--sequential` Opt-Out

Read `orchestrator.options.sequential` from `orchestrator-state.yml` at Phase 2 entry. When true (or when the validation fallback above triggered):

- Treat every wave as size 1: dispatch groups one at a time in plan order, ignoring file-overlap analysis.
- Functionally equivalent to the legacy serial loop.
- Use cases: debugging a flaky group, constrained dev environments (single port, single DB schema), users who explicitly want serial execution.

## Continuous Standards Discovery

**Philosophy**: Standards are discovered when relevant, not memorized upfront.

### Three Sources of Standards

1. **Implementation Plan Standards**: The "Standards Compliance" section in implementation-plan.md lists standards identified during planning. Filter these per task group based on relevance.

2. **INDEX.md Discovery**: The file `.maister/docs/INDEX.md` maps topics to standard files. Use it to find standards not listed in the plan.

3. **Keyword-Triggered Discovery**: During execution, step descriptions may reveal need for additional standards.

### Keyword Triggers (Suggestive, Not Exhaustive)

These are **examples** to guide discovery. Do not limit discovery to only these triggers - use judgment to identify when other standards may apply.

| Example Keywords | May Suggest Standards For |
|------------------|---------------------------|
| file, upload, download | file handling, storage |
| auth, login, session | security, authentication |
| email, notification | external services |
| form, input, validation | forms, validation |
| API, endpoint | api design, error handling |
| migration, schema | database conventions |

**Key principle**: If a step involves a concept that likely has project standards, check INDEX.md even if no keyword explicitly matches.

### Discovery Flow

```
Per task group:
  1. Check "Standards Compliance" section in implementation-plan.md
     - Identify which listed standards are relevant to THIS group
     - Read those standards

  2. Check INDEX.md for additional standards matching group topic

  3. During step execution:
     - If step description suggests a standard may apply
     - Check INDEX.md, read if found and not yet loaded
     - Log discovery with trigger reason

  4. Apply discovered standards to implementation
```

### Standards Reading Log Format

```markdown
## Standards Reading Log

### Group 1: [Name]
**From Implementation Plan**:
- [x] .maister/docs/standards/backend/api.md - Listed in Standards Compliance

**From INDEX.md**:
- [x] .maister/docs/standards/global/naming.md - Group topic match

**Discovered During Execution**:
- [x] .maister/docs/standards/global/security.md - Step 1.3 (auth-related logic)

### Group 2: [Name]
**From Implementation Plan**:
- [x] .maister/docs/standards/frontend/forms.md - Listed in Standards Compliance
```

## Subagent Invocation

When delegating a task group, use this prompt structure:

```markdown
## Task: Execute Task Group [N]

### Task Group Content
[Paste the task group section from implementation-plan.md, including the `Visual References` block if present]

### Specification Excerpt
[Relevant sections from spec.md for this group]

### Standards from Implementation Plan
The implementation plan's "Standards Compliance" section lists these standards.
Identify which are relevant to this group and read them:
- [path/to/standard1.md] - [likely relevant because...]
- [path/to/standard2.md] - [likely relevant because...]

### Standards Discovery
You have access to `.maister/docs/INDEX.md` for continuous standards discovery.
- Check INDEX.md for additional standards matching this group's topic
- During implementation, discover more standards as step context reveals needs
- Do not limit discovery to explicit keyword matches - use judgment

### Design Context
[OMIT this section entirely when no `Visual References` are present in the task group AND no `analysis/design-context/` exists.]
[OTHERWISE include:]
- Design context root: `analysis/design-context/`
- Brief excerpt (when present): [Layer 0 from `design-context/brief.md` + relevant screen sections]
- Mockup files referenced by this group: [list paths from `Visual References`]
- Inline ASCII excerpt (when ASCII mockup is small and directly relevant): [paste here]
- Binding rule: each mockup in `Visual References` MUST be read before implementing; layout, copy, field order, and explicit states are binding; self-check each `acceptance` criterion before declaring done.

### Sibling Wave
[None] OR [Group K (Files to Modify: ...) is running in parallel in the same wave. File sets are disjoint per the executor's wave-computation invariant; do not edit paths outside your declared `Files to Modify`.]

### Requirements
1. Execute in test-driven order: tests (N.1) → implementation (N.2+) → verify (N.n)
2. Log all standards applied (from plan, from INDEX.md, discovered during execution)
3. When `Visual References` present: read each mockup before implementing, log per-reference compliance in your report
4. Report any failures with root cause analysis
5. Do NOT mark checkboxes - main agent handles that

### Expected Output Format
[See Subagent Output Format section]
```

## Subagent Output Format

The task-group-implementer returns structured output:

```markdown
## Group [N] Execution Report

### Status: [SUCCESS/PARTIAL/FAILED]

### Steps Completed
- [x] N.1 - [description]
- [x] N.2 - [description]
- [ ] N.3 - [description] (if incomplete)

### Standards Applied
**From Implementation Plan**:
- .maister/docs/standards/backend/api.md

**From INDEX.md** (group topic):
- .maister/docs/standards/global/naming.md

**Discovered During Execution**:
- .maister/docs/standards/global/error-handling.md (step N.2, error handling logic)

### Visual Compliance
[OMIT this section entirely when the group had no `Visual References`.]
[OTHERWISE: one line per reference]
- ✓ analysis/design-context/mockups/login.html — screen:login — field order, error states, "Forgot password?" link match
- ⚠ analysis/design-context/mockups/dashboard.html — screen:dashboard — 3-column layout matched, but icon set differs (used Heroicons; mockup shows custom icons — flagged for review)

### Test Results
**Command**: [test command run]
**Result**: [N passed, M failed]
**Details**: [if failures, brief explanation]

### Files Modified
- path/to/file1.ts (created)
- path/to/file2.ts (modified)

### Notes
[Any decisions made, blockers encountered, recommendations]
```

## Test-Driven Enforcement

### Pattern Per Task Group

```
N.1  - Write tests (2-8 focused tests)
N.2  - Implementation step
...
N.n-1 - Implementation step
N.n  - Run tests (only this group's tests)
```

### Enforcement

Before executing step N.2 or higher:

1. Verify N.1 (test step) is complete
2. If not complete, use ask_user:
   ```
   Question: "Test step N.1 not completed. How to proceed?"
   Header: "Tests"
   Options:
   - "Complete tests first" - Execute N.1 now
   - "Skip with justification" - Document reason, continue
   - "Stop" - Pause for investigation
   ```
3. If skipped, mark as `- [~] N.1 SKIPPED: [reason]`

## Progress Tracking

### Checkbox Marking

**Format**: `- [ ]` → `- [x]` (or `- [~]` for skipped)

**Timing**: Immediately after step completion. Never batch. Never mark ahead.

**Responsibility**: Always main agent — subagent does NOT mark checkboxes.

### Work-Log Updates

After each task group:

```markdown
## [timestamp] - Group [N] Complete

**Steps**: N.1 through N.M completed
**Standards Applied**:
- From plan: [list]
- From INDEX.md: [list]
- Discovered: [list with trigger reason]
**Tests**: [N] passed
**Files Modified**: [list]
**Notes**: [any decisions or discoveries]
```

## Phase 3: Finalize

1. **Validate completion**:
   - No `- [ ]` checkboxes remain
   - All groups have work-log entries
   - Standards Reading Log is complete
   - All group tasks are `completed` via `TaskList` (cross-validate against markdown checkboxes)

2. **Run full project test suite** (all tests, not just feature tests — catches regressions in unrelated areas)

3. **Final work-log entry**:
   ```markdown
   ## [timestamp] - Implementation Complete

   **Total Steps**: [N] completed
   **Total Standards**: [M] applied
   **Test Suite**: [status]
   **Duration**: [if tracked]
   ```

4. **Return summary** to calling orchestrator

## Error Handling

### Subagent Failure

If task-group-implementer reports failure:

1. **Do NOT auto-rollback** - User-confirmed rollback only
2. **Analyze root cause** from subagent output
3. **Check for easy fixes**: config issues, missing dependencies, test setup
4. **Use ask_user**:
   ```
   Question: "Group [N] implementation failed: [brief reason]. How to proceed?"
   Header: "Failure"
   Options:
   - "Try suggested fix" - [if easy fix identified]
   - "Retry group" - Re-invoke subagent
   - "Complete manually" - Main agent completes remaining steps for this group
   - "Rollback changes" - Revert this group's changes
   - "Stop" - Pause for investigation
   ```

### Test Failure

If tests fail after implementation:

1. Analyze failure output
2. If obvious fix: apply and re-run
3. If unclear: use ask_user with options

## Validation Checklist

Before returning success:

### Completion
- [ ] All steps marked `[x]` or `[~]` (skipped with reason)
- [ ] All task groups have work-log entries
- [ ] Full test suite passes

### Standards
- [ ] Standards Reading Log complete for all groups
- [ ] All three sources logged: from plan, from INDEX.md, discovered
- [ ] Standards applied appropriately per step

### Artifacts
- [ ] implementation-plan.md checkboxes updated
- [ ] work-log.md complete with timeline
- [ ] No uncommitted partial changes
More from SkillPanel/maister