checkpoint-resume

Name: checkpoint-resume
Author: yonatangross/orchestkit

$npx mdskill add yonatangross/orchestkit/checkpoint-resume

Orchestrates long multi-phase tasks with checkpointing to survive rate limits and interruptions.

Helps manage complex workflows involving commits, GitHub issues, and large file changes.
Integrates with Bash, file operations, and task management tools for execution.
Follows strict rules for phase ordering, state writes, and mini-commits to ensure resilience.
Saves progress to a JSON file and displays status via scripts for easy resumption.

SKILL.md

.github/skills/checkpoint-resumeView on GitHub ↗

---
name: checkpoint-resume
description: Rate-limit-resilient pipeline with checkpoint/resume for long multi-phase sessions. Saves progress to .claude/pipeline-state.json after each phase. Use when starting a complex multi-phase task that risks hitting rate limits, when resuming an interrupted session, or when orchestrating work spanning commits, GitHub issues, and large file changes.
tags: [resilience, checkpoint, pipeline, orchestkit]
version: 2.0.0
author: OrchestKit
user-invocable: false
disable-model-invocation: true
context: fork
complexity: high
persuasion-type: guidance
effort: high
allowed-tools: [Bash, Read, Write, Edit, Grep, Glob, TaskCreate, TaskUpdate, TaskList, TaskOutput]
---

# Checkpoint Resume

Rate-limit-resilient pipeline orchestrator. Saves progress to `.claude/pipeline-state.json` after every phase so long sessions survive interruptions.

## Quick Reference

| Category | Rule | Impact | Key Pattern |
|----------|------|--------|-------------|
| Phase Ordering | `${CLAUDE_SKILL_DIR}/rules/ordering-priority.md` | CRITICAL | GitHub issues/commits first, file-heavy phases last |
| State Writes | `${CLAUDE_SKILL_DIR}/rules/state-write-timing.md` | CRITICAL | Write after every phase, never batch |
| Mini-Commits | `${CLAUDE_SKILL_DIR}/rules/checkpoint-mini-commit.md` | HIGH | Every 3 phases, checkpoint commit format |

**Total: 3 rules across 3 categories**

## On Invocation

**If `.claude/pipeline-state.json` exists:** run `scripts/show-status.sh` to display progress, then ask to resume, pick a different phase, or restart. Load `Read("${CLAUDE_SKILL_DIR}/references/resume-decision-tree.md")` for the full decision tree.

**If no state file exists:** ask the user to describe the task, build an execution plan, write initial state via `scripts/init-pipeline.sh <branch>`, begin Phase 1.

## Execution Plan Structure

```json
{
  "phases": [
    { "id": "create-issues", "name": "Create GitHub Issues", "dependencies": [], "status": "pending" },
    { "id": "commit-scaffold", "name": "Commit Scaffold", "dependencies": [], "status": "pending" },
    { "id": "write-source", "name": "Write Source Files", "dependencies": ["commit-scaffold"], "status": "pending" }
  ]
}
```

Phases with empty `dependencies` may run in parallel via Task sub-agents (when they don't share file writes).

## After Each Phase

1. Update `.claude/pipeline-state.json` — see `Read("${CLAUDE_SKILL_DIR}/rules/state-write-timing.md")`
2. Every 3 phases: create a mini-commit — see `Read("${CLAUDE_SKILL_DIR}/rules/checkpoint-mini-commit.md")`

## References

Load on demand with `Read("${CLAUDE_SKILL_DIR}/references/<file>")`:
| File | Content |
|------|---------|
| `references/pipeline-state-schema.md` | Full field-by-field schema with examples |
| `references/pipeline-state.schema.json` | Machine-readable JSON Schema for validation |
| `references/resume-decision-tree.md` | Logic for resuming, picking phases, or restarting |

## Scripts

- `scripts/init-pipeline.sh <branch>` — print skeleton state JSON to stdout
- `scripts/show-status.sh [path]` — print human-readable pipeline status (requires `jq`)

## Key Decisions

| Decision | Recommendation |
|----------|----------------|
| Phase granularity | One meaningful deliverable per phase (a commit, a set of issues, a feature) |
| Parallelism | Task sub-agents only for phases with empty `dependencies` that don't share file writes |
| Rate limit recovery | State is already saved — re-invoke `/checkpoint-resume` to continue |