debugging-workflows

Name: debugging-workflows
Author: github/gh-aw
$npx mdskill add github/gh-aw/debugging-workflows
Diagnose GitHub Agentic Workflow failures using logs and audits.
Analyzes run artifacts and failure patterns to resolve workflow issues.
Integrates with GitHub Actions and the gh CLI for log retrieval.
Selects relevant data based on user-specified timeframes or run IDs.
Outputs structured logs and audit reports for programmatic or manual review.
SKILL.md
.github/skills/debugging-workflowsView on GitHub ↗
---
name: debugging-workflows
description: Debug gh-aw workflows using run logs, audits, and failure triage.
---


# Debugging GitHub Agentic Workflows

Use this guide to debug GitHub Agentic Workflows: download and analyze logs, audit runs, and trace workflow behavior.

## Table of Contents

- [Quick Start](#quick-start)
- [Downloading Workflow Logs](#downloading-workflow-logs)
- [Auditing Specific Runs](#auditing-specific-runs)
- [How Agentic Workflows Work](#how-agentic-workflows-work)
- [Common Issues and Solutions](#common-issues-and-solutions)
- [Advanced Debugging Techniques](#advanced-debugging-techniques)
- [Reference Commands](#reference-commands)

## Quick Start

### Download Logs from Recent Runs

```bash
# Download logs from the last 24 hours
gh aw logs --start-date -1d -o /tmp/workflow-logs

# Download logs for a specific workflow
gh aw logs weekly-research --start-date -1d

# Download logs with JSON output for programmatic analysis
gh aw logs --json
```

### Audit a Specific Run

```bash
# Audit by run ID
gh aw audit 1234567890

# Audit from a GitHub Actions URL
gh aw audit https://github.com/owner/repo/actions/runs/1234567890

# Audit with JSON output
gh aw audit 1234567890 --json
```

## Downloading Workflow Logs

The `gh aw logs` command downloads workflow run artifacts and logs from GitHub Actions for analysis.

### Basic Usage

```bash
# Download logs for all workflows (last 10 runs)
gh aw logs

# Download logs for a specific workflow
gh aw logs <workflow-name>

# Download with custom output directory
gh aw logs -o ./my-logs
```

### Filter Options

```bash
# Filter by date range
gh aw logs --start-date 2024-01-01 --end-date 2024-01-31
gh aw logs --start-date -1w                    # Last week
gh aw logs --start-date -1mo                   # Last month

# Filter by AI engine
gh aw logs --engine copilot
gh aw logs --engine claude
gh aw logs --engine codex

# Filter by count
gh aw logs -c 5                                # Last 5 runs

# Filter by branch/tag
gh aw logs --ref main
gh aw logs --ref feature-xyz

# Filter by run ID range
gh aw logs --after-run-id 1000 --before-run-id 2000

# Filter firewall-enabled runs
gh aw logs --firewall                          # Only firewall-enabled
gh aw logs --no-firewall                       # Only non-firewall
```

### Output Options

```bash
# Generate JSON summary
gh aw logs --json

# Parse agent logs and generate Markdown reports
gh aw logs --parse

# Generate Mermaid tool sequence graph
gh aw logs --tool-graph

# Set download timeout
gh aw logs --timeout 300                       # 5 minute timeout
```

### Downloaded Artifacts

When you run `gh aw logs`, the following artifacts are downloaded for each run:

| File | Description |
|------|-------------|
| `aw_info.json` | Engine configuration and workflow metadata |
| `safe_output.jsonl` | Agent's final output content (when non-empty) |
| `agent_output/` | Agent logs directory |
| `agent-stdio.log` | Agent standard output/error logs |
| `aw.patch` | Git patch of changes made during execution |
| `workflow-logs/` | GitHub Actions job logs (organized by job) |
| `summary.json` | Complete metrics and run data for all runs |

### Example: Analyze Recent Failures

```bash
# Download failed runs from last week
gh aw logs --start-date -1w -o /tmp/debug-logs

# Check the summary for patterns
cat /tmp/debug-logs/summary.json | jq '.runs[] | select(.conclusion == "failure")'
```

## Auditing Specific Runs

The `gh aw audit` command investigates a single workflow run in detail, downloading artifacts, detecting errors, and generating a report.

### Basic Usage

```bash
# Audit by numeric run ID
gh aw audit 1234567890

# Audit from GitHub Actions URL
gh aw audit https://github.com/owner/repo/actions/runs/1234567890

# Audit from job URL (extracts first failing step)
gh aw audit https://github.com/owner/repo/actions/runs/1234567890/job/9876543210

# Audit from job URL with specific step
gh aw audit https://github.com/owner/repo/actions/runs/1234567890/job/9876543210#step:7:1
```

### Output Options

```bash
# JSON output for programmatic analysis
gh aw audit 1234567890 --json

# Custom output directory
gh aw audit 1234567890 -o ./audit-reports

# Parse agent logs and firewall logs
gh aw audit 1234567890 --parse

# Verbose output
gh aw audit 1234567890 -v
```

### Audit Report Contents

The audit command provides:

- **Error Detection**: Errors and warnings from workflow logs
- **MCP Tool Usage**: Statistics on tool calls by the AI agent
- **Missing Tools**: Tools the agent tried to use but weren't available
- **Execution Metrics**: Duration, token usage, and cost information
- **Safe Output Analysis**: What GitHub operations were attempted

### Example: Investigate a Failed Run

```bash
# Get detailed audit report
gh aw audit 1234567890 --json > audit.json

# Extract key information
cat audit.json | jq '{
  status: .status,
  conclusion: .conclusion,
  errors: .errors,
  missing_tools: .missing_tools,
  tool_usage: .tool_usage
}'
```

## How Agentic Workflows Work

Understanding the workflow architecture helps in debugging.

### Workflow Structure

Agentic workflows use a **markdown + YAML frontmatter** format:

```markdown
---
on:
  issues:
    types: [opened]
permissions:
  issues: write
timeout-minutes: 10
engine: copilot
tools:
  github:
    mode: remote
    toolsets: [default]
safe-outputs:
  create-issue:
    labels: [ai-generated]
---

# Workflow Title

Natural language instructions for the AI agent.

Use GitHub context like ${{ github.event.issue.number }}.
```

### Execution Flow

```
1. Trigger Event (issue opened, PR created, schedule, etc.)
     ↓
2. Activation Job
   - Validates permissions
   - Processes mcp-scripts
   - Sanitizes context
     ↓
3. AI Agent Job
   - Loads MCP servers and tools
   - Executes AI agent with prompt
   - Agent makes tool calls
   - Agent produces output
     ↓
4. Safe Outputs Job
   - Processes agent output
   - Creates GitHub resources (issues, PRs, etc.)
   - Applies labels, comments
     ↓
5. Completion
   - Workflow summary generated
   - Artifacts uploaded
```

### Key Components

| Component | Purpose | Configuration |
|-----------|---------|---------------|
| **Engine** | AI model to use | `engine: copilot`, `claude`, `codex` |
| **Tools** | APIs available to agent | `tools:` section with MCP servers |
| **MCP Scripts** | Context passed to agent | `mcp-scripts:` with GitHub expressions |
| **Safe-Outputs** | Resources agent can create | `safe-outputs:` with allowed operations |
| **Permissions** | GitHub token permissions | `permissions:` block |
| **Network** | Allowed network access | `network:` with domain/ecosystem lists |

### Compilation Process

```bash
# Compile workflow to GitHub Actions YAML
gh aw compile <workflow-name>

# Result: .github/workflows/<name>.md → .github/workflows/<name>.lock.yml
```

The `.lock.yml` file is the actual GitHub Actions workflow that runs.

## Common Issues and Solutions

### Missing Tool Errors

**Symptoms**:
- Error: "Tool 'github:read_issue' not found"
- Agent cannot access GitHub APIs

**Solution**: Add GitHub MCP server configuration:

```yaml
tools:
  github:
    mode: remote
    toolsets: [default]
```

### Permission Errors

**Symptoms**:
- HTTP 403 (Forbidden) errors
- "Resource not accessible" errors

**Solution**: Add required permissions:

```yaml
permissions:
  contents: read
  issues: write
  pull-requests: write
```

### Safe-Input Errors

**Symptoms**:
- "missing tool configuration for mcpscripts-gh"
- Environment variable not available

**Solution**: Configure mcp-scripts:

```yaml
mcp-scripts:
  issue:
    script: |
      return { title: process.env.ISSUE_TITLE, body: process.env.ISSUE_BODY };
    env:
      ISSUE_TITLE: ${{ github.event.issue.title }}
      ISSUE_BODY: ${{ github.event.issue.body }}
```

### Safe-Output Errors

**Symptoms**:
- Agent tries to create resources but fails
- "Safe output not enabled" errors

**Solution**: Enable safe-outputs:

```yaml
safe-outputs:
  staged: false  # Set to false to actually create resources
  create-issue:
    labels: [ai-generated]
```

### Network Access Errors

**Symptoms**:
- Firewall denials
- URLs appearing as "(redacted)"

**Solution**: Configure network access:

```yaml
network:
  allowed:
    - defaults
    - python    # For PyPI
    - node      # For npm
    - "api.example.com"  # Custom domains
```

### Timeout Errors

**Symptoms**:
- Workflow exceeds time limit
- Agent loops or hangs

**Solution**: Increase timeout or optimize prompt:

```yaml
timeout-minutes: 30  # Increase from default
```

## Advanced Debugging Techniques

### Polling In-Progress Runs

When a run is still executing:

```bash
# Poll until completion
while true; do
  output=$(gh aw audit <run-id> --json 2>&1)
  if echo "$output" | grep -q '"status":.*"\(completed\|failure\|cancelled\)"'; then
    echo "$output"
    break
  fi
  echo "⏳ Run still in progress. Waiting 45 seconds..."
  sleep 45
done
```

### Inspecting MCP Configuration

```bash
# Inspect MCP servers for a workflow
gh aw mcp inspect <workflow-name>

# List all workflows with MCP servers
gh aw mcp list
```

### Checking Workflow Status

```bash
# Show status of all agentic workflows
gh aw status
```

### Downloading Specific Artifacts

```bash
# Download only the agent log artifact
GH_REPO=owner/repo gh run download <run-id> -n agent-stdio.log
```

### Inspecting Job Logs

```bash
# View specific job logs
gh run view <run-id>
gh run view --job <job-id> --log
```

### Analyzing Firewall Logs

```bash
# Parse firewall logs for network issues
gh aw logs --parse

# Check firewall-enabled runs
gh aw logs --firewall
```

### Debug Mode Compilation

```bash
# Compile with verbose output
gh aw compile --verbose

# Compile with strict security checks
gh aw compile --strict

# Run security scanners
gh aw compile --actionlint --zizmor --poutine
```

## Reference Commands

### Log Analysis Commands

| Command | Description |
|---------|-------------|
| `gh aw logs` | Download logs for all workflows |
| `gh aw logs <workflow>` | Download logs for specific workflow |
| `gh aw logs --json` | Output as JSON |
| `gh aw logs --start-date -1d` | Filter by date |
| `gh aw logs --engine copilot` | Filter by engine |
| `gh aw logs --parse` | Generate Markdown reports |

### Audit Commands

| Command | Description |
|---------|-------------|
| `gh aw audit <run-id>` | Audit specific run |
| `gh aw audit <url>` | Audit from GitHub URL |
| `gh aw audit <run-id> --json` | Output as JSON |
| `gh aw audit <run-id> --parse` | Parse logs to Markdown |

### MCP Commands

| Command | Description |
|---------|-------------|
| `gh aw mcp list` | List workflows with MCP servers |
| `gh aw mcp inspect <workflow>` | Inspect MCP configuration |

### Status Commands

| Command | Description |
|---------|-------------|
| `gh aw status` | Show all workflow status |
| `gh aw compile` | Compile all workflows |
| `gh aw compile <workflow>` | Compile specific workflow |
| `gh aw compile --strict` | Compile with security checks |

### Workflow Execution Commands

| Command | Description |
|---------|-------------|
| `gh aw run <workflow>` | Trigger workflow manually |
| `gh workflow run <name>.lock.yml` | Alternative trigger method |
| `gh run watch <run-id>` | Monitor running workflow |

## Additional Resources

- [Workflow Health Monitoring Runbook](../../aw/runbooks/workflow-health.md) - Step-by-step investigation procedures
- [Common Issues Reference](../../../docs/src/content/docs/troubleshooting/common-issues.md) - Frequently encountered issues
- [Error Reference](../../../docs/src/content/docs/troubleshooting/errors.md) - Error codes and solutions
- [GitHub MCP Server Documentation](../../../skills/github-mcp-server/SKILL.md) - Tool configuration reference