dig

$npx mdskill add Soul-Brews-Studio/arra-oracle-skills-cli/dig

Mine past session data for timelines and gaps.

  • Recover historical work patterns and project attribution.
  • Scans all JSONL files across every project directory.
  • Filters results by count, timeline grouping, or depth.
  • Outputs structured session metadata with tool counts.
SKILL.md
.github/skills/digView on GitHub ↗
---
name: dig
description: Mine Claude Code sessions — timeline, gaps, repo attribution, session history. Use when user says "dig", "sessions", "past sessions", "timeline", "what did I work on", or wants to see session history. Do NOT trigger for finding code/projects (use /trace), exploring repos (use /learn), or current session status (use /recap).
argument-hint: "[N] | --all | --timeline | --deep"
---

# /dig - Session Goldminer v2

Mine Claude Code session data for timelines, gaps, and repo attribution. No query needed.

## Usage

```
/dig                    # Current repo, 10 most recent
/dig [N]                # Current repo, N most recent
/dig --all              # All repos, ALL sessions (auto-detect count)
/dig --all [N]          # All repos, N most recent
/dig --deep             # Deep scan — ALL .jsonl files (not just most-recent per project)
/dig --deep [N]         # Deep scan, N most recent
/dig --all --deep       # All repos, deep scan — full coverage
/dig --timeline         # Day-by-day grouped (current repo)
/dig --all --timeline   # Day-by-day grouped (all repos, ALL sessions)
```

### Deep Mode (#216)

Standard `/dig` deduplicates by session basename — only showing the most recent file per session. This misses historical sessions, subagent sessions, and worktree sessions.

`--deep` scans ALL `.jsonl` files across all project directories. Output includes extra fields:
- `toolCalls` — number of tool invocations in the session
- `fileSizeKB` — session file size in KB
- Coverage metadata: `"X of Y sessions"` showing how many were returned vs found

### Timezone (#214)

dig.py v2 auto-detects timezone instead of hardcoding GMT+7:
1. `MAW_DISPLAY_TZ` env var (e.g., `Asia/Bangkok`, `7`, `America/New_York`)
2. `TZ` env var
3. System timezone
4. Fallback: UTC

Output metadata includes detected timezone name and offset.

## Step 0: Timestamp

```bash
date "+🕐 %H:%M %Z (%A %d %B %Y)" && ENCODED_PWD=$(pwd | sed 's|^/|-|; s|[/.]|-|g')
PROJECT_BASE=$(ls -d "$HOME/.claude/projects/${ENCODED_PWD}" 2>/dev/null | head -1)
export PROJECT_DIRS="$PROJECT_BASE"
for wt in "$HOME/.claude/projects/${ENCODED_PWD}"-wt*; do [ -d "$wt" ] && export PROJECT_DIRS="$PROJECT_DIRS:$wt"; done
```

Encodes `pwd` the same way Claude does (replace `/` and `.` with `-`, prepend `-`) to match the `.claude/projects/` directory naming (e.g. `github.com` → `github-com`). Also picks up worktree dirs (`-wt`, `-wt-1`, etc.).

**With `--all`** (all repos):
```bash
export PROJECT_DIRS=$(ls -d "$HOME/.claude/projects/"*/ | tr '\n' ':')
```

## Step 2: Extract Session Data

Run the dig script. Pass `0` for `--all` (no limit), or N if user specified a count, default 10:

```bash
python3 ~/.claude/skills/dig/scripts/dig.py [N] [--deep]
# N=10 (default), N=0 (scan all sessions), N=50 (50 most recent)
# --deep: scan ALL .jsonl files (not just most-recent per basename)
```

### Deep Mode Display

When `--deep` is used, add extra columns to the table:

```markdown
| # | Date | Session | Min | Repo | Msgs | Tools | Size | Focus |
```

And show coverage in the footer:

```markdown
**Coverage**: [returned]/[found] sessions | **Timezone**: [tz_name] (GMT+[offset])
```

## Step 3: Display Timeline

Read the JSON output and display as a table. Sessions are chronological (oldest first). Gap rows (`type: "gap"`) span the session column with `· · ·` prefix:

```markdown
## Session Timeline

| # | Date | Session | Min | Repo | Msgs | Focus |
|---|------|---------|-----|------|------|-------|
|   |      | · · · sleeping / offline | | | | |
| 1 | 02-21 | 08:40–09:08 | 28m | oracle-skills-cli | 5 | Wire /rrr to read pulse data |
|   |      | · · · 45m gap | | | | |
| 2 | 02-21 | 09:55–10:23 | 28m | homelab | 3 | oracle-pulse birth + CLI flag |
|   |      | · · · no session yet | | | | |

**Dirs scanned**: [list PROJECT_DIRS]
**Total sessions found**: [count]
```

Column rendering rules:
- **Gap rows**: `|   |      | · · · [label] | | | | |` — number + date empty, label in Session col
- **Date**: `MM-DD` short format (strip year)
- **Session**: `HH:MM–HH:MM` using `startGMT7` and `endGMT7` (strip date, keep time only)
- **Min**: `[durationMin]m`
- **Repo**: use `repoName` field from dig.py output (resolved via ghq)
- **Msgs**: `realHumanMessages` count

"Msgs" = real typed human messages (not tool approvals).

---

## With --timeline: Group by Date

When `--timeline` flag is present, group sessions by date instead of a flat table. Use `--all` to see all repos (recommended for timeline).

**Step 1**: Run dig.py with `0` for `--all` (scans all sessions), or user-specified count

**Step 2**: Group sessions by date from `startGMT7`. Render each day as:

```markdown
## Feb 22 (Sun) — [vibe label]

                  · · ·   sleeping / offline
08:48–09:11    23m   homelab        Update Fleet Runbook + Explore black.local
09:11–11:30   139m   homelab        Set Up KVM OpenClaw Node on black.local
09:37–12:51   194m   Nat-s-Agents   /recap → supergateway → CF ZT → arra-oracle-v3 dig
                  · · ·   45m gap
12:51–13:03    12m   Nat-s-Agents   Dig All + Design arra-oracle-v3 ← current
                  · · ·   no session yet

## Feb 21 (Sat) — Long day: Fleet + Brewing + Skills

06:19–08:38   139m   homelab        Moltworker Gateway + MBP Node
08:40           (bg)  openclaw       ClawHub Build Script (idle long)
09:23–16:08   405m   homelab        Debug MBP Node 401 — Gateway Token Auth
```

**Rendering rules**:
- **Day header**: `## MMM DD (Day) — [vibe label]` — infer vibe from session summaries (e.g. "Infrastructure Day", "Brewing + Skills")
- **Session rows**: `HH:MM–HH:MM  [N]m  REPO  Summary` — use `repoName` for repo, `summary` for focus
- **Gap rows**: `· · ·  [label]` between sessions when gap > 30 min
- **Sidechain**: prefix `(bg)` for sessions with `isSidechain: true`
- **Current**: append `← current` marker on the last session of the current day (today only)
- **Sort**: days newest-first, sessions within each day oldest-first (chronological)
- **Date format**: `startGMT7` time portion only (HH:MM), `endGMT7` time portion (HH:MM)
- **Repo width**: pad repo names to align columns

**Step 3**: Show summary footer:
```markdown
**Days**: [count] | **Sessions**: [count] | **Total time**: [sum of durationMin]m
```

---

## Trace Connection

After displaying the timeline, log the discovery to Oracle so it's searchable later:

```
arra_trace({
  query: "dig session [N]",
  project: "[repo-name]",
  foundFiles: [],
  foundCommits: [list of commit hashes from timeline],
  foundIssues: []
})
```

This connects `/dig` discoveries to `/trace` — "When did we first work on X?" becomes answerable.

---

ARGUMENTS: $ARGUMENTS
More from Soul-Brews-Studio/arra-oracle-skills-cli