memory-system

Name: memory-system
Author: joelhooks/joelclaw

$npx mdskill add joelhooks/joelclaw/memory-system

Manages the agent's memory pipeline for durable recall and reflection, used when debugging or evolving memory functions.

Helps with memory capture, retrieval tuning, and nightly maintenance tasks.
Integrates with Typesense for vector storage and system-bus operations.
Decides based on write gate states and a filter for long-term usefulness.
Presents results through operational flows and curated MEMORY.md documentation.

SKILL.md

.github/skills/memory-systemView on GitHub ↗

---
name: memory-system
displayName: Memory System
description: "Operate and extend the joelclaw agent memory system — observation pipeline, write gates, vector store, retrieval, reflection, and nightly maintenance. Use when working on memory functions, debugging recall, tuning observation quality, or evolving the memory architecture."
version: 0.1.0
author: joel
tags:
  - memory
  - system-bus
  - typesense
  - recall
  - reflection
---

# Memory System

Operational reference for the joelclaw memory pipeline (session signal → durable recall → curated MEMORY.md).

## When to use

Use this skill when working on memory capture, write gates, Typesense-backed recall, reflection/promote flows, or nightly maintenance.

## Canonical flow

```text
sessions → observe → write-gate → store → decay/rank → retrieve → inject
    ↑                                                              ↓
    └──── nightly maintenance (dedup + stale pruning + stats) ─────┘
                       ↓
               observe → reflect → propose → triage → MEMORY.md
```

**Guiding filter:** “will this fact still be true and useful next month?”

## 1) Write gate states (allow / hold / discard)

| State | Persist | Default retrieval | Rules |
|---|---|---|---|
| `allow` | yes | yes | Durable, reusable facts (constraints, architecture truths, operational fixes, explicit user rules). |
| `hold` | yes | no (unless explicitly requested) | Ambiguous/contextual signal worth keeping but not auto-injecting. |
| `discard` | no | no | Noise, tool traces, instruction artifacts, ephemeral chatter. |

Rules to enforce:
- Very short/low-information observations (e.g. <12 chars) → `discard`.
- Instruction-edit artifacts/raw tool traces (`<toolCall>`, command dumps, “replace X with Y”) → `discard`.
- Facts with ADR IDs, concrete file paths, or explicit runnable commands bias toward `allow`.
- If gate annotation parsing fails, mark `write_gate_fallback=true` and track drift (high fallback rate is a health smell).

## 2) Vector store schema (Typesense only)

**Backend:** `memory_observations` in Typesense. Qdrant is retired (per slog, 2026-02-28).

Minimum operational fields:
- Identity/source: `id`, `session_id`, `source`, `timestamp`, `updated_at`
- Content: `observation`, `observation_type`, `embedding`
- Write gate: `write_verdict`, `write_confidence`, `write_reason`, `write_gate_version`, `write_gate_fallback`
- Taxonomy: `category_id`, `category_confidence`, `category_source`, `taxonomy_version`
- Ranking/lifecycle: `merged_count`, `recall_count`, `retrieval_priority`, `stale`, `stale_tagged_at`, `last_used_at`, `superseded_by`, `supersedes`

## 3) Category taxonomy (7 categories)

Use taxonomy v1 IDs:
1. `jc:preferences`
2. `jc:rules-conventions`
3. `jc:system-architecture`
4. `jc:operations`
5. `jc:memory-system`
6. `jc:projects`
7. `jc:people-relationships`

## 4) Retrieval pipeline

1. **Query rewrite** (fast model, hard timeout, fail-open to original query).
2. **Hybrid search** (keyword + vector over `memory_observations`).
3. **Time decay** ranking:
   `final_score = raw_score × exp(-0.01 × days_since_created)`
   - stale memories get extra downweight.
4. **Cap** results before injection (protect context budget).
5. **Budget profile**:
   - `lean`: 2–3 hits, no rewrite, low-latency checks
   - `balanced`: 5–7 hits, default interactive mode
   - `deep`: 10–15 hits, complex debugging/research
   - `auto`: choose profile from query complexity/context

## 5) Reflection cycle

`observe → reflect → propose → triage (3 tiers) → promote to MEMORY.md`

Triage tiers:
- **Tier 1 auto-action:** auto-promote / auto-reject / auto-merge using deterministic rules.
- **Tier 2 LLM batch review:** batch adjudication for undecided proposals.
- **Tier 3 human review:** only ambiguous/risky proposals; then promote/edit/reject.

Goal: keep `MEMORY.md` small, durable, and high-signal.

## 6) Nightly maintenance

Run idempotent maintenance to keep recall quality high:
1. **Dedup sweep** (semantic similarity merge; maintain supersession chain).
2. **Stale pruning** (mark old never-recalled observations stale; prune very old stale records conservatively).
3. **Stats emission** (observation count, merges, stale volume, category distribution).

## 7) ADR map (read before changing architecture)

- **ADR-0021** — memory system foundation.
- **ADR-0068** — auto-triage pipeline.
- **ADR-0077** — next phase (reflection + maintenance).
- **ADR-0082** — Typesense as memory backend (Qdrant replaced).
- **ADR-0094 → ADR-0100** — proposed evolution (write gate governance, taxonomy/budgets, forward triggers, graph/dual-search roadmap).

## 8) Writing observations (mandatory at session end)

**Every session that produces a durable pattern, operational fix, or architectural insight MUST write observations before closing.**

```bash
joelclaw send "memory/observation.submitted" -d '{
  "observation": "<what was learned — concrete, reusable, future-tense useful>",
  "category": "jc:operations",
  "source": "pi-session",
  "tags": ["stripe", "payout"]
}'
```

Use one `send` call per distinct observation. Batch is fine — fire them in a loop.

### Category cheatsheet

| Category | Use for |
|---|---|
| `jc:operations` | How things work, API quirks, CLI patterns, operational fixes |
| `jc:rules-conventions` | Conventions, SOPs, team/project rules |
| `jc:system-architecture` | Topology, wiring, how components connect |
| `jc:projects` | Per-project facts, payout rates, product catalogs |
| `jc:preferences` | Joel's explicit preferences |
| `jc:people-relationships` | People, contacts, roles |
| `jc:memory-system` | Memory system itself |

### What makes a good observation

- **Concrete**: "Stripe Report Run requires explicit `payment_metadata[product]` column in `columns` param or it returns blank" — not "Stripe has metadata"
- **Reusable**: will this still be true next month?
- **Actionable**: an agent reading this cold should know what to do differently
- Not a transcript: no "the user asked me to", no raw tool output, no "I discovered that"

### What to skip

- Instruction artifacts, tool traces, ephemeral command outputs
- Facts already in skills or ADRs (skills are the durable home; observations are for recall/search)
- Anything under 12 chars — the write gate discards it

## 9) Operations commands (slog + recall)

```bash
# Record memory-system changes
slog write --action configure --tool memory-system --detail "<what changed>" --reason "<why>"

# Inspect recent operational history
slog tail --count 50

# Default recall
joelclaw recall "<query>" --budget balanced --limit 7

# Deep recall for hard debugging
joelclaw recall "<query>" --budget deep --limit 10

# Inspect memory-system category specifically
joelclaw recall "<query>" --category jc:memory-system --limit 10

# Include held memories when needed
joelclaw recall "<query>" --include-hold --raw
```

## 9) Canonical code paths

- Observe: `packages/system-bus/src/inngest/functions/observe.ts`
- Write gate: `packages/system-bus/src/memory/write-gate.ts`
- Taxonomy: `packages/system-bus/src/memory/taxonomy-v1.ts`
- Recall adapter: `packages/cli/src/capabilities/adapters/typesense-recall.ts`
- Reflect/propose: `packages/system-bus/src/inngest/functions/reflect.ts`, `promote.ts`
- Nightly maintenance: `packages/system-bus/src/inngest/functions/memory/nightly-maintenance.ts`

## 10) Non-negotiables

- Memory stores **patterns**, not transcript noise.
- No silent failure paths: emit telemetry on every transition.
- Keep retrieval bounded; never flood context windows.
- Keep `MEMORY.md` curated; do not auto-append raw observations.

More from joelhooks/joelclaw

Skill	Description
add-skill	Create new joelclaw skills with the idiomatic process — repo-canonical, symlinked, git-tracked, slogged. Triggers on 'add a skill', 'create skill', 'new skill', 'canonical skill', 'make a skill for', or any request to formalize a process or domain into a reusable skill.
adr-skill	Create and maintain Architecture Decision Records (ADRs) optimized for agentic coding workflows. Use when you need to propose, write, update, accept/reject, deprecate, or supersede an ADR; bootstrap an adr folder and index; consult existing ADRs before implementing changes; or enforce ADR conventions. This skill uses Socratic questioning to capture intent before drafting, and validates output against an agent-readiness checklist.
agent-discovery	"Optimize websites, docs, and product surfaces for agent discoverability and operator UX. Use when working on agent SEO/AEO/GEO, crawl policy, markdown or JSON projections, llms.txt, sitemap.md, AGENTS.md guidance, content negotiation, accessibility for browser agents, or any request to make a site easier for pi, OpenCode, Claude Code, ChatGPT, Perplexity, or other agent harnesses to find and use."
agent-loop	Start, monitor, and cancel durable multi-agent coding loops via Inngest. Use when the user wants to run autonomous coding workloads, execute a PRD with multiple stories, kick off an AFK coding session, have agents implement features from a plan, or manage running loops. Triggers on "start a coding loop", "run this PRD", "implement these stories", "go AFK and code this", "check loop status", "cancel the loop", "joelclaw loop", or any request for autonomous multi-story code execution.
agent-mail	>-
agent-workloads	"Compatibility alias for the canonical `workflow-rig` front door. Use when older prompts mention `agent-workloads` or when you need the legacy workload-planning guidance; for new work, load `workflow-rig` first."
clawmail	>-
cli-design	"Design and build agent-first CLIs with HATEOAS JSON responses, context-protecting output, and self-documenting command trees. Use when creating new CLI tools, adding commands to existing CLIs (joelclaw, slog), or reviewing CLI design for agent-friendliness. Triggers on 'build a CLI', 'add a command', 'CLI design', 'agent-friendly output', or any task involving command-line tool creation."
codex-prompting	"Use this skill for any request to trigger, coordinate, or craft prompts for Codex. Use when user says 'send to codex', 'use codex', 'prompt codex', 'ask codex', 'delegate to codex', 'run in codex', or asks for a Codex-first execution handoff."
content-publish	"Publish content to joelclaw.com via the Convex-first pipeline. Covers the full lifecycle: draft → review → publish → revalidate → verify. Handles secret leasing, tag conventions, content types (article, tutorial, note, essay), and verification gates. Use when: 'write article about X', 'publish article <slug>', 'draft a tutorial', 'publish this', 'push to convex', or any content publishing task."