ce-agent-native-audit
$ npx mdskill add EveryInc/compound-engineering-plugin/ce-agent-native-audit

Conduct a comprehensive review of the codebase against agent-native architecture principles, launching parallel sub-agents for each principle and producing a scored report.
SKILL.md
.github/skills/ce-agent-native-audit
---
name: ce-agent-native-audit
description: Run comprehensive agent-native architecture review with scored principles
argument-hint: "[optional: specific principle to audit]"
disable-model-invocation: true
---

# Agent-Native Architecture Audit

Conduct a comprehensive review of the codebase against agent-native architecture principles, launching parallel sub-agents for each principle and producing a scored report.

## Core Principles to Audit

1. **Action Parity** - "Whatever the user can do, the agent can do"
2. **Tools as Primitives** - "Tools provide capability, not behavior"
3. **Context Injection** - "System prompt includes dynamic context about app state"
4. **Shared Workspace** - "Agent and user work in the same data space"
5. **CRUD Completeness** - "Every entity has full CRUD (Create, Read, Update, Delete)"
6. **UI Integration** - "Agent actions immediately reflected in UI"
7. **Capability Discovery** - "Users can discover what the agent can do"
8. **Prompt-Native Features** - "Features are prompts defining outcomes, not code"

## Workflow

### Step 1: Load the Agent-Native Skill

First, invoke the agent-native-architecture skill to understand all principles:

```
/ce-agent-native-architecture
```

Select option 7 (action parity) to load the full reference material.

### Step 2: Launch Parallel Sub-Agents

Launch 8 parallel sub-agents, one for each principle, using the platform's subagent primitive (`Agent` with `subagent_type: Explore` in Claude Code, `spawn_agent` with `agent_type: "explorer"` in Codex, `subagent` with `agent: "scout"` in Pi via the `pi-subagents` extension). Each agent should:

1. Enumerate ALL instances in the codebase (user actions, tools, contexts, data stores, etc.)
2. Check compliance against the principle
3. Provide a SPECIFIC SCORE like "X out of Y (percentage%)"
4. List specific gaps and recommendations

<sub-agents>

**Agent 1: Action Parity**

```
Audit for ACTION PARITY - "Whatever the user can do, the agent can do."

Tasks:
1. Enumerate ALL user actions in frontend (API calls, button clicks, form submissions)
   - Search for API service files, fetch calls, form handlers
   - Check routes and components for user interactions
2. Check which have corresponding agent tools
   - Search for agent tool definitions
   - Map user actions to agent capabilities
3. Score: "Agent can do X out of Y user actions"

Format:
## Action Parity Audit
### User Actions Found
| Action | Location | Agent Tool | Status |
### Score: X/Y (percentage%)
### Missing Agent Tools
### Recommendations
```

**Agent 2: Tools as Primitives**

```
Audit for TOOLS AS PRIMITIVES - "Tools provide capability, not behavior."

Tasks:
1. Find and read ALL agent tool files
2. Classify each as:
   - PRIMITIVE (good): read, write, store, list - enables capability without business logic
   - WORKFLOW (bad): encodes business logic, makes decisions, orchestrates steps
3. Score: "X out of Y tools are proper primitives"

Format:
## Tools as Primitives Audit
### Tool Analysis
| Tool | File | Type | Reasoning |
### Score: X/Y (percentage%)
### Problematic Tools (workflows that should be primitives)
### Recommendations
```

**Agent 3: Context Injection**

```
Audit for CONTEXT INJECTION - "System prompt includes dynamic context about app state"

Tasks:
1. Find context injection code (search for "context", "system prompt", "inject")
2. Read agent prompts and system messages
3. Enumerate what IS injected vs what SHOULD be:
   - Available resources (files, drafts, documents)
   - User preferences/settings
   - Recent activity
   - Available capabilities listed
   - Session history
   - Workspace state

Format:
## Context Injection Audit
### Context Types Analysis
| Context Type | Injected? | Location | Notes |
### Score: X/Y (percentage%)
### Missing Context
### Recommendations
```

**Agent 4: Shared Workspace**

```
Audit for SHARED WORKSPACE - "Agent and user work in the same data space"

Tasks:
1. Identify all data stores/tables/models
2. Check if agents read/write to SAME tables or separate ones
3. Look for sandbox isolation anti-pattern (agent has separate data space)

Format:
## Shared Workspace Audit
### Data Store Analysis
| Data Store | User Access | Agent Access | Shared? |
### Score: X/Y (percentage%)
### Isolated Data (anti-pattern)
### Recommendations
```

**Agent 5: CRUD Completeness**

```
Audit for CRUD COMPLETENESS - "Every entity has full CRUD"

Tasks:
1. Identify all entities/models in the codebase
2. For each entity, check if agent tools exist for:
   - Create
   - Read
   - Update
   - Delete
3. Score per entity and overall

Format:
## CRUD Completeness Audit
### Entity CRUD Analysis
| Entity | Create | Read | Update | Delete | Score |
### Overall Score: X/Y entities with full CRUD (percentage%)
### Incomplete Entities (list missing operations)
### Recommendations
```

**Agent 6: UI Integration**

```
Audit for UI INTEGRATION - "Agent actions immediately reflected in UI"

Tasks:
1. Check how agent writes/changes propagate to frontend
2. Look for:
   - Streaming updates (SSE, WebSocket)
   - Polling mechanisms
   - Shared state/services
   - Event buses
   - File watching
3. Identify "silent actions" anti-pattern (agent changes state but UI doesn't update)

Format:
## UI Integration Audit
### Agent Action → UI Update Analysis
| Agent Action | UI Mechanism | Immediate? | Notes |
### Score: X/Y (percentage%)
### Silent Actions (anti-pattern)
### Recommendations
```

**Agent 7: Capability Discovery**

```
Audit for CAPABILITY DISCOVERY - "Users can discover what the agent can do"

Tasks:
1. Check for these 7 discovery mechanisms:
   - Onboarding flow showing agent capabilities
   - Help documentation
   - Capability hints in UI
   - Agent self-describes in responses
   - Suggested prompts/actions
   - Empty state guidance
   - Slash commands (/help, /tools)
2. Score against 7 mechanisms

Format:
## Capability Discovery Audit
### Discovery Mechanism Analysis
| Mechanism | Exists? | Location | Quality |
### Score: X/7 (percentage%)
### Missing Discovery
### Recommendations
```

**Agent 8: Prompt-Native Features**

```
Audit for PROMPT-NATIVE FEATURES - "Features are prompts defining outcomes, not code"

Tasks:
1. Read all agent prompts
2. Classify each feature/behavior as defined in:
   - PROMPT (good): outcomes defined in natural language
   - CODE (bad): business logic hardcoded
3. Check if behavior changes require prompt edit vs code change

Format:
## Prompt-Native Features Audit
### Feature Definition Analysis
| Feature | Defined In | Type | Notes |
### Score: X/Y (percentage%)
### Code-Defined Features (anti-pattern)
### Recommendations
```

</sub-agents>

### Step 3: Compile Summary Report

After all agents complete, compile a summary with:

```markdown
## Agent-Native Architecture Review: [Project Name]

### Overall Score Summary

| Core Principle | Score | Percentage | Status |
|----------------|-------|------------|--------|
| Action Parity | X/Y | Z% | ✅/⚠️/❌ |
| Tools as Primitives | X/Y | Z% | ✅/⚠️/❌ |
| Context Injection | X/Y | Z% | ✅/⚠️/❌ |
| Shared Workspace | X/Y | Z% | ✅/⚠️/❌ |
| CRUD Completeness | X/Y | Z% | ✅/⚠️/❌ |
| UI Integration | X/Y | Z% | ✅/⚠️/❌ |
| Capability Discovery | X/Y | Z% | ✅/⚠️/❌ |
| Prompt-Native Features | X/Y | Z% | ✅/⚠️/❌ |

**Overall Agent-Native Score: X%**

### Status Legend
- ✅ Excellent (80%+)
- ⚠️ Partial (50-79%)
- ❌ Needs Work (<50%)

### Top 10 Recommendations by Impact

| Priority | Action | Principle | Effort |
|----------|--------|-----------|--------|

### What's Working Excellently

[List top 5 strengths]
```

## Success Criteria

- [ ] All 8 sub-agents complete their audits
- [ ] Each principle has a specific numeric score (X/Y format)
- [ ] Summary table shows all scores and status indicators
- [ ] Top 10 recommendations are prioritized by impact
- [ ] Report identifies both strengths and gaps

## Optional: Single Principle Audit

If $ARGUMENTS specifies a single principle (e.g., "action parity"), only run that sub-agent and provide detailed findings for that principle alone.

Valid arguments:
- `action parity` or `1`
- `tools` or `primitives` or `2`
- `context` or `injection` or `3`
- `shared` or `workspace` or `4`
- `crud` or `5`
- `ui` or `integration` or `6`
- `discovery` or `7`
- `prompt` or `features` or `8`
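
The argument list and the status legend can be sketched in Python as follows. This is an illustration only, not part of the skill: the names `PRINCIPLES`, `resolve_principle`, and `status_for` are hypothetical, and two behaviors are assumptions the skill does not specify (an unrecognized argument raises an error, and a fractional percentage just under 80 counts as Partial).

```python
from typing import Optional

# Principle numbers, names, and aliases as listed under "Valid arguments".
PRINCIPLES = {
    1: ("Action Parity", {"action parity"}),
    2: ("Tools as Primitives", {"tools", "primitives"}),
    3: ("Context Injection", {"context", "injection"}),
    4: ("Shared Workspace", {"shared", "workspace"}),
    5: ("CRUD Completeness", {"crud"}),
    6: ("UI Integration", {"ui", "integration"}),
    7: ("Capability Discovery", {"discovery"}),
    8: ("Prompt-Native Features", {"prompt", "features"}),
}


def resolve_principle(argument: str) -> Optional[str]:
    """Map an argument to one principle; None means run the full audit."""
    token = argument.strip().lower()
    if not token:
        return None  # no argument given: audit all 8 principles
    for number, (name, aliases) in PRINCIPLES.items():
        if token == str(number) or token in aliases:
            return name
    # Assumption: the skill does not say what to do with unknown arguments.
    raise ValueError(f"Unrecognized principle argument: {argument!r}")


def status_for(passed: int, total: int) -> str:
    """Apply the status legend to an X/Y score."""
    if total <= 0:
        # Assumption: an empty denominator is treated as an error.
        raise ValueError("total must be positive")
    percentage = 100 * passed / total
    if percentage >= 80:
        return "Excellent"
    if percentage >= 50:
        return "Partial"
    return "Needs Work"
```

For example, `resolve_principle("crud")` selects the CRUD Completeness sub-agent, and `status_for(3, 7)` classifies a Capability Discovery score of 3/7 as Needs Work.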
More from EveryInc/compound-engineering-plugin
- ce-agent-native-architecture: Build applications where agents are first-class citizens. Use this skill when designing autonomous agents, creating MCP tools, implementing self-modifying systems, or building apps where features are outcomes achieved by agents operating in a loop.
- ce-brainstorm: Explore requirements and approaches through collaborative dialogue, then write a right-sized requirements document. Use when the user says "let's brainstorm", "what should we build", or "help me think through X", presents a vague or ambitious feature request, or seems unsure about scope or direction -- even without explicitly asking to brainstorm.
- ce-clean-gone-branches: Clean up local branches whose remote tracking branch is gone. Use when the user says "clean up branches", "delete gone branches", "prune local branches", "clean gone", or wants to remove stale local branches that no longer exist on the remote. Also handles removing associated worktrees for branches that have them.
- ce-code-review: Structured code review using tiered persona agents, confidence-gated findings, and a merge/dedup pipeline. In interactive mode it applies safe, verified fixes and commits them when the working tree is clean (it never pushes); in mode:agent it reports only and the caller applies. Use when reviewing code changes before creating a PR.
- ce-commit: Create a git commit with a clear, value-communicating message. Use when the user says "commit", "commit this", "save my changes", "create a commit", or wants to commit staged or unstaged work. Produces well-structured commit messages that follow repo conventions when they exist, and defaults to conventional commit format otherwise.
- ce-commit-push-pr: Commit, push, and open a PR with an adaptive, value-first description that scales in depth with the change. Use when the user says "commit and PR", "ship this", "create a PR", or "open a pull request". Also handles description-only flows ("write a PR description", "rewrite the PR body", "describe this PR") without committing or pushing.
- ce-compound: Document a recently solved problem to compound your team's knowledge or CONCEPTS.md, the project's shared domain vocabulary.
- ce-compound-refresh: Refresh stale learning and pattern docs under docs/solutions/ by reviewing them against the current codebase, then updating, consolidating, or deleting drifted ones. Use when the user asks to "refresh my learnings", "audit docs/solutions/", "clean up stale learnings", or "consolidate overlapping docs", or when ce-compound flags an older doc as superseded. Do not trigger for general refactor, debugging, or code-review work unless the user has explicitly pointed at docs/solutions/.
- ce-debug: Systematically find root causes and fix bugs. Use when debugging errors, investigating test failures, reproducing bugs from issue trackers (GitHub, Linear, Jira), or when stuck on a problem after failed fix attempts. Also use when the user says 'debug this', 'why is this failing', 'fix this bug', 'trace this error', or pastes stack traces, error messages, or issue references.
- ce-demo-reel: Capture a visual demo reel (GIF, terminal recording, screenshots) for PR descriptions. Use when shipping UI changes, CLI features, or any work with observable behavior that benefits from visual proof. Also use when asked to add a demo, record a GIF, screenshot a feature, show what changed visually, create a demo reel, capture evidence, add proof to a PR, or create a before/after comparison.