agentic-engineering
$ npx mdskill add affaan-m/ECC/agentic-engineering
Execute engineering tasks with eval-first quality gates and cost-aware routing.
- Handles complex implementation work requiring rigorous verification and risk control.
- Depends on internal eval frameworks and regression testing pipelines.
- Selects model tiers based on task complexity and architectural depth.
- Delivers verified code units with clear completion criteria and risk signatures.
SKILL.md
.github/skills/agentic-engineering
---
name: agentic-engineering
description: Operate as an agentic engineer using eval-first execution, decomposition, and cost-aware model routing.
origin: ECC
---

# Agentic Engineering

Use this skill for engineering workflows where AI agents perform most implementation work and humans enforce quality and risk controls.

## Operating Principles

1. Define completion criteria before execution.
2. Decompose work into agent-sized units.
3. Route model tiers by task complexity.
4. Measure with evals and regression checks.

## Eval-First Loop

1. Define capability eval and regression eval.
2. Run baseline and capture failure signatures.
3. Execute implementation.
4. Re-run evals and compare deltas.

## Task Decomposition

Apply the 15-minute unit rule:

- each unit should be independently verifiable
- each unit should have a single dominant risk
- each unit should expose a clear done condition

## Model Routing

- Haiku: classification, boilerplate transforms, narrow edits
- Sonnet: implementation and refactors
- Opus: architecture, root-cause analysis, multi-file invariants

## Session Strategy

- Continue session for closely-coupled units.
- Start fresh session after major phase transitions.
- Compact after milestone completion, not during active debugging.

## Review Focus for AI-Generated Code

Prioritize:

- invariants and edge cases
- error boundaries
- security and auth assumptions
- hidden coupling and rollout risk

Do not waste review cycles on style-only disagreements when automated format/lint already enforce style.

## Cost Discipline

Track per task:

- model
- token estimate
- retries
- wall-clock time
- success/failure

Escalate model tier only when lower tier fails with a clear reasoning gap.
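
The routing and cost rules above could be wired together roughly as follows. This is a minimal sketch, not part of the skill itself: the task-kind keys, the `Attempt` and `TaskRecord` types, the `reasoning_gap` flag, and the `attempt_fn` callback are all hypothetical names introduced for illustration. How a "clear reasoning gap" is detected is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, List

# Tier order, cheapest first, following the Model Routing list.
TIERS = ["haiku", "sonnet", "opus"]

# Hypothetical task kinds mapped to their starting tier (assumed labels).
START_TIER = {
    "classification": "haiku",
    "boilerplate": "haiku",
    "narrow_edit": "haiku",
    "implementation": "sonnet",
    "refactor": "sonnet",
    "architecture": "opus",
    "root_cause": "opus",
    "multi_file_invariant": "opus",
}


@dataclass
class Attempt:
    """Result of one agent run, as reported by the caller's attempt_fn."""
    success: bool
    tokens: int
    seconds: float
    reasoning_gap: bool = False  # assumed signal: failure looks like a capability limit


@dataclass
class TaskRecord:
    """Per-task cost record: model, token estimate, retries, time, outcome."""
    task: str
    model: str
    tokens: int = 0
    retries: int = 0
    seconds: float = 0.0
    success: bool = False


def run_with_routing(
    task: str,
    kind: str,
    attempt_fn: Callable[[str], Attempt],
    max_retries_per_tier: int = 2,
) -> List[TaskRecord]:
    """Start at the tier for this task kind and escalate only on a reasoning gap."""
    records: List[TaskRecord] = []
    start = TIERS.index(START_TIER[kind])
    for tier in TIERS[start:]:
        record = TaskRecord(task=task, model=tier)
        records.append(record)
        escalate = False
        for attempt_index in range(max_retries_per_tier + 1):
            result = attempt_fn(tier)
            record.tokens += result.tokens
            record.seconds += result.seconds
            record.retries = attempt_index  # attempts made after the first one
            if result.success:
                record.success = True
                return records
            if result.reasoning_gap:
                escalate = True  # lower tier failed with a clear reasoning gap
                break
        if not escalate:
            break  # retries exhausted without a reasoning gap: do not escalate
    return records
```

Each call returns one `TaskRecord` per tier tried, so the list doubles as the per-task cost log. A failure that is not flagged as a reasoning gap stops at the current tier instead of spending on a larger model.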
More from affaan-m/ECC
- accessibility: Design, implement, and audit inclusive digital products using WCAG 2.2 Level AA
- agent-architecture-audit: Full-stack diagnostic for agent and LLM applications. Audits the 12-layer agent stack for wrapper regression, memory pollution, tool discipline failures, hidden repair loops, and rendering corruption. Produces severity-ranked findings with code-first fixes. Essential for developers building agent applications, autonomous loops, or any LLM-powered feature.
- agent-eval: Head-to-head comparison of coding agents (Claude Code, Aider, Codex, etc.) on custom tasks with pass rate, cost, time, and consistency metrics
- agent-harness-construction: Design and optimize AI agent action spaces, tool definitions, and observation formatting for higher completion rates.
- agent-introspection-debugging: Structured self-debugging workflow for AI agent failures using capture, diagnosis, contained recovery, and introspection reports.
- agent-payment-x402: Add x402 payment execution to AI agents with per-task budgets, spending controls, and non-custodial wallets. Supports Base through agentwallet-sdk and X Layer through OKX Payments / OKX Agent Payments Protocol.
- agent-sort: Build an evidence-backed ECC install plan for a specific repo by sorting skills, commands, rules, hooks, and extras into DAILY vs LIBRARY buckets using parallel repo-aware review passes. Use when ECC should be trimmed to what a project actually needs instead of loading the full bundle.
- agentic-os: Build persistent multi-agent operating systems on Claude Code. Covers kernel architecture, specialist agents, slash commands, file-based memory, scheduled automation, and state management without external databases.
- ai-first-engineering: Engineering operating model for teams where AI agents generate a large share of implementation output.
- ai-regression-testing: Regression testing strategies for AI-assisted development. Sandbox-mode API testing without database dependencies, automated bug-check workflows, and patterns to catch AI blind spots where the same model writes and reviews code.