benchmark
$
npx mdskill add affaan-m/ECC/benchmarkMeasure performance baselines and detect regressions instantly.
- Identify slow pages and API latency before deployment.
- Integrates browser MCP for Core Web Vitals and API load testing.
- Executes automated tests against specific targets and SLAs.
- Delivers detailed metrics including latency percentiles and bundle sizes.
SKILL.md
.github/skills/benchmarkView on GitHub ↗
--- name: benchmark description: Use this skill to measure performance baselines, detect regressions before/after PRs, and compare stack alternatives. origin: ECC --- # Benchmark — Performance Baseline & Regression Detection ## When to Use - Before and after a PR to measure performance impact - Setting up performance baselines for a project - When users report "it feels slow" - Before a launch — ensure you meet performance targets - Comparing your stack against alternatives ## How It Works ### Mode 1: Page Performance Measures real browser metrics via browser MCP: ``` 1. Navigate to each target URL 2. Measure Core Web Vitals: - LCP (Largest Contentful Paint) — target < 2.5s - CLS (Cumulative Layout Shift) — target < 0.1 - INP (Interaction to Next Paint) — target < 200ms - FCP (First Contentful Paint) — target < 1.8s - TTFB (Time to First Byte) — target < 800ms 3. Measure resource sizes: - Total page weight (target < 1MB) - JS bundle size (target < 200KB gzipped) - CSS size - Image weight - Third-party script weight 4. Count network requests 5. Check for render-blocking resources ``` ### Mode 2: API Performance Benchmarks API endpoints: ``` 1. Hit each endpoint 100 times 2. Measure: p50, p95, p99 latency 3. Track: response size, status codes 4. Test under load: 10 concurrent requests 5. Compare against SLA targets ``` ### Mode 3: Build Performance Measures development feedback loop: ``` 1. Cold build time 2. Hot reload time (HMR) 3. Test suite duration 4. TypeScript check time 5. Lint time 6. Docker build time ``` ### Mode 4: Before/After Comparison Run before and after a change to measure impact: ``` /benchmark baseline # saves current metrics # ... make changes ... /benchmark compare # compares against baseline ``` Output: ``` | Metric | Before | After | Delta | Verdict | |--------|--------|-------|-------|---------| | LCP | 1.2s | 1.4s | +200ms | WARNING: WARN | | Bundle | 180KB | 175KB | -5KB | ✓ BETTER | | Build | 12s | 14s | +2s | WARNING: WARN | ``` ## Output Stores baselines in `.ecc/benchmarks/` as JSON. Git-tracked so the team shares baselines. ## Integration - CI: run `/benchmark compare` on every PR - Pair with `/canary-watch` for post-deploy monitoring - Pair with `/browser-qa` for full pre-ship checklist
More from affaan-m/ECC
- accessibilityDesign, implement, and audit inclusive digital products using WCAG 2.2 Level AA
- agent-architecture-auditFull-stack diagnostic for agent and LLM applications. Audits the 12-layer agent stack for wrapper regression, memory pollution, tool discipline failures, hidden repair loops, and rendering corruption. Produces severity-ranked findings with code-first fixes. Essential for developers building agent applications, autonomous loops, or any LLM-powered feature.
- agent-evalHead-to-head comparison of coding agents (Claude Code, Aider, Codex, etc.) on custom tasks with pass rate, cost, time, and consistency metrics
- agent-harness-constructionDesign and optimize AI agent action spaces, tool definitions, and observation formatting for higher completion rates.
- agent-introspection-debuggingStructured self-debugging workflow for AI agent failures using capture, diagnosis, contained recovery, and introspection reports.
- agent-payment-x402Add x402 payment execution to AI agents with per-task budgets, spending controls, and non-custodial wallets. Supports Base through agentwallet-sdk and X Layer through OKX Payments / OKX Agent Payments Protocol.
- agent-sortBuild an evidence-backed ECC install plan for a specific repo by sorting skills, commands, rules, hooks, and extras into DAILY vs LIBRARY buckets using parallel repo-aware review passes. Use when ECC should be trimmed to what a project actually needs instead of loading the full bundle.
- agentic-engineeringOperate as an agentic engineer using eval-first execution, decomposition, and cost-aware model routing.
- agentic-osBuild persistent multi-agent operating systems on Claude Code. Covers kernel architecture, specialist agents, slash commands, file-based memory, scheduled automation, and state management without external databases.
- ai-first-engineeringEngineering operating model for teams where AI agents generate a large share of implementation output.