performance-benchmarking

Name: performance-benchmarking
Author: elophanto/EloPhanto

$ npx mdskill add elophanto/EloPhanto/performance-benchmarking

Run load tests to validate system performance against SLAs.

Detects bottlenecks and optimizes Core Web Vitals for faster page loads.
Uses shell_execute and browser_navigate to gather performance metrics.
Analyzes test data to recommend cost-effective optimization strategies.
Delivers clear reports with actionable insights for capacity planning.

SKILL.md

.github/skills/performance-benchmarking

---
name: performance-benchmarking
description: Measure, analyze, and improve system performance with load testing, Core Web Vitals optimization, and capacity planning. Adapted from msitarzewski/agency-agents.
---

## Triggers

- performance benchmark
- load testing
- stress testing
- Core Web Vitals
- page speed
- response time
- throughput testing
- scalability test
- performance optimization
- capacity planning
- LCP optimization
- performance budget
- endurance testing
- performance SLA
- bottleneck analysis

## Instructions

### Performance Baseline and Requirements
- Establish current performance baselines across all system components using `shell_execute`
- Define performance requirements and SLA targets with stakeholder alignment
- Identify critical user journeys and high-impact performance scenarios
- Set up performance monitoring infrastructure and data collection
- Use `browser_navigate` for Core Web Vitals measurement
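
A baseline is easier to compare later when it is stored as percentiles rather than a single average. The following is a minimal sketch, assuming response times have already been collected as an array of milliseconds; `percentile` and `summarizeBaseline` are hypothetical helper names, and the sample data is illustrative rather than measured.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Hypothetical helper: condense raw latencies (ms) into a baseline record.
function summarizeBaseline(name, latenciesMs) {
  return {
    name,
    samples: latenciesMs.length,
    p50: percentile(latenciesMs, 50),
    p95: percentile(latenciesMs, 95),
    p99: percentile(latenciesMs, 99),
  };
}

// Illustrative data only, not real measurements.
const baseline = summarizeBaseline(
  'checkout',
  [120, 135, 110, 480, 150, 142, 128, 131, 160, 125],
);
console.log(baseline);
```

A record like this is small enough to persist between runs, so later tests can be judged against the same p50/p95/p99 figures.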

### Comprehensive Testing Strategy
- Design test scenarios covering load, stress, spike, and endurance testing
- Create realistic test data and user behavior simulation
- Plan test environment setup that mirrors production characteristics
- Apply statistical analysis (repeated runs, percentiles, confidence intervals) so results are reliable and reproducible
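
Load, stress, spike, and endurance tests differ mainly in the shape of the load over time. The sketch below expresses each shape as a k6-style `stages` array. `buildStages` is a hypothetical helper, and the durations and multipliers are illustrative defaults that would need tuning for a real system.

```javascript
// Hypothetical helper: returns k6-style `stages` for a named test type.
// `normalVus` is the expected number of concurrent virtual users.
function buildStages(kind, normalVus) {
  switch (kind) {
    case 'load': // ramp to expected traffic and hold
      return [
        { duration: '2m', target: normalVus },
        { duration: '10m', target: normalVus },
        { duration: '2m', target: 0 },
      ];
    case 'stress': // step past expected traffic to find the breaking point
      return [1, 2, 3, 4]
        .map((m) => ({ duration: '5m', target: normalVus * m }))
        .concat([{ duration: '5m', target: 0 }]);
    case 'spike': // sudden surge, then observe recovery
      return [
        { duration: '1m', target: normalVus },
        { duration: '10s', target: normalVus * 10 },
        { duration: '3m', target: normalVus * 10 },
        { duration: '10s', target: normalVus },
        { duration: '3m', target: normalVus },
        { duration: '1m', target: 0 },
      ];
    case 'endurance': // long soak to surface leaks and slow degradation
      return [
        { duration: '5m', target: normalVus },
        { duration: '4h', target: normalVus },
        { duration: '5m', target: 0 },
      ];
    default:
      throw new Error(`unknown test type: ${kind}`);
  }
}

console.log(buildStages('spike', 50));
```

The returned array can be assigned to `options.stages` in a k6 script, which keeps the four scenarios in one place instead of four hand-edited files.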

### Performance Analysis and Optimization
- Execute comprehensive performance testing with detailed metrics collection
- Identify bottlenecks through systematic analysis of results
- Provide optimization recommendations with cost-benefit analysis
- Validate optimization effectiveness with before/after comparisons
- Use `knowledge_write` to store performance baselines and optimization patterns
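
A before/after comparison reduces to a percent change per metric. This is a minimal sketch for "lower is better" metrics such as latency; `improvementPct` and `compareRuns` are hypothetical names, and the numbers are illustrative.

```javascript
// Percent change for "lower is better" metrics.
// A positive result means the "after" run improved; negative means regression.
function improvementPct(before, after) {
  if (before <= 0) throw new Error('before must be positive');
  return ((before - after) / before) * 100;
}

// Hypothetical helper: compare two runs metric by metric.
function compareRuns(before, after) {
  return Object.keys(before).map((metric) => ({
    metric,
    before: before[metric],
    after: after[metric],
    improvementPct: Number(
      improvementPct(before[metric], after[metric]).toFixed(1),
    ),
  }));
}

// Illustrative numbers only, not real measurements.
const report = compareRuns(
  { p95Ms: 800, lcpMs: 3000 },
  { p95Ms: 600, lcpMs: 2400 },
);
console.table(report);
```

Both runs should use the same scenario and environment; otherwise the percentage reflects the setup change rather than the optimization.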

### Core Web Vitals Optimization
- Optimize for Largest Contentful Paint (LCP < 2.5s)
- Optimize for Interaction to Next Paint (INP < 200ms), which replaced First Input Delay (FID < 100ms) as a Core Web Vital in March 2024
- Optimize for Cumulative Layout Shift (CLS < 0.1)
- Implement code splitting, lazy loading, and CDN optimization
- Track Real User Monitoring (RUM) data alongside synthetic metrics
- Use `web_search` for performance optimization techniques and benchmarks
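
The targets above can be checked mechanically once measurements are in hand. The sketch below rates a value against Google's published "good" and "poor" boundaries; `rate` and `THRESHOLDS` are hypothetical names. INP is included alongside FID because it superseded FID as a Core Web Vital in 2024.

```javascript
// Each entry is [goodUpperBound, poorLowerBound] in the metric's own unit.
const THRESHOLDS = {
  lcp: [2500, 4000], // ms
  fid: [100, 300],   // ms (legacy metric)
  inp: [200, 500],   // ms
  cls: [0.1, 0.25],  // unitless layout-shift score
};

// Classify a measured value as 'good', 'needs-improvement', or 'poor'.
function rate(metric, value) {
  const bounds = THRESHOLDS[metric];
  if (!bounds) throw new Error(`unknown metric: ${metric}`);
  const [good, poor] = bounds;
  if (value <= good) return 'good';
  if (value <= poor) return 'needs-improvement';
  return 'poor';
}

// Illustrative values only, not real measurements.
console.log(rate('lcp', 2400)); // good
console.log(rate('cls', 0.18)); // needs-improvement
console.log(rate('inp', 650));  // poor
```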

### Methodology Standards
- Always establish baseline performance before optimization attempts
- Use statistical analysis with confidence intervals for measurements
- Test under realistic load conditions simulating actual user behavior
- Consider performance impact of every optimization recommendation
- Prioritize user-perceived performance over technical metrics alone
- Test across different network conditions and device capabilities
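
To make the confidence-interval guideline concrete, here is a minimal sketch that computes a 95% interval for the mean of repeated measurements using the normal approximation (z = 1.96). This is a simplification: for small sample counts a Student's t critical value would be more appropriate. The function names and sample data are hypothetical.

```javascript
// 95% confidence interval for the mean (normal approximation).
function meanConfidenceInterval(samples, z = 1.96) {
  const n = samples.length;
  if (n < 2) throw new Error('need at least two samples');
  const mean = samples.reduce((s, x) => s + x, 0) / n;
  // Sample variance with Bessel's correction (n - 1).
  const variance = samples.reduce((s, x) => s + (x - mean) ** 2, 0) / (n - 1);
  const margin = z * Math.sqrt(variance / n);
  return { mean, lower: mean - margin, upper: mean + margin };
}

// Rough, conservative check: treat two runs as different only when
// their intervals do not overlap.
function intervalsOverlap(a, b) {
  return a.lower <= b.upper && b.lower <= a.upper;
}

// Illustrative data only: mean response time (ms) over five runs each.
const before = meanConfidenceInterval([100, 102, 98, 101, 99]);
const after = meanConfidenceInterval([90, 91, 89, 92, 88]);
console.log(before, after, intervalsOverlap(before, after));
```

Reporting the interval alongside the mean makes it visible when an apparent improvement is within run-to-run noise.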

## Deliverables

### Performance Analysis Report Template

```markdown
# [System Name] Performance Analysis Report

## Performance Test Results
**Load Testing**: [Normal load performance with detailed metrics]
**Stress Testing**: [Breaking point analysis and recovery behavior]
**Scalability Testing**: [Performance under increasing load scenarios]
**Endurance Testing**: [Long-term stability and memory leak analysis]

## Core Web Vitals Analysis
**Largest Contentful Paint**: [LCP measurement with optimization recommendations]
**Interaction to Next Paint / First Input Delay**: [INP (or legacy FID) analysis with interactivity improvements]
**Cumulative Layout Shift**: [CLS measurement with stability enhancements]
**Speed Index**: [Visual loading progress optimization]

## Bottleneck Analysis
**Database Performance**: [Query optimization and connection pooling analysis]
**Application Layer**: [Code hotspots and resource utilization]
**Infrastructure**: [Server, network, and CDN performance analysis]
**Third-Party Services**: [External dependency impact assessment]

## Performance ROI Analysis
**Optimization Costs**: [Implementation effort and resource requirements]
**Performance Gains**: [Quantified improvements in key metrics]
**Business Impact**: [User experience improvement and conversion impact]
**Cost Savings**: [Infrastructure optimization and efficiency gains]

## Optimization Recommendations
**High-Priority**: [Critical optimizations with immediate impact]
**Medium-Priority**: [Significant improvements with moderate effort]
**Long-Term**: [Strategic optimizations for future scalability]

---
**Performance Status**: [MEETS/FAILS SLA requirements]
**Scalability Assessment**: [Ready/Needs Work for projected growth]
```

### k6 Load Test Configuration

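```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';

// Hypothetical target; set BASE_URL to the system under test.
const BASE_URL = __ENV.BASE_URL || 'https://example.com';

export const options = {
  stages: [
    { duration: '2m', target: 10 },   // Warm up
    { duration: '5m', target: 50 },   // Normal load
    { duration: '2m', target: 100 },  // Peak load
    { duration: '5m', target: 100 },  // Sustained peak
    { duration: '2m', target: 200 },  // Stress test
    { duration: '3m', target: 0 },    // Cool down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests finish in under 500 ms
    http_req_failed: ['rate<0.01'],   // fewer than 1% of requests fail
  },
};

// Each virtual user repeats this function for the duration of the test.
export default function () {
  const res = http.get(BASE_URL);
  check(res, { 'status is 200': (r) => r.status === 200 });
  sleep(1); // think time between iterations
}
```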

## Success Metrics

- 95% of systems consistently meet or exceed performance SLA requirements
- Core Web Vitals scores achieve a "Good" rating at the 90th percentile of users
- Performance optimization delivers a 25% improvement in key user experience metrics
- System scalability supports 10x current load without significant degradation
- Performance monitoring prevents 90% of performance-related incidents

## Verify

- The test suite was actually executed and exit code/output is captured in the transcript, not just authored
- Pass/fail counts are reported as numbers (e.g., '42 passed, 0 failed'), not 'all tests pass'
- New tests cover at least one negative/edge case in addition to the happy path; the cases are listed
- Coverage delta or affected modules are reported when the project tracks coverage; a baseline number is cited
- For flaky or timing-sensitive tests, the run was repeated at least 3 times and pass-rate is reported
- Any skipped or xfail tests introduced are listed with a reason and an issue/TODO link
