skill-benchmark
$ npx mdskill add HoangNguyen0403/agent-skills-standard/skill-benchmark
SKILL.md (.github/skills/skill-benchmark)
---
name: skill-benchmark
description: "Benchmark AI skill effectiveness by measuring implementation quality against legacy constraints."
metadata:
  triggers:
    keywords:
      - skill benchmark
      - workflow
---
# Skill Benchmark Skill
> [!IMPORTANT]
> Benchmark AI skill effectiveness by measuring implementation quality against legacy constraints.
## Instructions
When the user asks to perform this workflow, execute the following steps:
# 📊 Skill Benchmark Orchestrator
> **Goal**: Quantify how much active skills improve implementation quality. Deliver a prioritized compliance delta and skill applicability report.
---
## Step 1 — Project Context & Active Skills
Identify the tech stack and all active skills in `AGENTS.md`.
```bash
# 1. Largest source files by line count (top 20); -print0/-0 keeps paths with spaces intact
find src \( -name "*.ts" -o -name "*.tsx" \) -print0 | xargs -0 wc -l 2>/dev/null | sort -rn | head -n 20
# 2. Check the active skill registry (first 80 lines)
head -n 80 AGENTS.md
```
---
## Step 2 — Auto-Select a Legacy Trap
Pick the benchmark target file automatically: rank candidate files by the severity of the anti-patterns they contain and choose the highest-ranked one.
- 🔴 **P0**: Hardcoded secrets; Logic inside UI components.
- 🟠 **P1**: Wrong Router pattern; Global state for local concerns; Missing design tokens.
- 🟡 **P2**: Raw user-facing strings (i18n).
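The ranking above can be sketched as follows. This is a minimal illustration, not part of the skill itself: the `Finding` type, the `pick_legacy_trap` helper, the sample paths, and the tie-breaking rule (more findings at the worst tier, then path order) are all assumptions made for the example.

```python
from dataclasses import dataclass

# Lower number = more severe, mirroring the P0/P1/P2 tiers above.
SEVERITY = {"P0": 0, "P1": 1, "P2": 2}


@dataclass
class Finding:
    """One detected anti-pattern (hypothetical structure)."""
    path: str      # file where the anti-pattern was found
    severity: str  # "P0", "P1", or "P2"


def pick_legacy_trap(findings):
    """Return the path of the file with the most severe findings, or None."""
    by_file = {}
    for f in findings:
        by_file.setdefault(f.path, []).append(SEVERITY[f.severity])
    if not by_file:
        return None

    def key(path):
        tiers = by_file[path]
        worst = min(tiers)
        # Worst tier first, then more findings at that tier, then path
        # (assumed tie-breakers, chosen only to make the result deterministic).
        return (worst, -tiers.count(worst), path)

    return min(by_file, key=key)


findings = [
    Finding("src/App.tsx", "P1"),
    Finding("src/Login.tsx", "P0"),
    Finding("src/Login.tsx", "P2"),
    Finding("src/Home.tsx", "P2"),
]
print(pick_legacy_trap(findings))  # prints "src/Login.tsx"
```

A single P0 finding outranks any number of P1 or P2 findings, which matches the tier ordering in the list.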
---
## Step 3 — Build Eval-Driven Scorecard
Source your scorecard from `evals/evals.json`, not from hardcoded patterns.
Follow the Scorecard Rubric in `<SKILLS>/common/common-skill-creator/references/benchmark.md` when synced:
1. Read `<SKILLS>/<category>/<skill>/evals/evals.json`.
2. Generate columns for **Failure Pattern** and **Success Pattern**.
3. Refactor the file, citing the exact skill rule for each change.
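As a rough sketch of steps 1 and 2, the snippet below turns eval entries into a Markdown scorecard. The real schema of `evals/evals.json` is not shown in this document, so the `evals`, `id`, `failure_pattern`, and `success_pattern` keys, the `build_scorecard` helper, and the sample entry are all hypothetical; adapt them to the actual file.

```python
import json


def build_scorecard(evals_json: str) -> str:
    """Render a Markdown scorecard with one row per eval (assumed schema)."""
    evals = json.loads(evals_json)["evals"]
    lines = [
        "| Eval | Failure Pattern | Success Pattern |",
        "| --- | --- | --- |",
    ]
    for e in evals:
        lines.append(
            f"| {e['id']} | {e['failure_pattern']} | {e['success_pattern']} |"
        )
    return "\n".join(lines)


# Hypothetical sample standing in for the contents of evals/evals.json.
sample = json.dumps({
    "evals": [
        {
            "id": "no-hardcoded-secrets",
            "failure_pattern": "API key literal in source",
            "success_pattern": "Key read from environment config",
        }
    ]
})
print(build_scorecard(sample))
```

In practice the JSON text would be read from `<SKILLS>/<category>/<skill>/evals/evals.json` rather than built inline.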
---
## Step 4 — Benchmark Report & Compliance Delta
Output the scorecard and compliance score using the templates in `<SKILLS>/common/common-skill-creator/references/benchmark.md` when synced.
- **Compliance Score Before vs After**.
- **Δ Delta: +Z%** 🚀.
- **Eval Alignment**: How well does the skill teach what the eval tests?
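One plausible way to compute the before/after scores and the delta is shown below. It assumes each scorecard row reduces to a pass/fail boolean and that the score is the share of passing rows; the templates in `benchmark.md` may define the score differently. The function names and sample results are illustrative only.

```python
def compliance_score(results):
    """Percentage of passing checks; `results` is a list of booleans."""
    if not results:
        return 0.0
    return 100.0 * sum(results) / len(results)


def compliance_delta(before, after):
    """Return (score_before, score_after, delta in percentage points)."""
    b = compliance_score(before)
    a = compliance_score(after)
    return b, a, a - b


# Hypothetical results for four scorecard rows, before and after the refactor.
before = [False, False, True, False]
after = [True, True, True, False]

b, a, d = compliance_delta(before, after)
print(f"Compliance Score: {b:.0f}% -> {a:.0f}%")  # Compliance Score: 25% -> 75%
print(f"Δ Delta: {d:+.0f}%")                      # Δ Delta: +50%
```

The delta here is a difference in percentage points (75 minus 25), not a relative improvement.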
---
## Step 5 — Skill Applicability & Iteration
For every `❌ FAIL`, identify the root cause using the **Iteration Table** in `<SKILLS>/common/common-skill-creator/references/benchmark.md` when synced:
1. Trigger signal does not match the file? → Refine the trigger.
2. Rule too vague? → Add an Anti-Pattern rule.
3. Rules conflict? → Ensure P0 overrides P1.
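The conflict rule in item 3 amounts to picking the rule with the highest priority tier. A minimal sketch, assuming each conflicting rule is a `(priority, text)` pair; the `resolve_conflict` helper and the rule texts are made up for illustration:

```python
# Tiers in descending priority; an earlier entry overrides a later one.
PRIORITY_ORDER = ["P0", "P1", "P2"]


def resolve_conflict(rules):
    """Given conflicting (priority, rule_text) pairs, return the winner."""
    return min(rules, key=lambda r: PRIORITY_ORDER.index(r[0]))


# Hypothetical pair of rules that disagree about the same change.
conflict = [
    ("P1", "Use design tokens for spacing"),
    ("P0", "Keep business logic out of UI components"),
]
print(resolve_conflict(conflict))
# prints ('P0', 'Keep business logic out of UI components')
```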
### Suggested .skillsrc Exclusions
Recommend any skills that are noisy or non-applicable for the project.
```yaml
exclude:
  - <skill-id> # reason
```