aeon-autoresearch
$
npx mdskill add BankrBot/skills/aeon-autoresearchEvolve underperforming skills by generating four improved variations.
- Fixes low-signal output, deprecated APIs, and stale methodologies.
- Depends on target SKILL.md and weighted scoring rubrics.
- Selects the highest-scoring variation based on improvement metrics.
- Applies the winner cleanly without downgrading existing functionality.
SKILL.md
.github/skills/aeon-autoresearchView on GitHub ↗
---
name: aeon-autoresearch
description: |
Evolve any installed skill by generating four variations along separate theses (better inputs /
sharper output / more robust / rethink), scoring them on a weighted rubric, and applying the
winner. Never downgrades a working skill — aborts cleanly if no variation improves the original.
Use when an installed skill is producing low-signal output, hitting deprecated APIs, or feels
stale.
Triggers: "improve this skill", "evolve $skill", "auto-research my X", "regenerate variations".
---
# aeon-autoresearch
Self-improvement loop. Given a target SKILL.md, generates four parallel improved variations, scores each, applies the winner.
## Inputs
| Param | Description |
|---|---|
| `target` | Skill name or path to SKILL.md. Required. |
| `mode` | `evolve` (default) writes the diff. `dry-run` scores and prints, writes nothing. |
## The four variations
- **A — Better inputs**: replace deprecated APIs, add fallbacks, fix broken endpoints.
- **B — Sharper output**: tighter format, signal over noise, explicit verdicts, banned filler.
- **C — More robust**: empty-data handling, retries, dedup state, rate-limit awareness.
- **D — Rethink**: fundamentally different methodology for the same goal.
Each is a complete runnable SKILL.md. Frontmatter shape preserved.
## Scoring
1-5 per axis, weighted total max 50:
| Axis | Weight |
|---|---|
| Improvement vs original | 3× |
| Output value | 2× |
| Clarity, data quality, robustness | 1.5× each |
| Conventions | 1× |
Tie-break (within 2 points): prefer the variation making the biggest single improvement over many small ones.
## Safety guarantee
If every variation scores ≤ original on **Improvement**, the skill aborts with `AUTORESEARCH_NO_IMPROVEMENT`. No file written. Working skills are never downgraded.
Preserves the original's core purpose, frontmatter shape, and declared env vars.
## Versioning
Inside a git repo, changes land in a branch (`autoresearch/${target}`) — operator reviews the diff before merging. Outside a repo, the original is preserved at `${target}/SKILL.md.before-autoresearch` for rollback.
## Output
The diff, plus a report with the scoring table for all four variations and a one-paragraph rationale for the winner.
Pairs with `aeon-skill-evals` (surfaces what's underperforming) and `aeon-skill-repair` (deterministic bugs; autoresearch handles quality lifts).
More from BankrBot/skills