tessl-skill-review

Name: tessl-skill-review
Author: aAAaqwq/AGI-Super-Team

$npx mdskill add aAAaqwq/AGI-Super-Team/tessl-skill-review

Score skills against Tessl to ensure quality and readiness.

Measures skill quality using automated evaluation and best practices.
Depends on Tessl CLI for automated scoring and review workflows.
Decides readiness by analyzing structure, triggering, and instructions.
Delivers structured feedback for improvement and revision comparison.

SKILL.md

.github/skills/tessl-skill-reviewView on GitHub ↗

---
name: tessl-skill-review
description: Evaluate, score, and review an Agent Skill or SKILL.md using Tessl as the primary evaluator. Use when asked to measure skill quality, score a skill, review a skill against best practices, compare before/after skill revisions, or generate structured improvement feedback for a skill directory or SKILL.md file.
---

# Tessl Skill Review

Use Tessl as the default scoring engine for skill quality review.

This skill is for **measuring quality**, not just giving vibes. Prefer Tessl-backed review first, then add your own judgment on top.

## Primary use cases

Use this skill when asked to:
- score a skill
- evaluate a `SKILL.md`
- review a skill against best practices
- compare two versions of a skill
- decide whether a skill is ready to publish
- find weaknesses in skill triggering, structure, or instructions

## Core workflow

### 1) Identify the review target

Accept either:
- a skill directory containing `SKILL.md`
- a direct path to `SKILL.md`
- a repo path with one or more skills to audit

If the request is ambiguous, clarify which skill or directory to score.

### 2) Prefer Tessl review first

If Tessl CLI is available, start with:

```bash
tessl skill review <path>
```

Useful examples:

```bash
tessl skill review ~/.openclaw/skills/meta-cognition
tessl skill review ./skills/work-to-skill
tessl skill review ./skills/some-skill/SKILL.md
```

If the exact CLI surface has drifted, inspect:

```bash
tessl --help
tessl skill --help
```

If Tessl is not installed, either:

1. install it with:

```bash
curl -fsSL https://get.tessl.io | sh
```

2. or use the bundled helper:

```bash
scripts/review.sh <path>
```

The helper script will detect whether Tessl exists, print the install command if missing, and run `tessl skill review <path>` when available.

### 3) Extract a structured scorecard

From Tessl review output, capture at least:
- overall score
- strongest areas
- weakest areas
- trigger/description quality issues
- instruction clarity issues
- missing examples / weak workflow guidance
- context bloat or redundancy risks
- publish-readiness judgment

If Tessl returns category scores, preserve them verbatim where possible.

## Manual fallback rubric

If Tessl cannot be installed or executed, do a manual review using this scoring rubric.

Score each dimension from **1-5**:
- **Trigger clarity**: does the description clearly say what the skill does and when to use it?
- **Workflow executability**: can another agent follow the steps without guessing?
- **Context efficiency**: is the skill lean, or does it waste context?
- **Reusability**: does it avoid hidden tribal knowledge and local-only assumptions?
- **Safety**: does it properly constrain risky or irreversible actions?

Convert to a 100-point score:

```text
Total = (sum of 5 dimension scores / 25) * 100
```

Verdict bands:
- **90-100**: publish-ready
- **75-89**: strong, but improve a few areas
- **60-74**: useful, but needs substantial revision
- **<60**: not ready

Always state clearly whether the score came from **Tessl** or from the **manual fallback rubric**.

## Secondary workflow: scenario-based evaluation

When the user wants deeper validation, go beyond `skill review` and run scenario evals.

Use Tessl scenario tooling when the question is not just “is this well-written?” but “does this skill actually improve agent performance?”

Preferred flow:

```bash
tessl scenario generate <path>
tessl scenario run <path-or-scenario>
```

Use scenario evals for:
- regression checks after editing a skill
- comparing two versions of a skill
- checking whether extra context actually helps
- judging real task success rather than surface quality only

## What to look for in your analysis

After Tessl output, add your own judgment across these dimensions:

### 1. Trigger quality
- Is the frontmatter description specific enough to trigger reliably?
- Does it say both **what the skill does** and **when to use it**?
- Is it too vague, too generic, or too narrow?

### 2. Workflow quality
- Are the steps executable?
- Does the skill guide the agent through decisions, not just dump information?
- Are fragile steps sufficiently constrained?

### 3. Context efficiency
- Is the body concise enough?
- Does it duplicate obvious model knowledge?
- Should detailed material move into references instead of bloating `SKILL.md`?

### 4. Reusability
- Would another agent instance be able to use this without extra tribal knowledge?
- Are assumptions, prerequisites, and inputs explicit?

### 5. Safety and overreach
- Does the skill push the agent toward risky or irreversible actions without proper checks?
- Are approval boundaries and destructive actions handled clearly?

## Output format

Use this output shape unless the user asks for another format:

```markdown
## Tessl Skill Review
- Target:
- Tessl overall score:
- Verdict: ready / close / needs work / not publishable yet

## Strengths
- ...

## Weaknesses
- ...

## High-impact fixes
1. ...
2. ...
3. ...

## Suggested rewrite areas
- frontmatter:
- workflow:
- examples:
- references/scripts:

## Final recommendation
- ...
```

## Publishing / PR use

When reviewing skills for a PR or public registry submission:
- use Tessl score as an input, not the only decision-maker
- call out any mismatch between score and real-world usefulness
- flag private-environment coupling, hardcoded paths, secret handling, or weak public readability

## Anti-patterns

Do not:
- give a score without actually running Tessl when Tessl is available
- confuse “nice writing” with “effective agent behavior”
- accept a high-level skill that has no actionable workflow
- ignore bloated context just because the prose sounds polished
- assume a skill is good only because it is long

## Quick command checklist

```bash
# install if needed
curl -fsSL https://get.tessl.io | sh

# or use the bundled helper
scripts/review.sh <path>

# inspect CLI if unsure
tessl --help
tessl skill --help

# basic quality review
tessl skill review <path>

# deeper evals when needed
tessl scenario generate <path>
tessl scenario run <path-or-scenario>
```

## Trigger phrases

- “测一下这个 skill 的评分”
- “帮我评估这个 SKILL.md”
- “这个 skill 质量怎么样”
- “用 Tessl 跑一下 skill review”
- “compare these two skill versions”
- “is this skill publish-ready?”
- “score this skill”
- “review this skill against best practices”