pairwise-ranking

Name: pairwise-ranking
Author: yogsoth-ai/de-anthropocentric-research-engine

$ npx mdskill add yogsoth-ai/de-anthropocentric-research-engine/pairwise-ranking

Produce global rankings using pairwise comparisons and voting aggregation methods

Solves ranking problems with incomplete or sparse comparison data
Uses methods like Elo, Bradley-Terry, TrueSkill, and Schulze
Routes strategies based on dataset size, comparison density, and update frequency
Delivers calibrated rankings with confidence estimates and consistency checks

SKILL.md

.github/skills/pairwise-ranking

---
name: pairwise-ranking
description: Pairwise Ranking Campaign — produce global rankings through pairwise comparisons and voting aggregation using Bradley-Terry, Elo, TrueSkill, Condorcet, Borda, Schulze methods.
execution: campaign
used-by: convergence
---

# Pairwise Ranking

Produce global rankings from pairwise comparisons. This campaign orchestrates comparison collection, rating computation, multi-judge aggregation, and consistency verification to yield robust ordinal rankings with confidence estimates.

## Strategy Routing

| Signal | Strategy |
|--------|----------|
| Small N precise comparison / 5-15 options / careful calibration | deliberative-calibration |
| Large N sparse / 100+ options / can only compare subset | efficient-exploration |
| Multi-judge / committee / LLM judge aggregation | collective-adjudication |
| Continuous update / Elo / live rating / A/B testing | dynamic-tracking |
| Consistency audit / cycle detection / transitivity check | coherence-diagnosis |
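
The routing table above can be read as a small decision function. The following is a minimal sketch, not the campaign's actual router: the function name `route_strategy`, its parameters, and the priority order among signals are all assumptions made for illustration. The table does not say how to treat 16 to 99 options, so this sketch falls back to `deliberative-calibration` unless the data is flagged as sparse.

```python
def route_strategy(n_options, *, sparse=False, n_judges=1,
                   continuous=False, audit=False):
    """Map routing signals to a strategy name.

    The precedence (audit > continuous > multi-judge > size) is an
    assumption; the routing table itself does not define one.
    """
    if audit:
        return "coherence-diagnosis"
    if continuous:
        return "dynamic-tracking"
    if n_judges > 1:
        return "collective-adjudication"
    if sparse or n_options >= 100:
        return "efficient-exploration"
    return "deliberative-calibration"


print(route_strategy(8))                # small N, single judge
print(route_strategy(500, sparse=True)) # large N, sparse comparisons
```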

## Manifest

### Strategies

| Strategy | Methods | When |
|----------|---------|------|
| deliberative-calibration | Bradley-Terry, Thurstone, AHP pairwise, Borda | Small N complete comparison |
| efficient-exploration | BT incomplete, TrueSkill, Active learning, Rank Centrality | Large N sparse matrix |
| collective-adjudication | Condorcet/Schulze, Borda, Kemeny-Young, Copeland | Multi-judge aggregation |
| dynamic-tracking | Elo, Glicko-2, TrueSkill 2, Whole-History Rating | Continuous rating update |
| coherence-diagnosis | Consistency Ratio, cycle enumeration, mElo | Preference consistency check |
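
As one illustration of the methods listed for `deliberative-calibration`, a Bradley-Terry model can be fitted with the classic minorization-maximization (MM) update. This is a sketch under stated assumptions: the `wins` matrix layout, the function name `bradley_terry`, and the iteration limits are hypothetical, and the input is assumed to contain at least one recorded comparison.

```python
def bradley_terry(wins, n_iter=200, tol=1e-9):
    """Estimate Bradley-Terry strengths with the MM update.

    wins[i][j] is the number of times item i beat item j (diagonal = 0).
    Returns a list of strengths normalized to sum to 1.
    """
    n = len(wins)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i])
            # Only pairs that were actually compared contribute.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i and wins[i][j] + wins[j][i] > 0
            )
            new_p.append(total_wins / denom if denom > 0 else p[i])
        norm = sum(new_p)
        new_p = [x / norm for x in new_p]
        shift = max(abs(a - b) for a, b in zip(new_p, p))
        p = new_p
        if shift < tol:
            break
    return p


# Three items, four comparisons per pair (illustrative data only).
wins = [
    [0, 3, 4],
    [1, 0, 3],
    [0, 1, 0],
]
strengths = bradley_terry(wins)
ranking = sorted(range(len(wins)), key=lambda i: strengths[i], reverse=True)
print(ranking)
```

An item with no wins is driven toward zero strength by this update, so real use would likely need a prior or regularization term.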

### Tactics

| Tactic | Purpose |
|--------|---------|
| adaptive-pair-selection | Iteratively select maximally informative pairs, compare, update ratings, check convergence |
| multi-judge-aggregation | Collect independent ballots from multiple judges, aggregate, identify disagreement |
| consistency-audit-loop | Detect cycles, localize inconsistencies, request corrections, recompute |
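
The `adaptive-pair-selection` tactic (select, compare, update, check convergence) might look roughly like the loop below. This is a sketch, not the campaign's implementation: it uses a standard Elo update with an assumed K-factor of 32, and the informativeness heuristic (least-compared pair first, then smallest rating gap) and the `patience` stopping rule are assumptions. The `judge` callable is a hypothetical stand-in for whatever executes a comparison.

```python
from itertools import combinations


def elo_update(r_a, r_b, a_won, k=32.0):
    """Standard Elo update for one game between a and b."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    delta = k * ((1.0 if a_won else 0.0) - expected_a)
    return r_a + delta, r_b - delta


def select_pair(ratings, counts):
    """Heuristic: least-compared pair first, then the closest ratings."""
    return min(
        combinations(sorted(ratings), 2),
        key=lambda p: (counts.get(p, 0), abs(ratings[p[0]] - ratings[p[1]])),
    )


def run_loop(items, judge, max_rounds=200, patience=10):
    """Iterate until the ordering is unchanged for `patience` rounds."""
    ratings = {item: 1500.0 for item in items}
    counts = {}
    order, stable = sorted(items), 0
    for _ in range(max_rounds):
        a, b = select_pair(ratings, counts)
        counts[(a, b)] = counts.get((a, b), 0) + 1
        ratings[a], ratings[b] = elo_update(ratings[a], ratings[b], judge(a, b))
        new_order = sorted(ratings, key=ratings.get, reverse=True)
        stable = stable + 1 if new_order == order else 0
        order = new_order
        if stable >= patience:
            break
    return order, ratings


# Hypothetical judge backed by a hidden quality score.
quality = {"a": 4, "b": 3, "c": 2, "d": 1}
order, ratings = run_loop(list(quality), lambda x, y: quality[x] > quality[y])
print(order)
```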

### SOPs

| SOP | Input | Output |
|-----|-------|--------|
| pair-selector | current_ratings, comparison_history | next_pairs[] |
| comparison-executor | pair, context | judgment(winner, confidence, reasoning) |
| rating-update | judgment, current_ratings, method | updated_ratings |
| convergence-check | rating_history | converged(bool), stability_score |
| ballot-collection | candidates[], perspectives[] | ballots[] |
| aggregation-method | ballots[], method | consensus_ranking |
| cycle-detection | comparison_matrix | cycles[], transitivity_score |
| inconsistency-localization | comparison_matrix, cycles[] | problematic_pairs[] |
| ranking-synthesis | ratings, consistency_report | final_ranking |
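
To make the `cycle-detection` row concrete, here is a minimal sketch that enumerates intransitive triples in a comparison matrix. The boolean matrix layout, the function name `detect_3cycles`, and the definition of `transitivity_score` as the fraction of triples that are not cyclic are assumptions; the SOP table does not specify them. Only cycles of length three are detected.

```python
from itertools import combinations


def detect_3cycles(beats):
    """Find intransitive triples in a pairwise preference matrix.

    beats[i][j] is True when item i was preferred over item j.
    Returns (cycles, transitivity_score), where the score is the
    assumed fraction of triples that contain no cycle.
    """
    n = len(beats)
    cycles = []
    for i, j, k in combinations(range(n), 3):
        if beats[i][j] and beats[j][k] and beats[k][i]:
            cycles.append((i, j, k))  # i > j > k > i
        elif beats[i][k] and beats[k][j] and beats[j][i]:
            cycles.append((i, k, j))  # i > k > j > i
    triples = n * (n - 1) * (n - 2) // 6
    score = 1.0 - len(cycles) / triples if triples else 1.0
    return cycles, score


# Rock-paper-scissors style preferences: 0 > 1, 1 > 2, 2 > 0.
cyclic = [
    [False, True, False],
    [False, False, True],
    [True, False, False],
]
print(detect_3cycles(cyclic))
```

The returned triples could then feed `inconsistency-localization` as candidate `problematic_pairs[]`.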

## Budget Table (M tier)

| Dimension | Threshold |
|-----------|-----------|
| Comparison pairs | >= N*log(N) pairs (N=candidate count) |
| Judge count (collective) | >=3 independent perspectives |
| Consistency check | CR < 0.1 or equivalent threshold |
| Convergence criterion | ranking stability >= 90% |
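
A quick way to sanity-check a run against these thresholds is sketched below. The table does not state the base of the logarithm, so this sketch assumes the natural log; the function names and the dictionary layout are hypothetical. The judge-count check only matters for the collective strategy.

```python
import math


def min_comparison_pairs(n_candidates):
    """Lower bound N*log(N), assuming the natural logarithm."""
    return math.ceil(n_candidates * math.log(n_candidates))


def check_m_tier_budget(n_candidates, n_pairs, n_judges,
                        consistency_ratio, stability):
    """Return a pass/fail flag for each row of the budget table."""
    return {
        "comparison_pairs": n_pairs >= min_comparison_pairs(n_candidates),
        "judge_count": n_judges >= 3,
        "consistency": consistency_ratio < 0.1,
        "convergence": stability >= 0.90,
    }


print(min_comparison_pairs(20))  # 20 * ln(20) is about 59.9, so 60 pairs
print(check_m_tier_budget(20, n_pairs=75, n_judges=3,
                          consistency_ratio=0.06, stability=0.93))
```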

## MCP Tools

- `mcp__wiki-vault__vault_search` — retrieve candidate descriptions and prior rankings
- `mcp__wiki-vault__vault_add_edge` — record ranking relationships
- `mcp__wiki-vault__vault_query_graph` — check existing preference edges

## Context Management

- State is maintained in a `ranking_state` ledger passed between tactics
- Each SOP receives only its required inputs (no full state leakage)
- Convergence check gates iteration termination
- Final synthesis produces the deliverable ranking artifact
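
One possible shape for the ledger and the "required inputs only" rule is sketched here. The `RankingState` fields mirror input names from the SOP table, but the class itself, the `call_sop` helper, and the stability definition inside `convergence_check` (share of recent snapshots whose ordering matches the latest one) are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RankingState:
    """Hypothetical ledger passed between tactics."""
    current_ratings: dict = field(default_factory=dict)
    comparison_history: list = field(default_factory=list)
    rating_history: list = field(default_factory=list)


def call_sop(sop, state, input_names):
    """Invoke an SOP with only the named ledger fields, not the whole state."""
    return sop(**{name: getattr(state, name) for name in input_names})


def convergence_check(rating_history, window=5, threshold=0.9):
    """Return (converged, stability_score) over the last `window` snapshots."""
    recent = rating_history[-window:]
    if len(recent) < window:
        return False, 0.0

    def ordering(ratings):
        return sorted(ratings, key=ratings.get, reverse=True)

    latest = ordering(recent[-1])
    matches = sum(ordering(r) == latest for r in recent[:-1])
    score = matches / (window - 1)
    return score >= threshold, score


state = RankingState(rating_history=[{"a": 1510.0, "b": 1490.0}] * 5)
# The SOP sees rating_history only; the rest of the ledger stays hidden.
print(call_sop(convergence_check, state, ["rating_history"]))
```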
