multi-judge-aggregation
$ npx mdskill add yogsoth-ai/de-anthropocentric-research-engine/multi-judge-aggregation

Collect independent ballots from multiple judges or perspectives, aggregate them into a consensus ranking using social choice theory, and surface disagreement patterns for further investigation.
SKILL.md
.github/skills/multi-judge-aggregation
---
name: multi-judge-aggregation
description: Collect independent rankings from multiple judges, aggregate using social choice methods, and identify disagreement hotspots.
execution: tactic
used-by: pairwise-ranking
---

# Multi-Judge Aggregation

Collect independent ballots from multiple judges or perspectives, aggregate them into a consensus ranking using social choice theory, and surface disagreement patterns for further investigation.

## Stages

1. **Collect**: ballot-collection gathers independent rankings from each judge/perspective
2. **Aggregate**: aggregation-method applies social choice method to produce consensus
3. **Audit**: cycle-detection checks for Condorcet cycles in the aggregated preference matrix

## Available SOPs

| Stage | SOP | Input | Output |
|-------|-----|-------|--------|
| Collect | ballot-collection | candidates[], perspectives[] | ballots[] |
| Aggregate | aggregation-method | ballots[], method | consensus_ranking |
| Audit | cycle-detection | comparison_matrix | cycles[], transitivity_score |

## Execution Guidance

- Ensure judges evaluate independently (no anchoring or information leakage)
- Use ≥3 judges for meaningful aggregation
- Default to Schulze method (satisfies many desirable social choice properties)
- Cross-validate with Borda count as sanity check
- Flag pairs where judge agreement < 60% as disagreement hotspots
- If Condorcet cycles exist, report them explicitly; do not silently resolve

## Minimum Yield

- Consensus ranking + disagreement heatmap
- Consensus ranking with method used and confidence
- Disagreement heatmap: for each pair, what fraction of judges agree
- Condorcet winner identification (or explicit statement of cycle)
- Per-judge deviation from consensus (who disagrees most, on what)
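The Aggregate and Audit stages, together with the hotspot and deviation items above, can be sketched in plain Python. This is a minimal illustration, not the skill's actual SOP implementation: every function name here is hypothetical, ballots are assumed to be complete strict rankings (best first), "judge agreement" is assumed to mean the fraction of judges on the majority side of a pair, and per-judge deviation is assumed to be the Kendall tau distance to the consensus.

```python
from itertools import combinations

def pairwise_matrix(ballots, candidates):
    """d[a][b] = number of judges who rank a above b (assumes full rankings)."""
    d = {a: {b: 0 for b in candidates if b != a} for a in candidates}
    for ballot in ballots:
        pos = {c: i for i, c in enumerate(ballot)}
        for a, b in combinations(candidates, 2):
            winner, loser = (a, b) if pos[a] < pos[b] else (b, a)
            d[winner][loser] += 1
    return d

def schulze_ranking(d, candidates):
    """Order candidates by Schulze strongest-path wins (ties keep input order)."""
    # Start from direct pairwise victories only; defeats and ties count as 0.
    p = {a: {b: (d[a][b] if d[a][b] > d[b][a] else 0) for b in d[a]}
         for a in candidates}
    # Floyd-Warshall style widest-path update through intermediate k.
    for k in candidates:
        for i in candidates:
            for j in candidates:
                if len({i, j, k}) == 3:
                    p[i][j] = max(p[i][j], min(p[i][k], p[k][j]))
    wins = {a: sum(p[a][b] > p[b][a] for b in p[a]) for a in candidates}
    return sorted(candidates, key=lambda c: -wins[c])

def borda_ranking(ballots, candidates):
    """Borda count: position i on a ballot earns (n - 1 - i) points."""
    score = {c: 0 for c in candidates}
    for ballot in ballots:
        for i, c in enumerate(ballot):
            score[c] += len(candidates) - 1 - i
    return sorted(candidates, key=lambda c: -score[c])

def condorcet_winner(d, candidates):
    """Return the candidate who beats every other head-to-head, else None."""
    for a in candidates:
        if all(d[a][b] > d[b][a] for b in d[a]):
            return a
    return None

def disagreement_hotspots(d, candidates, n_judges, threshold=0.6):
    """Pairs whose majority-side share of judges falls below the threshold."""
    hot = []
    for a, b in combinations(candidates, 2):
        agreement = max(d[a][b], d[b][a]) / n_judges
        if agreement < threshold:
            hot.append((a, b, agreement))
    return hot

def kendall_distance(ballot, consensus):
    """Number of candidate pairs the judge orders differently from consensus."""
    pos_b = {c: i for i, c in enumerate(ballot)}
    pos_c = {c: i for i, c in enumerate(consensus)}
    return sum((pos_b[a] < pos_b[b]) != (pos_c[a] < pos_c[b])
               for a, b in combinations(consensus, 2))

# Illustrative data only: four hypothetical judges ranking three candidates.
candidates = ["A", "B", "C"]
ballots = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"], ["C", "A", "B"]]

d = pairwise_matrix(ballots, candidates)
consensus = schulze_ranking(d, candidates)
borda = borda_ranking(ballots, candidates)          # sanity cross-check
winner = condorcet_winner(d, candidates)            # None signals no winner
hotspots = disagreement_hotspots(d, candidates, len(ballots))
deviation = [kendall_distance(b, consensus) for b in ballots]

print(consensus, borda, winner, hotspots, deviation)
```

A `None` result from `condorcet_winner` only says that no candidate beats all others; listing the actual cycles (the `cycles[]` output of cycle-detection) would need a separate pass over the majority graph, which this sketch leaves out.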