confidence-calibration
$ npx mdskill add yogsoth-ai/de-anthropocentric-research-engine/confidence-calibration

Calibrates confidence scores during debate progression to guide next steps
- Solves the problem of determining when to escalate, continue, or terminate a debate
- Depends on judge verdicts, confidence history, and remaining budget inputs
- Analyzes cumulative evidence and trajectory to make objective decisions
- Returns calibrated confidence, decision, reasoning, and saturation status
SKILL.md
.github/skills/confidence-calibration
---
name: confidence-calibration
description: Calibrates confidence scores based on debate progression. Determines whether to escalate, continue, or terminate based on cumulative evidence.
execution: subagent
prompt: ./prompt.md
input: round_verdicts (string), confidence_history (string), budget_remaining (string)
used-by: [multiagent-debate]
---

# Confidence Calibration

Calibrates confidence based on debate progression.

## Execution

Subagent, spawned via subagent-spawning/spawn-agent.

## Why Subagent

Calibration requires meta-analysis of debate trajectory without being anchored to any single round's outcome. Isolated context enables objective trend assessment.

## Input

- **round_verdicts**: All judge verdicts so far
- **confidence_history**: Confidence scores from each round
- **budget_remaining**: Rounds/searches remaining in budget

## Output

- **calibrated_confidence**: Updated confidence in artifact viability (0.0–1.0)
- **decision**: escalate / continue / terminate
- **reasoning**: Why this decision given the trajectory
- **saturation_flag**: Whether debate is producing diminishing returns

## Budget

One unit = one calibration assessment per round.
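
The skill itself is a prompt-driven subagent, so no implementation is published here. As a rough illustration only, the Input and Output fields above could be mirrored by a rule-based sketch like the one below. Everything beyond the four output field names is an assumption: the `calibrate` function, the thresholds (`window`, `epsilon`, `low`, `high`), the use of a recent-window mean as the calibrated score, and the mapping from trajectory to decision are all hypothetical. The sketch also assumes `confidence_history` and `budget_remaining` have already been parsed from strings into numbers, and it ignores `round_verdicts`.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Calibration:
    """Mirrors the four output fields listed in SKILL.md."""
    calibrated_confidence: float  # 0.0 to 1.0
    decision: str                 # "escalate", "continue", or "terminate"
    reasoning: str
    saturation_flag: bool


def calibrate(confidence_history, budget_remaining,
              window=3, epsilon=0.05, low=0.3, high=0.8):
    """Hypothetical rule-based stand-in for the calibration subagent.

    All thresholds are illustrative assumptions, not values from the skill.
    """
    if not confidence_history:
        raise ValueError("confidence_history must contain at least one score")

    # Use the mean of the most recent rounds so that no single round
    # dominates the calibrated score; clamp to the documented 0.0-1.0 range.
    recent = confidence_history[-window:]
    calibrated = min(1.0, max(0.0, mean(recent)))

    # Assumed saturation test: a full window of rounds barely moved.
    saturated = len(recent) >= window and (max(recent) - min(recent)) < epsilon

    if budget_remaining <= 0:
        decision, why = "terminate", "budget exhausted"
    elif saturated and (calibrated >= high or calibrated <= low):
        decision, why = "terminate", "confidence has settled at a decisive level"
    elif saturated:
        decision, why = "escalate", "debate stalled at an inconclusive level"
    else:
        decision, why = "continue", "confidence is still moving and budget remains"

    return Calibration(calibrated, decision, why, saturated)
```

Under these assumptions, a history such as `[0.5, 0.52, 0.51]` with budget left would be flagged as saturated and escalated, while `[0.2, 0.5, 0.7]` would continue because the scores are still shifting. What "escalate" triggers downstream is not specified in the skill, so the sketch leaves that to the caller.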
More from yogsoth-ai/de-anthropocentric-research-engine
- abductive-hypothesis-generation: Strategy: inference to the best explanation when facing anomalies
- ablation-brainstorm: Remove components one by one, observe system changes to reveal hidden dependencies and generate ideas from structural gaps.
- ablation-component-mapping: Map system architecture to ablatable units for ablation studies
- ablation-design: Design ablation studies to isolate component contributions in ML systems
- ablation-execution: Remove components one by one from a system, record the response/impact of each removal.
- abp-vulnerability-classification: Classify assumptions on 2 axes: load-bearing (how much the conclusion depends on it) × vulnerable (how likely it is to be false). Focuses attention on the High-Load × High-Vulnerable quadrant.
- abstraction-extraction: Extract abstract principles from concrete domain cases. Strips domain-specific details to reveal transferable mechanisms.
- abstraction-ladder: Perform bisociation at multiple abstraction levels
- abstraction-laddering: Move between concrete and abstract framings: 3 levels up (Why?) and 3 levels down (How?) to find the most productive research level.
- abstraction-to-design: Abstract biological principle to design principle. Bridge from biology to engineering.