condition-normalization
$ npx mdskill add yogsoth-ai/de-anthropocentric-research-engine/condition-normalization

Standardizes experimental conditions for fair comparison across research papers
- Solves the problem of inconsistent evaluation settings in academic papers
- Uses structured extraction and comparison of training, hardware, and hyperparameter data
- Builds difference matrices and normalization rules based on variance and impact analysis
- Delivers condition-normalized scores and structured comparison reports
SKILL.md
.github/skills/condition-normalization
---
name: condition-normalization
description: Compare and standardize experimental conditions across papers
execution: tactic
used-by: baseline-establishment
---

# Condition Normalization

## Purpose

Build a structured understanding of how experimental conditions vary across papers, then define normalization schemes that enable fair method-to-method comparison. Addresses the fundamental problem that papers evaluate under different settings, making raw score comparison misleading.

## Stages

### Stage 1: Condition Extraction

For each method-paper pair, extract all evaluation conditions:

- Training data: size, version, preprocessing, augmentation
- Hardware: GPU type, memory, training time
- Hyperparameters: learning rate schedule, batch size, epochs
- Evaluation protocol: data split version, ensembling, post-processing
- Random seeds: number of runs, seed selection, variance reported

**Yield**: Condition vectors per method-paper pair.

### Stage 2: Difference Matrix

Build a matrix showing which conditions differ across methods:

- Identify dimensions with high variance across papers
- Identify dimensions that are controlled (same across all)
- Quantify the impact of each dimension on reported scores (if literature exists)

**Yield**: Condition difference matrix with impact annotations.

### Stage 3: Normalization Scheme

Define rules for adjusting scores to common conditions:

- Compute-normalized comparison (score per FLOP)
- Data-normalized comparison (score per training example)
- Time-normalized comparison (score per GPU-hour)
- Identify which normalizations are valid vs. speculative

**Yield**: Normalization rule set with validity bounds.

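
As a rough illustration of Stages 1 to 3, the sketch below stores one condition vector per method-paper pair, marks each dimension as controlled or varying, and computes two simple ratio normalizations. The `ConditionVector` fields, the function names, and all numbers are hypothetical and chosen only for illustration; the skill itself does not prescribe a data model. Ratio normalizations such as score per GPU-hour assume the score scales roughly linearly with the resource, so they are better treated as speculative until the literature supports them.

```python
from dataclasses import dataclass

# Hypothetical subset of condition dimensions; a real catalog would hold more.
DIMENSIONS = ("train_examples", "gpu_hours", "batch_size", "data_split", "num_seeds")


@dataclass(frozen=True)
class ConditionVector:
    """Evaluation conditions for one method-paper pair (illustrative fields)."""
    method: str
    train_examples: int
    gpu_hours: float
    batch_size: int
    data_split: str
    num_seeds: int
    score: float  # reported headline metric, higher is better


def difference_matrix(vectors):
    """Map each dimension to its per-method values and whether it is controlled."""
    matrix = {}
    for dim in DIMENSIONS:
        values = {v.method: getattr(v, dim) for v in vectors}
        matrix[dim] = {
            "values": values,
            # Controlled means every method reports the same value.
            "controlled": len(set(values.values())) == 1,
        }
    return matrix


def normalized_scores(v):
    """Naive ratio normalizations (time- and data-normalized)."""
    return {
        "score_per_gpu_hour": v.score / v.gpu_hours,
        "score_per_1k_examples": v.score / (v.train_examples / 1000),
    }


# Made-up example values, not taken from any paper.
vectors = [
    ConditionVector("method_a", 50_000, 100.0, 128, "v1", 3, 80.0),
    ConditionVector("method_b", 100_000, 400.0, 128, "v1", 1, 84.0),
]

matrix = difference_matrix(vectors)
controlled = sorted(d for d, info in matrix.items() if info["controlled"])
varying = sorted(d for d, info in matrix.items() if not info["controlled"])
print("controlled:", controlled)
print("varying:", varying)
for v in vectors:
    print(v.method, normalized_scores(v))
```

In this toy input, `method_b` has the higher raw score while `method_a` is ahead on both ratios, which is the kind of reversal the normalization scheme is meant to surface.
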
### Stage 4: Fair Comparison Baseline

Apply normalization to produce fair comparison subsets:

- Controlled comparison: methods evaluated under identical conditions
- Adjusted comparison: methods with score adjustments applied
- Pareto frontier: compute-vs-performance optimal set

**Yield**: Fair comparison tables with methodology notes.

## Minimum Yield

| Metric | Floor |
|--------|-------|
| Condition dimensions cataloged | 5 |
| Methods with full condition vectors | 10 |
| Normalization rules defined | 3 |
| Fair comparison sets produced | 2 |

## SOPs Used

- condition-cataloging (for Stage 1)
- compute-normalization (for Stage 3)
- performance-table-assembly (for Stage 4)
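
The Pareto frontier named in Stage 4 can be sketched as follows: a method stays on the frontier if no other method achieves an equal or higher score with equal or less compute. The function name `pareto_frontier`, the tuple layout, and the sample numbers are assumptions made for illustration, not part of the skill.

```python
def pareto_frontier(points):
    """Return names of methods not dominated on (lower compute, higher score).

    points: iterable of (name, compute, score) tuples; compute can be FLOPs
    or GPU-hours as long as the unit is the same for every method.
    """
    frontier = []
    best_score = float("-inf")
    # Scan in order of increasing compute; on ties, the higher score comes first.
    for name, compute, score in sorted(points, key=lambda p: (p[1], -p[2])):
        # A method is kept only if it beats every cheaper method's score.
        if score > best_score:
            frontier.append(name)
            best_score = score
    return frontier


# Made-up (name, GPU-hours, score) values for illustration only.
methods = [
    ("method_a", 100.0, 80.0),
    ("method_b", 400.0, 84.0),
    ("method_c", 300.0, 79.0),
    ("method_d", 50.0, 70.0),
]
frontier = pareto_frontier(methods)
print(frontier)  # ['method_d', 'method_a', 'method_b']
```

Here `method_c` is dropped because `method_a` reaches a higher score with less compute. Exact duplicates (same compute and same score) keep only the first entry, which is a simplification of this sketch.
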