---
name: adaptive-wfo-epoch
description: Adaptive epoch selection for Walk-Forward Optimization. TRIGGERS - WFO epoch, epoch selection, WFE optimization, overfitting epochs.
allowed-tools: Read, Grep, Glob, Bash
---
# Adaptive Walk-Forward Epoch Selection (AWFES)
Machine-readable reference for adaptive epoch selection within Walk-Forward Optimization (WFO). Optimizes training epochs per-fold using Walk-Forward Efficiency (WFE) as the objective.
> **Self-Evolving Skill**: This skill improves through use. If instructions are wrong, parameters drifted, or a workaround was needed — fix this file immediately, don't defer. Only update for real, reproducible issues.
## When to Use This Skill
Use this skill when:
- Selecting optimal training epochs for ML models in WFO
- Avoiding overfitting via Walk-Forward Efficiency metrics
- Implementing per-fold adaptive epoch selection
- Computing efficient frontiers for epoch-performance trade-offs
- Carrying epoch priors across WFO folds
## Quick Start
```python
from adaptive_wfo_epoch import AWFESConfig, compute_efficient_frontier

# Generate epoch candidates from search bounds and granularity
config = AWFESConfig.from_search_space(
    min_epoch=100,
    max_epoch=2000,
    granularity=5,  # Number of frontier points
)
# config.epoch_configs → [100, 211, 447, 945, 2000] (log-spaced)

# Per-fold epoch sweep
for fold in wfo_folds:
    epoch_metrics = []
    for epoch in config.epoch_configs:
        is_sharpe, oos_sharpe = train_and_evaluate(fold, epochs=epoch)
        wfe = config.compute_wfe(is_sharpe, oos_sharpe, n_samples=len(fold.train))
        epoch_metrics.append({"epoch": epoch, "wfe": wfe, "is_sharpe": is_sharpe})

    # Select from efficient frontier
    selected_epoch = compute_efficient_frontier(epoch_metrics)

    # Carry forward to next fold as prior
    prior_epoch = selected_epoch
```
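The log-spaced candidates shown in the Quick Start comment can be reproduced with a geometric progression. This is a minimal sketch, not the skill's actual `from_search_space()` implementation: `log_spaced_epochs` is a hypothetical helper, and truncating to integers is an assumption chosen only because it matches the values listed above.

```python
def log_spaced_epochs(min_epoch: int, max_epoch: int, granularity: int) -> list[int]:
    """Return `granularity` epoch counts spaced evenly on a log scale.

    Hypothetical helper for illustration; the real AWFESConfig may round differently.
    """
    if granularity < 2:
        raise ValueError("granularity must be at least 2")
    if not 0 < min_epoch < max_epoch:
        raise ValueError("require 0 < min_epoch < max_epoch")

    ratio = max_epoch / min_epoch
    epochs = [
        int(min_epoch * ratio ** (i / (granularity - 1)))  # truncation is an assumption
        for i in range(granularity)
    ]
    # Pin the endpoints so floating-point drift cannot move the search bounds.
    epochs[0], epochs[-1] = min_epoch, max_epoch
    return epochs


print(log_spaced_epochs(100, 2000, 5))  # [100, 211, 447, 945, 2000]
```

Log spacing puts more candidates at low epoch counts, where an extra hundred epochs changes the fit far more than it does near the upper bound.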
## Methodology Overview
### What This Is
Per-fold adaptive epoch selection where:
1. Train models across a range of epochs (e.g., 400, 800, 1000, 2000)
2. Compute WFE = OOS_Sharpe / IS_Sharpe for each epoch count
3. Find the "efficient frontier" - epochs maximizing WFE vs training cost
4. Select optimal epoch from frontier for OOS evaluation
5. Carry forward as prior for next fold
### What This Is NOT
- **NOT early stopping**: Early stopping monitors validation loss continuously; this evaluates discrete candidates post-hoc
- **NOT Bayesian optimization**: No surrogate model; direct evaluation of all candidates
- **NOT nested cross-validation**: Uses temporal WFO, not shuffled splits
## Academic Foundations
| Concept | Citation | Key Insight |
| --------------------------- | ------------------------------ | ------------------------------------------------- |
| Walk-Forward Efficiency | Pardo (1992, 2008) | WFE = OOS_Return / IS_Return as robustness metric |
| Deflated Sharpe Ratio | Bailey & López de Prado (2014) | Adjusts for multiple testing |
| Pareto-Optimal HP Selection | Bischl et al. (2023) | Multi-objective hyperparameter optimization |
| Warm-Starting | Nomura & Ono (2021) | Transfer knowledge between optimization runs |
See [references/academic-foundations.md](./references/academic-foundations.md) for full literature review.
## Core Formula: Walk-Forward Efficiency
```python
def compute_wfe(
    is_sharpe: float,
    oos_sharpe: float,
    n_samples: int | None = None,
) -> float | None:
    """Walk-Forward Efficiency - measures performance transfer.

    WFE = OOS_Sharpe / IS_Sharpe

    Interpretation (guidelines, not hard thresholds):
    - WFE ≥ 0.70: Excellent transfer (low overfitting)
    - WFE 0.50-0.70: Good transfer
    - WFE 0.30-0.50: Moderate transfer (investigate)
    - WFE < 0.30: Severe overfitting (likely reject)

    The IS_Sharpe minimum is derived from signal-to-noise ratio,
    not a fixed magic number. See compute_is_sharpe_threshold().

    Reference: Pardo (2008) "The Evaluation and Optimization of Trading Strategies"
    """
    # Data-driven threshold: IS_Sharpe must exceed 2σ noise floor
    min_is_sharpe = compute_is_sharpe_threshold(n_samples) if n_samples else 0.1
    if abs(is_sharpe) < min_is_sharpe:
        return None
    return oos_sharpe / is_sharpe
```
## Principled Configuration Framework
All parameters are derived from first principles or data characteristics. `AWFESConfig` provides unified configuration with log-spaced epoch generation, Bayesian variance derivation from search space, and market-specific annualization factors.
See [references/configuration-framework.md](./references/configuration-framework.md) for the full `AWFESConfig` class and `compute_is_sharpe_threshold()` implementation.
## Guardrails (Principled Guidelines)
- **G1: WFE Thresholds** - 0.30 (reject), 0.50 (warning), 0.70 (target) based on practitioner consensus
- **G2: IS_Sharpe Minimum** - Data-driven threshold: `2/sqrt(n)` adapts to sample size
- **G3: Stability Penalty** - Adaptive threshold derived from WFE variance prevents epoch churn
- **G4: DSR Adjustment** - Deflated Sharpe corrects for epoch selection multiplicity via Gumbel distribution
See [references/guardrails.md](./references/guardrails.md) for full implementations of all guardrails.
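Guardrails G1 and G2 can be illustrated in a few lines. This is a sketch under stated assumptions: `is_sharpe_threshold` and `classify_wfe` are hypothetical names (the skill's own function is `compute_is_sharpe_threshold()`, whose full implementation lives in the reference file), and the band labels are taken from the `compute_wfe` docstring.

```python
from __future__ import annotations

import math


def is_sharpe_threshold(n_samples: int) -> float:
    """G2 noise floor, 2 / sqrt(n): larger samples tolerate a smaller IS Sharpe."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return 2.0 / math.sqrt(n_samples)


def classify_wfe(wfe: float | None) -> str:
    """Map a WFE value onto the G1 guideline bands (guidelines, not hard rules)."""
    if wfe is None:
        return "invalid"    # IS Sharpe was below the noise floor
    if wfe < 0.30:
        return "severe"     # likely reject
    if wfe < 0.50:
        return "moderate"   # investigate
    if wfe < 0.70:
        return "good"
    return "excellent"


print(is_sharpe_threshold(400))   # 0.1
print(classify_wfe(0.62))         # good
```

With 400 training samples the floor is 0.1, which coincides with the fallback constant used in `compute_wfe` when `n_samples` is not supplied.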
## WFE Aggregation Methods
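Under the null hypothesis of no true edge (zero expected Sharpe both in-sample and out-of-sample), WFE is a ratio of two zero-mean, approximately normal estimates, so it approximately follows a **Cauchy distribution**, which has no defined mean. A plain arithmetic mean of per-fold WFE values is therefore unreliable. Always prefer median or pooled methods: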
- **Pooled WFE**: Precision-weighted by sample size (best for variable fold sizes)
- **Median WFE**: Robust to outliers (best for suspected regime changes)
- **Weighted Mean**: Inverse-variance weighting (best for homogeneous folds)
See [references/wfe-aggregation.md](./references/wfe-aggregation.md) for implementations and selection guide.
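The difference between these aggregators is easiest to see with one fold whose IS Sharpe is close to zero. The sketch below is illustrative only: `median_wfe` and `pooled_wfe` are hypothetical names, and the pooled form shown (sample-size-weighted OOS Sharpe divided by sample-size-weighted IS Sharpe) is an assumed reading of "precision-weighted by sample size", not necessarily the formula in the reference file.

```python
from statistics import mean, median


def median_wfe(wfes: list[float]) -> float:
    """Median of per-fold WFE values; insensitive to a single exploding ratio."""
    return median(wfes)


def pooled_wfe(
    is_sharpes: list[float],
    oos_sharpes: list[float],
    n_samples: list[int],
) -> float:
    """Pool numerator and denominator before dividing (assumed definition)."""
    numerator = sum(n * oos for n, oos in zip(n_samples, oos_sharpes))
    denominator = sum(n * is_ for n, is_ in zip(n_samples, is_sharpes))
    if denominator == 0:
        raise ZeroDivisionError("pooled IS Sharpe is zero")
    return numerator / denominator


# Three equally sized folds; the third has a near-zero IS Sharpe.
is_sharpes = [1.0, 2.0, 0.05]
oos_sharpes = [0.6, 1.0, 0.5]
n_samples = [100, 100, 100]
per_fold = [oos / is_ for is_, oos in zip(is_sharpes, oos_sharpes)]

print(round(mean(per_fold), 2))                                   # 3.7
print(median_wfe(per_fold))                                       # 0.6
print(round(pooled_wfe(is_sharpes, oos_sharpes, n_samples), 3))   # 0.689
```

The arithmetic mean is dominated by the one fold with a tiny denominator, while the median and the pooled ratio both stay near the typical per-fold value.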
## Efficient Frontier Algorithm
Pareto-optimal epoch selection: an epoch is on the frontier if no other epoch dominates it (better WFE AND lower training time). The `AdaptiveEpochSelector` class maintains state across folds with adaptive stability penalties.
See [references/efficient-frontier.md](./references/efficient-frontier.md) for the full algorithm and carry-forward mechanism.
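The dominance rule can be sketched as a plain filter over the per-epoch metrics. Assumptions in this example: epoch count stands in for training time, the dictionaries follow the `{"epoch": ..., "wfe": ...}` shape from the Quick Start, `pareto_frontier` is a hypothetical name, and dominance is the standard Pareto form (no worse on both axes, strictly better on at least one). It omits the stability penalty and cross-fold state kept by `AdaptiveEpochSelector`.

```python
def pareto_frontier(candidates: list[dict]) -> list[dict]:
    """Return the candidates that no other candidate dominates.

    A candidate is dominated when another one has WFE at least as high and
    epoch count at least as low, with a strict improvement on one of the two.
    Entries whose WFE is None (invalid IS Sharpe) are ignored.
    """
    valid = [c for c in candidates if c["wfe"] is not None]
    frontier = []
    for c in valid:
        dominated = any(
            o["wfe"] >= c["wfe"]
            and o["epoch"] <= c["epoch"]
            and (o["wfe"] > c["wfe"] or o["epoch"] < c["epoch"])
            for o in valid
        )
        if not dominated:
            frontier.append(c)
    return sorted(frontier, key=lambda c: c["epoch"])


metrics = [
    {"epoch": 100, "wfe": 0.40},
    {"epoch": 211, "wfe": 0.55},
    {"epoch": 447, "wfe": 0.62},
    {"epoch": 945, "wfe": 0.60},   # dominated by 447: lower WFE at higher cost
    {"epoch": 2000, "wfe": 0.48},  # dominated by 211 and 447
]
print([c["epoch"] for c in pareto_frontier(metrics)])  # [100, 211, 447]
```

Choosing a single epoch from the frontier is a separate decision (for example, the highest WFE, or the cheapest point above a WFE guideline); the sketch stops at the frontier itself.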
## Anti-Patterns
| Anti-Pattern | Symptom | Fix | Severity |
| --------------------------------- | ----------------------------------- | --------------------------------- | -------- |
| **Expanding window (range bars)** | Train size grows per fold | Use fixed sliding window | CRITICAL |
| **Peak picking** | Best epoch always at sweep boundary | Expand range, check for plateau | HIGH |
| **Insufficient folds** | effective_n < 30 | Increase folds or data span | HIGH |
| **Ignoring temporal autocorr** | Folds correlated | Use purged CV, gap between folds | HIGH |
| **Overfitting to IS** | IS >> OOS Sharpe | Reduce epochs, add regularization | HIGH |
| **sqrt(252) for crypto**          | Inflated Sharpe                     | Use sqrt(365) (daily to annual) or sqrt(7) (daily to weekly) | MEDIUM   |
| **Single epoch selection** | No uncertainty quantification | Report confidence interval | MEDIUM |
| **Meta-overfitting** | Epoch selection itself overfits | Limit to 3-4 candidates max | HIGH |
**CRITICAL**: Never use expanding window for range bar ML training. See [references/anti-patterns.md](./references/anti-patterns.md) for the full analysis (Section 7).
## Decision Tree
See [references/epoch-selection-decision-tree.md](./references/epoch-selection-decision-tree.md) for the full practitioner decision tree.
```
Start
│
├─ IS_Sharpe > compute_is_sharpe_threshold(n)? ──NO──> Mark WFE invalid, use fallback
│ │ (threshold = 2/√n, adapts to sample size)
│ YES
│ │
├─ Compute WFE for each epoch
│ │
├─ Any WFE > 0.30? ──NO──> REJECT all epochs (severe overfit)
│ │ (guideline, not hard threshold)
│ YES
│ │
├─ Compute efficient frontier
│ │
├─ Apply AdaptiveStabilityPenalty
│ │ (threshold derived from WFE variance)
└─> Return selected epoch
```
## Integration with rangebar-eval-metrics
This skill extends [rangebar-eval-metrics](../rangebar-eval-metrics/SKILL.md):
| Metric Source | Used For | Reference |
| --------------------- | ---------------------------------------- | ---------------------------------------------------------------------------------------- |
| `sharpe_tw` | WFE numerator (OOS) and denominator (IS) | [range-bar-metrics.md](./references/range-bar-metrics.md) |
| `n_bars` | Sample size for aggregation weights | [metrics-schema.md](../rangebar-eval-metrics/references/metrics-schema.md) |
| `psr`, `dsr` | Final acceptance criteria | [sharpe-formulas.md](../rangebar-eval-metrics/references/sharpe-formulas.md) |
| `prediction_autocorr` | Validate model isn't collapsed | [ml-prediction-quality.md](../rangebar-eval-metrics/references/ml-prediction-quality.md) |
| `is_collapsed` | Model health check | [ml-prediction-quality.md](../rangebar-eval-metrics/references/ml-prediction-quality.md) |
| Extended risk metrics | Deep risk analysis (optional) | [risk-metrics.md](../rangebar-eval-metrics/references/risk-metrics.md) |
### Recommended Workflow
1. **Compute base metrics** using `rangebar-eval-metrics:compute_metrics.py`
2. **Feed to AWFES** for epoch selection with `sharpe_tw` as primary signal
3. **Validate** with `psr > 0.85` and `dsr > 0.50` before deployment
4. **Monitor** `is_collapsed` and `prediction_autocorr` for model health
---
## OOS Application Phase
AWFES uses **Nested WFO** with three data splits per fold (Train 60% / Val 20% / Test 20%) with 6% embargo gaps at each boundary. The per-fold workflow is: epoch sweep on train, WFE computation on validation, final model training on train+val using the epoch carried forward from prior folds, evaluation on test, and only then the Bayesian update with this fold's validation result (see Look-Ahead Bias Prevention).
See [references/oos-workflow.md](./references/oos-workflow.md) for the complete workflow with diagrams, `BayesianEpochSelector` class, and `apply_awfes_to_test()` implementation. Also see [references/oos-application.md](./references/oos-application.md) for the extended reference.
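One way to turn the split percentages into index ranges is sketched below. The source does not say where the embargo rows come from, so this example makes a loud assumption: each 6% gap is carved out of the start of the later segment, which keeps the 60/20/20 boundaries fixed and shortens validation and test. `nested_split` and its integer-percent parameters are hypothetical.

```python
def nested_split(
    n_rows: int,
    train_pct: int = 60,
    val_pct: int = 20,
    embargo_pct: int = 6,
) -> tuple[range, range, range]:
    """Return (train, val, test) index ranges for one fold, in time order.

    ASSUMPTION: embargo rows are dropped from the start of val and of test.
    Integer percentages avoid floating-point boundary errors.
    """
    gap = n_rows * embargo_pct // 100
    train_end = n_rows * train_pct // 100
    val_end = n_rows * (train_pct + val_pct) // 100
    if train_end + gap >= val_end or val_end + gap >= n_rows:
        raise ValueError("embargo leaves an empty validation or test segment")

    train = range(0, train_end)
    val = range(train_end + gap, val_end)   # skip the embargo after train
    test = range(val_end + gap, n_rows)     # skip the embargo after val
    return train, val, test


train, val, test = nested_split(1000)
print(train, val, test)  # range(0, 600) range(660, 800) range(860, 1000)
```

If the skill instead sizes the fold so that the three segments keep their full 60/20/20 share, the boundaries shift accordingly; the reference file is the authority on that detail.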
## Epoch Smoothing Methods
Bayesian updating (recommended) provides principled, uncertainty-aware smoothing. Alternatives include EMA and SMA. Initialization via `AWFESConfig.from_search_space()` derives variances from the epoch range automatically.
See [references/epoch-smoothing-methods.md](./references/epoch-smoothing-methods.md) for all methods, formulas, and initialization strategies. See [references/epoch-smoothing.md](./references/epoch-smoothing.md) for extended mathematical analysis.
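The two families of smoothing can be compared with a short sketch. It assumes the textbook normal-normal conjugate update for the Bayesian case, which may differ from the skill's `BayesianEpochSelector`; the function names and the variance values in the example are illustrative, not taken from the source.

```python
def bayesian_epoch_update(
    prior_mean: float,
    prior_var: float,
    observed_epoch: float,
    obs_var: float,
) -> tuple[float, float]:
    """Normal-normal conjugate update: a precision-weighted average.

    Returns the posterior mean and variance of the epoch estimate.
    """
    if prior_var <= 0 or obs_var <= 0:
        raise ValueError("variances must be positive")
    precision = 1.0 / prior_var + 1.0 / obs_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var + observed_epoch / obs_var)
    return post_mean, post_var


def ema_epoch_update(prev_epoch: float, observed_epoch: float, alpha: float = 0.3) -> float:
    """Exponential moving average alternative; alpha is a free tuning parameter."""
    return alpha * observed_epoch + (1.0 - alpha) * prev_epoch


# Equal prior and observation variance: the posterior lands halfway between them.
mean_, var_ = bayesian_epoch_update(400.0, 10_000.0, 800.0, 10_000.0)
print(round(mean_), round(var_))  # 600 5000
```

Unlike EMA, the Bayesian form also returns a variance, so the weight given to each new fold shrinks automatically as evidence accumulates.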
## OOS Metrics Specification
Three-tier metric hierarchy for test evaluation:
- **Tier 1 (Primary)**: `sharpe_tw`, `hit_rate`, `cumulative_pnl`, `positive_sharpe_folds`, `wfe_test`
- **Tier 2 (Risk)**: `max_drawdown`, `calmar_ratio`, `profit_factor`, `cvar_10pct`
- **Tier 3 (Statistical)**: `psr`, `dsr`, `binomial_pvalue`, `hac_ttest_pvalue`
See [references/oos-metrics-implementation.md](./references/oos-metrics-implementation.md) for full metric tables, `compute_oos_metrics()`, and fold aggregation code. See [references/oos-metrics.md](./references/oos-metrics.md) for threshold justifications.
## Look-Ahead Bias Prevention
**CRITICAL (v3 fix)**: TEST must use `prior_bayesian_epoch` (from prior folds only), NOT `val_optimal_epoch`. The Bayesian update happens AFTER test evaluation, ensuring information flows only from past to present.
See [references/look-ahead-bias-v3.md](./references/look-ahead-bias-v3.md) for the v3 fix details, embargo requirements, validation checklist, and anti-patterns. See [references/look-ahead-bias.md](./references/look-ahead-bias.md) for detailed examples.
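The ordering requirement can be made concrete with a small loop. This sketch only demonstrates the sequencing: `run_folds` is a hypothetical function, the per-fold validation optima are passed in as plain numbers, and the 50/50 blend is a placeholder for whatever update rule is actually used.

```python
def run_folds(val_optimal_epochs: list[float], initial_prior: float) -> list[float]:
    """Return the epoch used for TEST in each fold.

    The test epoch for fold k depends only on folds before k.
    """
    prior_epoch = initial_prior
    test_epochs = []
    for val_optimal_epoch in val_optimal_epochs:
        # 1. TEST uses the prior built from earlier folds only.
        test_epochs.append(prior_epoch)
        # 2. Only after test evaluation does this fold's validation
        #    optimum update the prior (placeholder update rule).
        prior_epoch = 0.5 * prior_epoch + 0.5 * val_optimal_epoch
    return test_epochs


print(run_folds([800, 400, 1000], initial_prior=500))  # [500, 650.0, 525.0]
```

A quick self-check for any implementation: changing the validation optimum of the last fold must leave every test epoch unchanged, because that information is only available after the last test evaluation.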
---
## References
| Topic | Reference File |
| ------------------------ | --------------------------------------------------------------------------------- |
| Academic Literature | [academic-foundations.md](./references/academic-foundations.md) |
| Mathematical Formulation | [mathematical-formulation.md](./references/mathematical-formulation.md) |
| Configuration Framework | [configuration-framework.md](./references/configuration-framework.md) |
| Guardrails | [guardrails.md](./references/guardrails.md) |
| WFE Aggregation | [wfe-aggregation.md](./references/wfe-aggregation.md) |
| Efficient Frontier | [efficient-frontier.md](./references/efficient-frontier.md) |
| Decision Tree | [epoch-selection-decision-tree.md](./references/epoch-selection-decision-tree.md) |
| Anti-Patterns | [anti-patterns.md](./references/anti-patterns.md) |
| OOS Workflow | [oos-workflow.md](./references/oos-workflow.md) |
| OOS Application | [oos-application.md](./references/oos-application.md) |
| Epoch Smoothing Methods | [epoch-smoothing-methods.md](./references/epoch-smoothing-methods.md) |
| Epoch Smoothing Analysis | [epoch-smoothing.md](./references/epoch-smoothing.md) |
| OOS Metrics Impl | [oos-metrics-implementation.md](./references/oos-metrics-implementation.md) |
| OOS Metrics Thresholds | [oos-metrics.md](./references/oos-metrics.md) |
| Look-Ahead Bias (v3) | [look-ahead-bias-v3.md](./references/look-ahead-bias-v3.md) |
| Look-Ahead Bias Examples | [look-ahead-bias.md](./references/look-ahead-bias.md) |
| **Feature Sets** | [feature-sets.md](./references/feature-sets.md) |
| **xLSTM Implementation** | [xlstm-implementation.md](./references/xlstm-implementation.md) |
| **Range Bar Metrics** | [range-bar-metrics.md](./references/range-bar-metrics.md) |
| Troubleshooting | [troubleshooting.md](./references/troubleshooting.md) |
### Related Skills
| Skill | Relationship |
| -------------------------------------------------------------------------------- | --------------------------------------------------- |
| [sharpe-ratio-non-iid-corrections](../sharpe-ratio-non-iid-corrections/SKILL.md) | Generalized Sharpe variance, DSR for WFE validation |
| [opendeviation-eval-metrics](../opendeviation-eval-metrics/SKILL.md) | Metric definitions consumed by WFE |
## Full Citations
- Bailey, D. H., & López de Prado, M. (2014). The deflated Sharpe ratio: Correcting for selection bias, backtest overfitting and non-normality. _The Journal of Portfolio Management_, 40(5), 94-107.
- Bischl, B., et al. (2023). Multi-Objective Hyperparameter Optimization in Machine Learning. _ACM Transactions on Evolutionary Learning and Optimization_.
- López de Prado, M. (2018). _Advances in Financial Machine Learning_. Wiley. Chapter 7.
- Nomura, M., & Ono, I. (2021). Warm Starting CMA-ES for Hyperparameter Optimization. _AAAI Conference on Artificial Intelligence_.
- Pardo, R. E. (2008). _The Evaluation and Optimization of Trading Strategies, 2nd Edition_. John Wiley & Sons.
## Post-Execution Reflection
After this skill completes, check before closing:
1. **Did the command succeed?** — If not, fix the instruction or error table that caused the failure.
2. **Did parameters or output change?** — If the underlying tool's interface drifted, update Usage examples and Parameters table to match.
3. **Was a workaround needed?** — If you had to improvise (different flags, extra steps), update this SKILL.md so the next invocation doesn't need the same workaround.
Only update if the issue is real and reproducible — not speculative.