mlflow-python
$
npx mdskill add terrylica/cc-skills/mlflow-pythonTracks MLflow experiments and logs backtest metrics using Python API
- Logs backtest results and strategy parameters for MLflow experiments
- Uses MLflow Python API and QuantStats for financial metrics
- Triggers based on keywords like 'log backtest' or 'search runs'
- Returns structured experiment data and run metrics to the agent
SKILL.md
.github/skills/mlflow-pythonView on GitHub ↗
---
name: mlflow-python
description: MLflow experiment tracking via Python API. TRIGGERS - MLflow metrics, log backtest, experiment tracking, search runs.
allowed-tools: Read, Bash, Grep, Glob
---
# MLflow Python Skill
Unified read/write MLflow operations via Python API with QuantStats integration for comprehensive trading metrics.
**ADR**: [2025-12-12-mlflow-python-skill](/docs/adr/2025-12-12-mlflow-python-skill.md)
> **Note**: This skill uses Pandas (MLflow API requires it). The `mlflow-python` path is auto-skipped by the Polars preference hook.
> **Self-Evolving Skill**: This skill improves through use. If instructions are wrong, parameters drifted, or a workaround was needed — fix this file immediately, don't defer. Only update for real, reproducible issues.
## When to Use This Skill
**CAN Do**:
- Log backtest metrics (Sharpe, max_drawdown, total_return, etc.)
- Log experiment parameters (strategy config, timeframes)
- Create and manage experiments
- Query runs with SQL-like filtering
- Calculate 70+ trading metrics via QuantStats
- Retrieve metric history (time-series data)
**CANNOT Do**:
- Direct database access to MLflow backend
- Artifact storage management (S3/GCS configuration)
- MLflow server administration
## Prerequisites
### Authentication Setup
MLflow uses separate environment variables for credentials (NOT embedded in URI):
```bash
# Option 1: mise + .env.local (recommended)
# Create .env.local in skill directory with:
MLFLOW_TRACKING_URI=http://mlflow.eonlabs.com:5000
MLFLOW_TRACKING_USERNAME=eonlabs
MLFLOW_TRACKING_PASSWORD=<password>
# Option 2: Direct environment variables
export MLFLOW_TRACKING_URI="http://mlflow.eonlabs.com:5000"
export MLFLOW_TRACKING_USERNAME="eonlabs"
export MLFLOW_TRACKING_PASSWORD="<password>"
```
### Verify Connection
```bash
/usr/bin/env bash << 'SKILL_SCRIPT_EOF'
cd ${CLAUDE_PLUGIN_ROOT}/skills/mlflow-python
uv run scripts/query_experiments.py experiments
SKILL_SCRIPT_EOF
```
## Quick Start Workflows
### A. Log Backtest Results (Primary Use Case)
```bash
/usr/bin/env bash << 'SKILL_SCRIPT_EOF_2'
cd ${CLAUDE_PLUGIN_ROOT}/skills/mlflow-python
uv run scripts/log_backtest.py \
--experiment "crypto-backtests" \
--run-name "btc_momentum_v2" \
--returns path/to/returns.csv \
--params '{"strategy": "momentum", "timeframe": "1h"}'
SKILL_SCRIPT_EOF_2
```
### B. Search Experiments
```bash
uv run scripts/query_experiments.py experiments
```
### C. Query Runs with Filter
```bash
uv run scripts/query_experiments.py runs \
--experiment "crypto-backtests" \
--filter "metrics.sharpe_ratio > 1.5" \
--order-by "metrics.sharpe_ratio DESC"
```
### D. Create New Experiment
```bash
uv run scripts/create_experiment.py \
--name "crypto-backtests-2025" \
--description "Q1 2025 cryptocurrency trading strategy backtests"
```
### E. Get Metric History
```bash
uv run scripts/get_metric_history.py \
--run-id abc123 \
--metrics sharpe_ratio,cumulative_return
```
## QuantStats Metrics Available
The `log_backtest.py` script calculates 70+ metrics via QuantStats, including:
| Category | Metrics |
| ------------ | ----------------------------------------------------------------- |
| **Ratios** | sharpe, sortino, calmar, omega, treynor |
| **Returns** | cagr, total_return, avg_return, best, worst |
| **Drawdown** | max_drawdown, avg_drawdown, drawdown_days |
| **Trade** | win_rate, profit_factor, payoff_ratio, consecutive_wins/losses |
| **Risk** | volatility, var, cvar, ulcer_index, serenity_index |
| **Advanced** | kelly_criterion, recovery_factor, risk_of_ruin, information_ratio |
See [quantstats-metrics.md](./references/quantstats-metrics.md) for full list.
## Bundled Scripts
| Script | Purpose |
| ----------------------- | -------------------------------------------- |
| `log_backtest.py` | Log backtest returns with QuantStats metrics |
| `query_experiments.py` | Search experiments and runs (replaces CLI) |
| `create_experiment.py` | Create new experiment with metadata |
| `get_metric_history.py` | Retrieve metric time-series data |
## Configuration
The skill uses mise `[env]` pattern for configuration. See `.mise.toml` for defaults.
Create `.env.local` (gitignored) for credentials:
```bash
MLFLOW_TRACKING_URI=http://mlflow.eonlabs.com:5000
MLFLOW_TRACKING_USERNAME=eonlabs
MLFLOW_TRACKING_PASSWORD=<password>
```
## Reference Documentation
- [Authentication Patterns](./references/authentication.md) - Idiomatic MLflow auth
- [QuantStats Metrics](./references/quantstats-metrics.md) - Full list of 70+ metrics
- [Query Patterns](./references/query-patterns.md) - DataFrame operations
- [Migration from CLI](./references/migration-from-cli.md) - CLI to Python API mapping
## Migration from mlflow-query
This skill replaces the CLI-based `mlflow-query` skill. Key differences:
| Feature | mlflow-query (old) | mlflow-python (new) |
| -------------- | ------------------ | ---------------------- |
| Log metrics | Not supported | `mlflow.log_metrics()` |
| Log params | Not supported | `mlflow.log_params()` |
| Query runs | CLI text parsing | DataFrame output |
| Metric history | Workaround only | Native support |
| Auth pattern | Embedded in URI | Separate env vars |
See [migration-from-cli.md](./references/migration-from-cli.md) for detailed mapping.
---
## Troubleshooting
| Issue | Cause | Solution |
| ----------------------- | ---------------------------- | --------------------------------------------------- |
| Connection refused | MLflow server not running | Verify MLFLOW_TRACKING_URI and server status |
| Authentication failed | Wrong credentials | Check MLFLOW_TRACKING_USERNAME and PASSWORD in .env |
| Experiment not found | Experiment name typo | Run `query_experiments.py experiments` to list all |
| QuantStats import error | Missing dependency | `uv add quantstats` in skill directory |
| Pandas import warning | Expected for this skill | Ignore - MLflow requires Pandas (hook-excluded) |
| Run creation fails | Experiment doesn't exist | Use `create_experiment.py` to create first |
| Metric history empty | Wrong run_id or metric name | Verify run_id with `query_experiments.py runs` |
| Returns CSV parse error | Wrong date format or columns | Check CSV has date index and returns column |
## Post-Execution Reflection
After this skill completes, check before closing:
1. **Did the command succeed?** — If not, fix the instruction or error table that caused the failure.
2. **Did parameters or output change?** — If the underlying tool's interface drifted, update Usage examples and Parameters table to match.
3. **Was a workaround needed?** — If you had to improvise (different flags, extra steps), update this SKILL.md so the next invocation doesn't need the same workaround.
Only update if the issue is real and reproducible — not speculative.