aeon
$ npx mdskill add K-Dense-AI/scientific-agent-skills/aeon
Classify, forecast, and detect anomalies in temporal data sequences.
- Enables classification, regression, clustering, forecasting, anomaly detection, segmentation, and similarity search on time-indexed observations.
- Integrates with scikit-learn compatible APIs for univariate and multivariate time series analysis.
- Executes specialized algorithms beyond standard ML approaches for temporal data and sequential patterns.
- Delivers results through Python toolkit execution with state-of-the-art temporal machine learning algorithms.
SKILL.md
.github/skills/aeon
---
name: aeon
description: This skill should be used for time series machine learning tasks including classification, regression, clustering, forecasting, anomaly detection, segmentation, and similarity search. Use when working with temporal data, sequential patterns, or time-indexed observations requiring specialized algorithms beyond standard ML approaches. Particularly suited for univariate and multivariate time series analysis with scikit-learn compatible APIs.
license: BSD-3-Clause license
metadata:
skill-author: K-Dense Inc.
---
# Aeon Time Series Machine Learning
## Overview
Aeon is a scikit-learn compatible Python toolkit for time series machine learning. It provides state-of-the-art algorithms for classification, regression, clustering, forecasting, anomaly detection, segmentation, and similarity search.
## When to Use This Skill
Apply this skill when:
- Classifying or predicting from time series data
- Detecting anomalies or change points in temporal sequences
- Clustering similar time series patterns
- Forecasting future values
- Finding repeated patterns (motifs) or unusual subsequences (discords)
- Comparing time series with specialized distance metrics
- Extracting features from temporal data
## Installation
```bash
uv pip install aeon
```
## Core Capabilities
### 1. Time Series Classification
Categorize time series into predefined classes. See `references/classification.md` for the complete algorithm catalog.
**Quick Start:**
```python
from aeon.classification.convolution_based import RocketClassifier
from aeon.datasets import load_classification
# Load data
X_train, y_train = load_classification("GunPoint", split="train")
X_test, y_test = load_classification("GunPoint", split="test")
# Train classifier
clf = RocketClassifier(n_kernels=10000)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
```
**Algorithm Selection:**
- **Speed + Performance**: `MiniRocketClassifier`, `Arsenal`
- **Maximum Accuracy**: `HIVECOTEV2`, `InceptionTimeClassifier`
- **Interpretability**: `ShapeletTransformClassifier`, `Catch22Classifier`
- **Small Datasets**: `KNeighborsTimeSeriesClassifier` with DTW distance
### 2. Time Series Regression
Predict continuous values from time series. See `references/regression.md` for algorithms.
**Quick Start:**
```python
from aeon.regression.convolution_based import RocketRegressor
from aeon.datasets import load_regression
X_train, y_train = load_regression("Covid3Month", split="train")
X_test, y_test = load_regression("Covid3Month", split="test")
reg = RocketRegressor()
reg.fit(X_train, y_train)
predictions = reg.predict(X_test)
```
### 3. Time Series Clustering
Group similar time series without labels. See `references/clustering.md` for methods.
**Quick Start:**
```python
from aeon.clustering import TimeSeriesKMeans
clusterer = TimeSeriesKMeans(
    n_clusters=3,
    distance="dtw",
    averaging_method="ba"
)
labels = clusterer.fit_predict(X_train)
centers = clusterer.cluster_centers_
```
### 4. Forecasting
Predict future time series values. See `references/forecasting.md` for forecasters.
**Quick Start:**
```python
from aeon.forecasting.arima import ARIMA
forecaster = ARIMA(order=(1, 1, 1))
forecaster.fit(y_train)
y_pred = forecaster.predict(fh=[1, 2, 3, 4, 5])
```
### 5. Anomaly Detection
Identify unusual patterns or outliers. See `references/anomaly_detection.md` for detectors.
**Quick Start:**
```python
import numpy as np
from aeon.anomaly_detection import STOMP
# y is assumed to be a 1D NumPy array holding a single series
detector = STOMP(window_size=50)
anomaly_scores = detector.fit_predict(y)
# Higher scores indicate anomalies; flag the top 5% of scores
threshold = np.percentile(anomaly_scores, 95)
anomalies = anomaly_scores > threshold
```
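Once scores are thresholded, it is often more useful to report contiguous anomalous regions than individual points. The following is a minimal sketch of that post-processing step. The scores and the fixed cut-off are made up for illustration, and `flagged_intervals` is a hypothetical helper, not part of aeon.

```python
def flagged_intervals(flags):
    """Group consecutive True positions into (start, end) pairs, end inclusive."""
    intervals = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i  # a new anomalous run begins
        elif not flag and start is not None:
            intervals.append((start, i - 1))  # the run ended at the previous index
            start = None
    if start is not None:
        intervals.append((start, len(flags) - 1))  # run reaches the end of the series
    return intervals

# Illustrative per-point scores; in practice these would come from a detector.
scores = [0.1, 0.2, 0.9, 0.95, 0.2, 0.1, 0.8, 0.1]
threshold = 0.5  # illustrative fixed cut-off instead of a percentile
flags = [s > threshold for s in scores]
intervals = flagged_intervals(flags)
print(intervals)  # [(2, 3), (6, 6)]
```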
### 6. Segmentation
Partition time series into regions with change points. See `references/segmentation.md`.
**Quick Start:**
```python
from aeon.segmentation import ClaSPSegmenter
segmenter = ClaSPSegmenter()
change_points = segmenter.fit_predict(y)
```
### 7. Similarity Search
Find similar patterns within or across time series. See `references/similarity_search.md`.
**Quick Start:**
```python
from aeon.similarity_search import StompMotif
# Find recurring patterns
motif_finder = StompMotif(window_size=50, k=3)
motifs = motif_finder.fit_predict(y)
```
## Feature Extraction and Transformations
Transform time series for feature engineering. See `references/transformations.md`.
**ROCKET Features:**
```python
# The ROCKET transformer class in this module is named Rocket
from aeon.transformations.collection.convolution_based import Rocket
rocket = Rocket()
X_features = rocket.fit_transform(X_train)
# Use features with any sklearn classifier
from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier()
clf.fit(X_features, y_train)
```
**Statistical Features:**
```python
from aeon.transformations.collection.feature_based import Catch22
catch22 = Catch22()
X_features = catch22.fit_transform(X_train)
```
**Preprocessing:**
```python
from aeon.transformations.collection import Normalizer
scaler = Normalizer()  # Z-normalization
X_normalized = scaler.fit_transform(X_train)
```
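For reference, z-normalization rescales a series to zero mean and unit standard deviation. The sketch below shows the arithmetic on a single series, assuming the population standard deviation and mapping constant series to zeros. `z_normalize` is a hypothetical helper for illustration, not aeon's implementation.

```python
import statistics

def z_normalize(series):
    """Rescale one series to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    if std == 0:
        # A constant series has no spread; return zeros rather than divide by zero.
        return [0.0 for _ in series]
    return [(value - mean) / std for value in series]

series = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # mean 5, population std 2
normalized = z_normalize(series)
print(normalized)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```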
## Distance Metrics
Specialized temporal distance measures. See `references/distances.md` for the complete catalog.
**Usage:**
```python
from aeon.distances import dtw_distance, dtw_pairwise_distance
# Single distance
distance = dtw_distance(x, y, window=0.1)
# Pairwise distances
distance_matrix = dtw_pairwise_distance(X_train)
# Use with classifiers
from aeon.classification.distance_based import KNeighborsTimeSeriesClassifier
clf = KNeighborsTimeSeriesClassifier(
    n_neighbors=5,
    distance="dtw",
    distance_params={"window": 0.2}
)
```
**Available Distances:**
- **Elastic**: DTW, DDTW, WDTW, ERP, EDR, LCSS, TWE, MSM
- **Lock-step**: Euclidean, Manhattan, Minkowski
- **Shape-based**: Shape DTW, SBD
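To make the elastic versus lock-step distinction concrete, here is a minimal pure-Python sketch of unconstrained DTW with a squared pointwise cost, compared against a squared Euclidean distance on two series that contain the same bump shifted by one step. The helper names `squared_euclidean` and `dtw_sketch` are hypothetical, and aeon's own `dtw_distance` may differ in cost function and windowing details.

```python
import math

def squared_euclidean(a, b):
    """Lock-step distance: a[i] is only ever compared with b[i]."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def dtw_sketch(a, b):
    """Elastic distance: unconstrained DTW with squared pointwise cost."""
    n, m = len(a), len(b)
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of: step in a only, step in b only, step in both
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

a = [0, 0, 1, 2, 1, 0, 0]
b = [0, 1, 2, 1, 0, 0, 0]  # same bump, one step earlier

lockstep = squared_euclidean(a, b)
elastic = dtw_sketch(a, b)
print(lockstep)  # 4
print(elastic)   # 0.0
```

The lock-step distance penalizes the shift, while the warping path absorbs it entirely. That is the behavior that makes elastic distances attractive when patterns are misaligned in time.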
## Deep Learning Networks
Neural architectures for time series. See `references/networks.md`.
**Architectures:**
- Convolutional: `FCNClassifier`, `ResNetClassifier`, `InceptionTimeClassifier`
- Recurrent: `RecurrentNetwork`, `TCNNetwork`
- Autoencoders: `AEFCNClusterer`, `AEResNetClusterer`
**Usage:**
```python
from aeon.classification.deep_learning import InceptionTimeClassifier
clf = InceptionTimeClassifier(n_epochs=100, batch_size=32)
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)
```
## Datasets and Benchmarking
Load standard benchmarks and evaluate performance. See `references/datasets_benchmarking.md`.
**Load Datasets:**
```python
from aeon.datasets import load_classification, load_regression
# Classification
X_train, y_train = load_classification("ArrowHead", split="train")
# Regression
X_train, y_train = load_regression("Covid3Month", split="train")
```
**Benchmarking:**
```python
from aeon.benchmarking import get_estimator_results
# Compare with published results
published = get_estimator_results("ROCKET", "GunPoint")
```
## Common Workflows
### Classification Pipeline
```python
from aeon.transformations.collection import Normalizer
from aeon.classification.convolution_based import RocketClassifier
from sklearn.pipeline import Pipeline
pipeline = Pipeline([
    ('normalize', Normalizer()),
    ('classify', RocketClassifier())
])
pipeline.fit(X_train, y_train)
accuracy = pipeline.score(X_test, y_test)
```
### Feature Extraction + Traditional ML
```python
from aeon.transformations.collection.convolution_based import Rocket
from sklearn.ensemble import GradientBoostingClassifier
# Extract features
rocket = Rocket()
X_train_features = rocket.fit_transform(X_train)
X_test_features = rocket.transform(X_test)
# Train traditional ML
clf = GradientBoostingClassifier()
clf.fit(X_train_features, y_train)
predictions = clf.predict(X_test_features)
```
### Anomaly Detection with Visualization
```python
import matplotlib.pyplot as plt
import numpy as np
from aeon.anomaly_detection import STOMP
# y is assumed to be a 1D NumPy array holding a single series
detector = STOMP(window_size=50)
scores = detector.fit_predict(y)
plt.figure(figsize=(15, 5))
plt.subplot(2, 1, 1)
plt.plot(y, label='Time Series')
plt.legend()
plt.subplot(2, 1, 2)
plt.plot(scores, label='Anomaly Scores', color='red')
plt.axhline(np.percentile(scores, 95), color='k', linestyle='--', label='95th percentile')
plt.legend()
plt.show()
```
## Best Practices
### Data Preparation
1. **Normalize**: Most algorithms benefit from z-normalization
   ```python
   from aeon.transformations.collection import Normalizer
   normalizer = Normalizer()
   X_train = normalizer.fit_transform(X_train)
   X_test = normalizer.transform(X_test)
   ```
2. **Handle Missing Values**: Impute before analysis
   ```python
   from aeon.transformations.collection import SimpleImputer
   imputer = SimpleImputer(strategy='mean')
   X_train = imputer.fit_transform(X_train)
   ```
3. **Check Data Format**: Aeon expects shape `(n_samples, n_channels, n_timepoints)`
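Data often arrives in a different layout than the 3D shape above. The following NumPy sketch shows two common conversions, assuming the input arrays are laid out as described in the comments. The array names and sizes are illustrative only.

```python
import numpy as np

# Hypothetical input: 4 univariate series of 6 time points, stored as a 2D array.
X_2d = np.arange(24, dtype=float).reshape(4, 6)

# Insert a channel axis: (n_samples, n_timepoints) -> (n_samples, 1, n_timepoints)
X_3d = X_2d[:, np.newaxis, :]
print(X_3d.shape)  # (4, 1, 6)

# Hypothetical multivariate input stored time-major: (n_samples, n_timepoints, n_channels)
X_time_major = np.zeros((4, 6, 3))

# Swap the last two axes to get (n_samples, n_channels, n_timepoints)
X_channel_major = np.swapaxes(X_time_major, 1, 2)
print(X_channel_major.shape)  # (4, 3, 6)
```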
### Model Selection
1. **Start Simple**: Begin with ROCKET variants before deep learning
2. **Use Validation**: Split training data for hyperparameter tuning
3. **Compare Baselines**: Test against simple methods (1-NN Euclidean, Naive)
4. **Consider Resources**: ROCKET for speed, deep learning if GPU available
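As an illustration of the baseline idea, here is a minimal pure-Python 1-NN classifier with squared Euclidean distance, evaluated on a tiny made-up dataset. The helper names and data are hypothetical. In practice the same comparison would be run with a library classifier on real train and test splits.

```python
def euclidean_sq(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def one_nn_predict(train_X, train_y, query):
    """Return the label of the training series closest to the query."""
    best = min(range(len(train_X)), key=lambda i: euclidean_sq(train_X[i], query))
    return train_y[best]

# Made-up toy data: two "low" series and two "high" series.
train_X = [[0, 0, 1, 0], [0, 1, 1, 0], [5, 5, 4, 5], [5, 4, 4, 5]]
train_y = ["low", "low", "high", "high"]
test_X = [[0, 0, 0, 1], [4, 5, 5, 5]]
test_y = ["low", "high"]

predictions = [one_nn_predict(train_X, train_y, q) for q in test_X]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(predictions)  # ['low', 'high']
print(accuracy)     # 1.0
```

If a more complex model cannot beat a baseline like this on held-out data, the extra training cost is hard to justify.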
### Algorithm Selection Guide
**For Fast Prototyping:**
- Classification: `MiniRocketClassifier`
- Regression: `MiniRocketRegressor`
- Clustering: `TimeSeriesKMeans` with Euclidean
**For Maximum Accuracy:**
- Classification: `HIVECOTEV2`, `InceptionTimeClassifier`
- Regression: `InceptionTimeRegressor`
- Forecasting: `ARIMA`, `TCNForecaster`
**For Interpretability:**
- Classification: `ShapeletTransformClassifier`, `Catch22Classifier`
- Features: `Catch22`, `TSFresh`
**For Small Datasets:**
- Distance-based: `KNeighborsTimeSeriesClassifier` with DTW
- Avoid: Deep learning (typically needs large training sets)
## Reference Documentation
Detailed information is available in `references/`:
- `classification.md` - All classification algorithms
- `regression.md` - Regression methods
- `clustering.md` - Clustering algorithms
- `forecasting.md` - Forecasting approaches
- `anomaly_detection.md` - Anomaly detection methods
- `segmentation.md` - Segmentation algorithms
- `similarity_search.md` - Pattern matching and motif discovery
- `transformations.md` - Feature extraction and preprocessing
- `distances.md` - Time series distance metrics
- `networks.md` - Deep learning architectures
- `datasets_benchmarking.md` - Data loading and evaluation tools
## Additional Resources
- Documentation: https://www.aeon-toolkit.org/
- GitHub: https://github.com/aeon-toolkit/aeon
- Examples: https://www.aeon-toolkit.org/en/stable/examples.html
- API Reference: https://www.aeon-toolkit.org/en/stable/api_reference.html