bio-temporal-genomics-circadian-rhythms
$
npx mdskill add GPTomics/bioSkills/bio-temporal-genomics-circadian-rhythmsDetects circadian and ultradian rhythms in time-series omics data using statistical models
- Identifies rhythmic patterns in circadian, feeding-fasting, or light-dark cycle experiments
- Uses CosinorPy, MetaCycle (JTK_CYCLE, ARSER), and RAIN for rhythmicity analysis
- Tests significance of oscillations at pre-specified periods like 24 hours
- Returns phase, amplitude, and rhythmicity metrics for each gene or feature
SKILL.md
.github/skills/bio-temporal-genomics-circadian-rhythmsView on GitHub ↗
---
name: bio-temporal-genomics-circadian-rhythms
description: Detects circadian and ultradian rhythms in time-series omics data using CosinorPy cosinor models, MetaCycle (JTK_CYCLE, ARSER), and RAIN non-parametric tests. Fits cosine models to estimate phase and amplitude, tests rhythmicity significance at pre-specified periods. Use when testing for 24-hour or other known-period oscillations in circadian, feeding-fasting, or light-dark cycle experiments. Not for unknown-period discovery (see temporal-genomics/periodicity-detection).
tool_type: mixed
primary_tool: CosinorPy
---
## Version Compatibility
Reference examples tested with: R stats (base), pandas 2.2+, statsmodels 0.14+
Before using code patterns, verify installed versions match. If versions differ:
- Python: `pip show <package>` then `help(module.function)` to check signatures
- R: `packageVersion('<pkg>')` then `?function_name` to verify parameters
If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
# Circadian Rhythm Detection
**"Test which genes in my time-course data have circadian rhythms"** → Fit cosinor models at a specified period (typically 24h) to expression time series, estimating amplitude, phase (acrophase), and rhythmicity significance for each gene.
- Python: `CosinorPy.cosinor.fit_group()` for cosinor regression
- R: `MetaCycle::meta2d()` for multi-method rhythmicity testing (JTK_CYCLE + ARSER)
Identifies periodic gene expression patterns at known periods (typically 24h) using cosinor regression, non-parametric rhythmicity tests, and meta-analysis approaches combining multiple methods.
## Core Workflow
1. Prepare time-series expression matrix (genes x timepoints)
2. Fit cosinor models or apply rhythmicity tests at specified period
3. Extract rhythm parameters: amplitude, phase (acrophase), p-value
4. Correct for multiple testing (BH FDR)
5. Filter significant rhythmic genes and characterize phase distribution
## CosinorPy (Python)
**Goal:** Test for circadian rhythmicity in time-series expression data by fitting cosinor models at a known period (typically 24h) and estimating amplitude, phase, and significance.
**Approach:** Fit single- or multi-component cosine curves to each gene's expression profile using CosinorPy, apply batch processing across all genes, and correct p-values with BH FDR to identify significant oscillators.
### Single-Component Cosinor
Fits `y = M + A*cos(2*pi*t/T + phi)` where M = MESOR, A = amplitude, phi = acrophase.
```python
import pandas as pd
from cosinorpy import file_parser, cosinor, cosinor1
df = file_parser.read_csv('expression_timecourse.csv')
# Single-component cosinor fit for one gene
# period=24: standard circadian period in hours
# fit_me takes X (time) and Y (expression) arrays, returns a tuple
gene_data = df[df['test'] == 'test_gene']
res = cosinor.fit_me(gene_data['x'].values, gene_data['y'].values, period=24, n_components=1)
```
### Multi-Component Cosinor
Fits harmonics to capture non-sinusoidal waveforms (e.g., sharp peaks).
```python
# n_components=2: adds 12h harmonic to capture asymmetric waveforms
res_multi = cosinor.fit_me(gene_data['x'].values, gene_data['y'].values, period=24, n_components=2)
# n_components=3: adds 8h harmonic; rarely needed unless waveform is highly complex
res_3 = cosinor.fit_me(gene_data['x'].values, gene_data['y'].values, period=24, n_components=3)
```
### Population-Mean Cosinor
Combines individual fits across biological replicates or subjects.
```python
# Population-mean cosinor across multiple subjects
# Estimates group-level amplitude/phase with confidence intervals
pop_results = cosinor1.population_fit_cosinor(
df, period=24, save_to='results/'
)
```
### Batch Processing
```python
# Genome-wide cosinor analysis via fit_group
# fit_group expects long-format DataFrame with columns: 'x' (time), 'y' (expression), 'test' (gene name)
results_df = cosinor.fit_group(df, period=24)
# BH correction for multiple testing (fit_group output has 'p' column; no built-in q-values)
from statsmodels.stats.multitest import multipletests
reject, qvals, _, _ = multipletests(results_df['p'], method='fdr_bh')
results_df['q_value'] = qvals
# q < 0.05: standard FDR threshold for rhythmicity
rhythmic = results_df[results_df['q_value'] < 0.05]
```
## MetaCycle (R)
Integrates JTK_CYCLE, ARSER, and Lomb-Scargle into a single meta-analysis framework.
```r
library(MetaCycle)
# meta2d integrates results from multiple methods
# cycMethod='JTK': JTK_CYCLE is the most widely used; robust and non-parametric
# minper=20, maxper=28: search window around 24h; allows detection of near-circadian periods
# timepoints: actual sampling times in hours (must match column order)
meta2d(
infile = 'expression_matrix.csv',
filestyle = 'csv',
outdir = 'metacycle_results/',
timepoints = seq(0, 44, by = 4),
cycMethod = c('JTK', 'ARS', 'LS'),
minper = 20,
maxper = 28,
outputFile = TRUE,
outRawData = FALSE
)
# JTK_CYCLE alone (faster, suitable for evenly sampled data)
meta2d(
infile = 'expression_matrix.csv',
filestyle = 'csv',
outdir = 'jtk_results/',
timepoints = seq(0, 44, by = 4),
cycMethod = 'JTK',
minper = 24,
maxper = 24
)
```
### MetaCycle Output Interpretation (R stats (base)+)
```r
results <- read.csv('metacycle_results/meta2d_expression_matrix.csv')
# meta2d_pvalue: Fisher integration of per-method p-values
# meta2d_BH.Q: BH-corrected q-value across all genes
# meta2d_period: estimated period (hours)
# meta2d_phase: estimated peak time (hours from ZT0)
# meta2d_AMP: amplitude (half peak-to-trough distance)
# meta2d_rAMP: relative amplitude (AMP / baseline); robust across expression levels
# q < 0.05: standard FDR threshold
rhythmic <- results[results$meta2d_BH.Q < 0.05, ]
```
## RAIN (R/Bioconductor)
Non-parametric test that handles asymmetric waveforms (e.g., rapid induction, slow decay).
```r
library(rain)
# expression_mat: genes x timepoints matrix
# period=24: test for 24h rhythmicity
# deltat=4: sampling interval in hours
# nr.series=2: number of biological replicates per timepoint
# nr.series=2: samples per timepoint are grouped as replicates
results <- rain(
t(expression_mat),
period = 24,
deltat = 4,
nr.series = 2,
method = 'independent'
)
# Adjust for multiple testing
results$q_value <- p.adjust(results$pVal, method = 'BH')
# q < 0.05: standard FDR threshold for rhythmicity detection
rhythmic <- results[results$q_value < 0.05, ]
```
## DiscoRhythm (R/Bioconductor)
Comprehensive framework with both scripted and interactive (Shiny) interfaces.
```r
library(DiscoRhythm)
# Scripted analysis pipeline
se <- discoGetSimu(TRUE)
disco_results <- discoBatch(se, report = NULL, osc_period = 24)
```
## Parameter Guide
| Parameter | Typical Value | Rationale |
|-----------|---------------|-----------|
| Period | 24h | Standard circadian period; use 12h for ultradian |
| Sampling interval | 2-4h | Nyquist: must sample at least 2x per period (every 12h minimum) |
| Minimum cycles | >=2 complete cycles | One cycle cannot distinguish trend from oscillation |
| Minimum timepoints | >=6 per cycle | JTK_CYCLE requires >=6; more improves power |
| FDR threshold | q < 0.05 | Standard multiple testing correction |
| Amplitude cutoff | Context-dependent | Often top 25th percentile of amplitudes among significant genes |
| Relative amplitude | rAMP > 0.1 | Filters low-amplitude oscillations; 10% of baseline is minimal biological relevance |
## Method Selection
| Method | Best For | Limitations |
|--------|----------|-------------|
| Cosinor | Parametric estimation, sinusoidal waveforms | Assumes sinusoidal shape |
| JTK_CYCLE | Robust non-parametric, evenly sampled | Requires even sampling |
| ARSER | Handles non-sinusoidal, spectral decomposition | Slower, needs longer series |
| RAIN | Asymmetric waveforms | Less power for symmetric waves |
| Lomb-Scargle | Uneven sampling | See periodicity-detection for unknown periods |
## Related Skills
- periodicity-detection - Unknown-period discovery with Lomb-Scargle and wavelets
- temporal-clustering - Group rhythmic genes by phase
- differential-expression/timeseries-de - Temporal DE testing
- data-visualization/heatmaps-clustering - Circular phase heatmaps
More from GPTomics/bioSkills
- bio-admet-predictionPredicts ADMET properties using ADMETlab 3.0 API or DeepChem models. Estimates bioavailability, CYP inhibition, hERG liability, and 119 toxicity endpoints with uncertainty quantification. Filters for PAINS and other structural alerts. Use when filtering compounds for drug-likeness or prioritizing leads by predicted safety.
- bio-alignment-amplicon-clippingTrim PCR primers from aligned reads in amplicon-panel BAMs using samtools ampliconclip. Use when processing SARS-CoV-2 ARTIC, hereditary cancer panels, ctDNA hot-spot panels, or any amplicon assay where primer-derived bases would falsely confirm reference at primer footprints.
- bio-alignment-filteringFilter alignments by flags, mapping quality, and regions using samtools view and pysam. Use when extracting specific reads, removing low-quality alignments, or subsetting to target regions.
- bio-alignment-indexingCreate and use BAI/CSI indices for BAM/CRAM files using samtools and pysam. Use when enabling random access to alignment files or fetching specific genomic regions.
- bio-alignment-ioRead, write, and convert multiple sequence alignment files using Biopython Bio.AlignIO. Supports Clustal, PHYLIP, Stockholm, FASTA, Nexus, and other alignment formats for phylogenetics and conservation analysis. Use when reading, writing, or converting alignment file formats.
- bio-alignment-msa-parsingParse and analyze multiple sequence alignments using Biopython. Extract sequences, identify conserved regions, analyze gaps, work with annotations, and manipulate alignment data for downstream analysis. Use when parsing or manipulating multiple sequence alignments.
- bio-alignment-msa-statisticsCalculate alignment statistics including sequence identity, conservation scores, substitution matrices, and similarity metrics. Use when comparing alignment quality, measuring sequence divergence, and analyzing evolutionary patterns.
- bio-alignment-multiplePerform multiple sequence alignment using MAFFT, MUSCLE5, ClustalOmega, or T-Coffee. Guides tool and algorithm selection based on dataset size, sequence divergence, and downstream application. Use when aligning three or more homologous sequences for phylogenetics, conservation analysis, or evolutionary studies.
- bio-alignment-pairwisePerform pairwise sequence alignment using Biopython Bio.Align.PairwiseAligner. Use when comparing two sequences, finding optimal alignments, scoring similarity, and identifying local or global matches between DNA, RNA, or protein sequences.
- bio-alignment-sortingSort alignment files by coordinate or read name using samtools and pysam. Use when preparing BAM files for indexing, variant calling, or paired-end analysis.