Browse Skills — Page 28

21,718 public skills · showing 2,701–2,800

bio-codon-usage
GPTomics/bioSkills
Analyze codon usage, calculate CAI (Codon Adaptation Index), and examine synonymous codon bias using Biopython. Use when analyzing coding sequences for expression optimization or evolutionary analysis.
100/100
bio-comparative-genomics-ancestral-reconstruction
GPTomics/bioSkills
Reconstruct ancestral sequences at phylogenetic nodes using PAML and IQ-TREE marginal likelihood methods. Infer ancient protein sequences and trace evolutionary trajectories through sequence history. Use when inferring ancestral states for protein resurrection or tracing evolutionary history.
100/100
bio-comparative-genomics-hgt-detection
GPTomics/bioSkills
Detect horizontal gene transfer events using HGTector, compositional analysis, and phylogenetic incongruence methods. Identify foreign genes in bacterial and archaeal genomes from anomalous composition or unexpected phylogenetic placement. Use when searching for horizontally transferred genes or analyzing genome evolution in prokaryotes.
100/100
bio-comparative-genomics-ortholog-inference
GPTomics/bioSkills
Infer orthologous gene groups across species using OrthoFinder and ProteinOrtho. Identify orthologs, paralogs, and co-orthologs for comparative genomics and functional annotation transfer. Use when identifying gene orthologs across species or building orthogroups for evolutionary analysis.
100/100
bio-comparative-genomics-positive-selection
GPTomics/bioSkills
Detect positive selection using dN/dS (omega) tests with PAML codeml and HyPhy. Identify sites and branches under adaptive evolution through codon models and branch-site tests. Use when testing for adaptive evolution in gene families or identifying positively selected sites.
100/100
bio-comparative-genomics-synteny-analysis
GPTomics/bioSkills
Analyze genome collinearity and syntenic blocks using MCScanX, SyRI, and JCVI for comparative genomics. Detect conserved gene order, chromosomal rearrangements, and whole-genome duplications. Use when comparing genome structure between species or identifying conserved genomic regions.
100/100
bio-compressed-files
GPTomics/bioSkills
Read and write compressed sequence files (gzip, bzip2, BGZF) using Biopython. Use when working with .gz or .bz2 sequence files. Use BGZF for indexable compressed files.
100/100
bio-consensus-sequences
GPTomics/bioSkills
Generate consensus FASTA sequences by applying VCF variants to a reference using bcftools consensus. Use when creating sample-specific reference sequences or reconstructing haplotypes.
100/100
bio-copy-number-cnv-annotation
GPTomics/bioSkills
Annotate CNVs with genes, pathways, and clinical significance. Use when interpreting CNV calls or identifying affected genes from copy number analysis.
100/100
bio-copy-number-cnv-visualization
GPTomics/bioSkills
Visualize copy number profiles, segments, and compare across samples. Create publication-quality plots of CNV data from CNVkit, GATK, or other callers. Use when creating genome-wide CNV plots, sample heatmaps, or chromosome-level visualizations.
100/100
bio-copy-number-cnvkit-analysis
GPTomics/bioSkills
Detect copy number variants from targeted/exome sequencing using CNVkit. Supports tumor-normal pairs, tumor-only, and germline CNV calling. Use when detecting CNVs from WES or targeted panel sequencing data.
100/100
bio-copy-number-gatk-cnv
GPTomics/bioSkills
Call copy number variants using GATK best practices workflow. Supports both somatic (tumor-normal) and germline CNV detection from WGS or WES data. Use when following GATK best practices or integrating CNV calling with other GATK variant pipelines.
100/100
bio-crispr-screens-base-editing-analysis
GPTomics/bioSkills
Analyzes base editing and prime editing outcomes including editing efficiency, bystander edits, and indel frequencies. Use when quantifying CRISPR base editor results, comparing ABE vs CBE efficiency, or assessing prime editing fidelity.
100/100
bio-crispr-screens-batch-correction
GPTomics/bioSkills
Batch effect correction for CRISPR screens. Covers normalization across batches, technical replicate handling, and batch-aware analysis. Use when combining screens from multiple batches or correcting systematic technical variation.
100/100
bio-crispr-screens-crispresso-editing
GPTomics/bioSkills
CRISPResso2 for analyzing CRISPR gene editing outcomes. Quantifies indels, HDR efficiency, and generates comprehensive editing reports. Use when analyzing amplicon sequencing data from CRISPR editing experiments to assess editing efficiency.
100/100
bio-crispr-screens-hit-calling
GPTomics/bioSkills
Statistical methods for calling hits in CRISPR screens. Covers MAGeCK, BAGEL2, drugZ, and custom approaches for identifying essential and resistance genes. Use when identifying significant genes from screen count data after QC passes.
100/100
bio-crispr-screens-jacks-analysis
GPTomics/bioSkills
JACKS (Joint Analysis of CRISPR/Cas9 Knockout Screens) for modeling sgRNA efficacy and gene essentiality. Use when analyzing multiple CRISPR screens simultaneously or when accounting for variable sgRNA efficiency across experiments.
100/100
bio-crispr-screens-library-design
GPTomics/bioSkills
CRISPR library design for genetic screens. Covers sgRNA selection, library composition, control design, and oligo ordering. Use when designing custom sgRNA libraries for knockout, activation, or interference screens.
100/100
bio-crispr-screens-mageck-analysis
GPTomics/bioSkills
MAGeCK (Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout) for pooled CRISPR screen analysis. Covers count normalization, gene ranking, and pathway analysis. Use when identifying essential genes, drug targets, or resistance mechanisms from dropout or enrichment screens.
100/100
bio-crispr-screens-screen-qc
GPTomics/bioSkills
Quality control for pooled CRISPR screens. Covers library representation, read distribution, replicate correlation, and essential gene recovery. Use when assessing screen quality before hit calling or diagnosing poor screen performance.
100/100
bio-ctdna-mutation-detection
GPTomics/bioSkills
Detects somatic mutations in circulating tumor DNA using variant callers optimized for low allele fractions with UMI-based error suppression. Reliably detects mutations at VAF above 0.5 percent using consensus-based approaches. Use when identifying tumor mutations from plasma DNA or tracking specific variants.
100/100
bio-data-visualization-circos-plots
GPTomics/bioSkills
Create circular genome visualizations with Circos and pyCircos. Display multi-track data including ideograms, genes, variants, CNVs, and interaction arcs. Use when creating circular genome visualizations.
100/100
bio-data-visualization-color-palettes
GPTomics/bioSkills
Select and apply colorblind-friendly palettes for scientific figures using viridis, RColorBrewer, and custom color schemes. Use when selecting colorblind-friendly palettes for figures.
100/100
bio-data-visualization-genome-browser-tracks
GPTomics/bioSkills
Generate genome browser visualizations using pyGenomeTracks or IGV batch scripting for publication figures. Use when creating publication figures of genomic regions with multiple data tracks.
100/100
bio-data-visualization-genome-tracks
GPTomics/bioSkills
Create genome browser-style visualizations showing multiple data tracks (coverage, peaks, genes) using pyGenomeTracks, Gviz, and IGV. Use when visualizing genomic data at specific loci with multiple aligned tracks.
100/100
bio-data-visualization-ggplot2-fundamentals
GPTomics/bioSkills
Create publication-quality scientific figures with ggplot2 including scatter plots, boxplots, heatmaps, and multi-panel layouts. Use when creating static figures for papers, presentations, or reports in R.
100/100
bio-data-visualization-heatmaps-clustering
GPTomics/bioSkills
Create clustered heatmaps with row/column annotations using ComplexHeatmap, pheatmap, and seaborn for gene expression and omics data visualization. Use when visualizing expression patterns across samples or identifying co-expressed gene clusters.
100/100
bio-data-visualization-interactive-visualization
GPTomics/bioSkills
Create interactive HTML plots with plotly and bokeh for exploratory data analysis and web-based sharing of omics visualizations. Use when building zoomable, hoverable plots for data exploration or web dashboards.
100/100
bio-data-visualization-multipanel-figures
GPTomics/bioSkills
Combine multiple plots into publication-ready multi-panel figures using patchwork, cowplot, or matplotlib GridSpec with shared legends and panel labels. Use when combining multiple plots into publication figures.
100/100
bio-data-visualization-network-visualization
GPTomics/bioSkills
Visualize biological networks including gene regulatory networks, protein interaction networks, and co-expression modules using NetworkX, PyVis, and Cytoscape automation. Produces interactive and publication-quality network figures. Use when creating network diagrams from interaction data, GRN results, or co-expression modules.
100/100
bio-data-visualization-specialized-omics-plots
GPTomics/bioSkills
Reusable plotting functions for common omics visualizations. Custom ggplot2/matplotlib implementations of volcano, MA, PCA, enrichment dotplots, boxplots, and survival curves. Use when creating volcano, MA, or enrichment plots.
100/100
bio-data-visualization-upset-plots
GPTomics/bioSkills
Create UpSet plots to visualize set intersections as an alternative to Venn diagrams using UpSetR or upsetplot. Use when comparing overlapping gene sets, peak sets, or sample groups with more than 3 sets.
100/100
bio-data-visualization-volcano-customization
GPTomics/bioSkills
Create publication-ready volcano plots with custom thresholds, gene labels, and highlighting using ggplot2, EnhancedVolcano, or matplotlib. Use when visualizing differential expression or association results with gene annotations.
100/100
bio-de-deseq2-basics
GPTomics/bioSkills
Perform differential expression analysis using DESeq2 in R/Bioconductor. Use for analyzing RNA-seq count data, creating DESeqDataSet objects, running the DESeq workflow, and extracting results with log fold change shrinkage. Use when performing DE analysis with DESeq2.
100/100
bio-de-edger-basics
GPTomics/bioSkills
Perform differential expression analysis using edgeR in R/Bioconductor. Use for analyzing RNA-seq count data with the quasi-likelihood F-test framework, creating DGEList objects, normalization, dispersion estimation, and statistical testing. Use when performing DE analysis with edgeR.
100/100
bio-de-results
GPTomics/bioSkills
Extract, filter, annotate, and export differential expression results from DESeq2 or edgeR. Use for identifying significant genes, applying multiple testing corrections, adding gene annotations, and preparing results for downstream analysis. Use when filtering and exporting DE analysis results.
100/100
bio-de-visualization
GPTomics/bioSkills
Visualize differential expression results using DESeq2/edgeR built-in functions. Covers plotMA, plotDispEsts, plotCounts, plotBCV, sample distance heatmaps, and p-value histograms. Use when visualizing differential expression results.
100/100
bio-differential-expression-timeseries-de
GPTomics/bioSkills
Analyze time-series RNA-seq data using limma voom with splines, maSigPro, and ImpulseDE2. Identify genes with dynamic expression patterns. Use when analyzing time-series or longitudinal expression data.
100/100
bio-differential-splicing
GPTomics/bioSkills
Detects differential alternative splicing between conditions using rMATS-turbo (binomial LRT on junction counts), leafcutter (Dirichlet-multinomial GLM on intron clusters), MAJIQ V3 deltapsi/HET (Bayesian posterior on LSVs), SUPPA2 (empirical-null on TPM-derived PSI), or Shiba (junction-imbalance-corrected; described as state of the art at low coverage as of 2025). Reports FDR-corrected significance and delta PSI effect sizes. Tools differ in statistical model, annotation dependence, calibration regime, and replicate-count requirements. Use when comparing splicing patterns between treatment groups, tissues, or disease states.
100/100
bio-duplicate-handling
GPTomics/bioSkills
Mark and remove PCR/optical duplicates using samtools fixmate and markdup. Use when preparing alignments for variant calling or when duplicate reads would bias analysis.
100/100
bio-ecological-genomics-biodiversity-metrics
GPTomics/bioSkills
Calculates species richness, diversity, and turnover using the Hill number framework with iNEXT coverage-based rarefaction/extrapolation, asymptotic diversity estimation, and beta diversity partitioning (betapart turnover vs nestedness). Compares assemblages using coverage-standardized rather than size-standardized rarefaction. Use when quantifying biodiversity from species abundance or incidence data, comparing diversity across sites, or constructing rarefaction curves. Not for clinical 16S microbiome alpha/beta diversity (see microbiome/diversity-analysis).
100/100
bio-ecological-genomics-community-ecology
GPTomics/bioSkills
Analyzes community composition using constrained ordination (CCA, RDA, db-RDA), variance partitioning (varpart), indicator species analysis (indicspecies multipatt), and distance-based environmental gradient methods with vegan. Links species composition to environmental explanatory variables. Use when testing how environmental gradients structure species communities, identifying habitat indicator taxa, or partitioning explained variation among predictors. Not for basic unconstrained ordination and PERMANOVA (see microbiome/diversity-analysis).
100/100
bio-ecological-genomics-conservation-genetics
GPTomics/bioSkills
Assesses genetic health of populations for conservation using effective population size estimation (GONE2 for recent Ne trajectory, NeEstimator for contemporary Ne, Stairway Plot 2 and PSMC for historical Ne), F-statistics (hierfstat), runs of homozygosity (detectRUNS), and genetic diversity metrics. Use when estimating effective population size, detecting inbreeding or bottlenecks, or assessing genetic diversity in threatened species from microsatellite or SNP data.
100/100
bio-ecological-genomics-edna-metabarcoding
GPTomics/bioSkills
Processes environmental DNA metabarcoding data from raw amplicon reads to species occurrence tables using OBITools3, DADA2, and taxonomic assignment against BOLD, MIDORI2, or MitoFish databases. Handles COI, 12S, rbcL, and ITS barcode regions with primer removal, denoising, chimera detection, and contamination filtering via decontam. Includes occupancy modeling (occumb) for detection probability correction. Use when analyzing eDNA from water, soil, or bulk samples for biodiversity monitoring. Not for 16S human microbiome (see microbiome/amplicon-processing).
95/100
bio-ecological-genomics-landscape-genomics
GPTomics/bioSkills
Tests genotype-environment associations and identifies loci under local adaptation using LFMM2 (LEA), pcadapt outlier detection, OutFLANK Fst-based selection scans, and redundancy analysis. Detects adaptive genetic variation correlated with environmental variables while controlling for population structure. Use when identifying adaptive loci across environmental gradients, testing for signatures of local adaptation, or predicting genetic vulnerability to climate change with gradientForest.
100/100
bio-ecological-genomics-species-delimitation
GPTomics/bioSkills
Delimits species boundaries from molecular data using distance-based (ASAP), tree-based (bPTP, GMYC), and coalescent (BPP) methods. Compares multiple delimitation results with delimtools. Use when delineating putative species from DNA barcoding data, resolving cryptic species complexes, or validating taxonomic assignments. Emphasizes multi-method consensus following integrative taxonomy best practice.
100/100
bio-entrez-fetch
GPTomics/bioSkills
Retrieve records from NCBI databases using Biopython Bio.Entrez. Use when downloading sequences, fetching GenBank records, getting document summaries, or parsing NCBI data into Biopython objects.
100/100
bio-entrez-link
GPTomics/bioSkills
Find cross-references between NCBI databases using Biopython Bio.Entrez. Use when navigating from genes to proteins, sequences to publications, finding related records, or discovering database relationships.
100/100
bio-entrez-search
GPTomics/bioSkills
Search NCBI databases using Biopython Bio.Entrez. Use when finding records by keyword, building complex search queries, discovering database structure, or getting global query counts across databases.
100/100
bio-epidemiological-genomics-amr-surveillance
GPTomics/bioSkills
Detect and track antimicrobial resistance genes using AMRFinderPlus and ResFinder with epidemiological context. Monitor resistance trends and identify emerging resistance patterns. Use when screening genomes for AMR genes or tracking resistance in surveillance programs.
100/100
bio-epidemiological-genomics-pathogen-typing
GPTomics/bioSkills
Perform multi-locus sequence typing (MLST), core genome MLST, and SNP-based strain typing for bacterial isolate characterization using mlst and chewBBACA. Use when identifying strain types, tracking outbreak clones, or characterizing bacterial isolates.
100/100
bio-epidemiological-genomics-phylodynamics
GPTomics/bioSkills
Construct time-scaled phylogenies and infer evolutionary dynamics using TreeTime and BEAST2 for outbreak analysis. Estimate divergence times, molecular clock rates, and ancestral states. Use when dating outbreak origins, estimating transmission rates, or building time-calibrated trees.
100/100
bio-epidemiological-genomics-transmission-inference
GPTomics/bioSkills
Infer pathogen transmission networks and identify likely transmission pairs using TransPhylo and outbreak reconstruction algorithms. Estimate who-infected-whom from genomic and epidemiological data. Use when investigating outbreak transmission chains or identifying superspreaders.
100/100
bio-epidemiological-genomics-variant-surveillance
GPTomics/bioSkills
Assign pathogen lineages and track variants using Nextclade and pangolin for viral surveillance. Monitor variant prevalence and identify emerging variants of concern. Use when classifying viral sequences, tracking lineage dynamics, or monitoring for variants of concern.
100/100
bio-epitranscriptomics-m6a-differential
GPTomics/bioSkills
Identify differential m6A methylation between conditions from MeRIP-seq. Use when comparing epitranscriptomic changes between treatment groups or cell states.
100/100
bio-epitranscriptomics-m6a-peak-calling
GPTomics/bioSkills
Call m6A peaks from MeRIP-seq IP vs input comparisons. Use when identifying m6A modification sites from methylated RNA immunoprecipitation data.
100/100
bio-epitranscriptomics-m6anet-analysis
GPTomics/bioSkills
Detect m6A modifications from Oxford Nanopore direct RNA sequencing using m6Anet. Use when analyzing epitranscriptomic modifications from long-read RNA data without immunoprecipitation.
100/100
bio-epitranscriptomics-merip-preprocessing
GPTomics/bioSkills
Align and QC MeRIP-seq IP and input samples for m6A analysis. Use when preparing MeRIP-seq data for peak calling or differential methylation analysis.
100/100
bio-epitranscriptomics-modification-visualization
GPTomics/bioSkills
Create metagene plots and browser tracks for RNA modification data. Use when visualizing m6A distribution patterns around genomic features like stop codons.
100/100
bio-experimental-design-batch-design
GPTomics/bioSkills
Designs experiments to minimize and account for batch effects using balanced layouts and blocking strategies. Use when planning multi-batch experiments, assigning samples to sequencing lanes, or designing studies where technical variation could confound biological signals.
100/100
bio-experimental-design-multiple-testing
GPTomics/bioSkills
Applies multiple testing correction methods including FDR, Bonferroni, and q-value for genomics data. Use when filtering differential expression results, setting significance thresholds, or choosing between correction methods for different study designs.
100/100
bio-experimental-design-power-analysis
GPTomics/bioSkills
Calculates statistical power and minimum sample sizes for RNA-seq, ATAC-seq, and other sequencing experiments. Use when planning experiments, determining how many replicates are needed, or assessing whether a study is adequately powered to detect expected effect sizes.
100/100
bio-experimental-design-sample-size
GPTomics/bioSkills
Estimates required sample sizes for differential expression, ChIP-seq, methylation, and proteomics studies. Use when budgeting experiments, writing grant proposals, or determining minimum replicates needed to achieve statistical significance for expected effect sizes.
100/100
bio-expression-matrix-counts-ingest
GPTomics/bioSkills
Load gene expression count matrices from various formats including CSV, TSV, featureCounts, Salmon, kallisto, and 10X. Use when importing quantification results for downstream analysis.
100/100
bio-expression-matrix-gene-id-mapping
GPTomics/bioSkills
Convert between gene identifier systems including Ensembl, Entrez, HGNC symbols, and UniProt. Use when mapping IDs for pathway analysis or matching different data sources.
100/100
bio-expression-matrix-metadata-joins
GPTomics/bioSkills
Merge sample metadata with count matrices and add gene annotations. Use when preparing data for differential expression analysis or visualization.
100/100
bio-expression-matrix-normalization
GPTomics/bioSkills
Normalize and transform RNA-seq count matrices for differential expression, visualization, and clustering. Covers between-sample (TMM, RLE, upper quartile), within-sample (TPM, FPKM), variance-stabilizing (VST, rlog), and single-cell (scran) methods. Use when choosing or applying normalization to expression data.
100/100
bio-expression-matrix-sparse-handling
GPTomics/bioSkills
Work with sparse matrices for memory-efficient storage of count data. Use when dealing with single-cell data or large bulk RNA-seq datasets where most values are zero.
100/100
bio-fastq-quality
GPTomics/bioSkills
Work with FASTQ quality scores using Biopython. Use when analyzing read quality, filtering by quality, trimming low-quality bases, or generating quality reports.
100/100
bio-filter-sequences
GPTomics/bioSkills
Filter and select sequences by criteria (length, ID, GC content, patterns) using Biopython. Use when subsetting sequences, removing unwanted records, or selecting by specific criteria.
100/100
bio-flow-cytometry-bead-normalization
GPTomics/bioSkills
Bead-based normalization for CyTOF and high-parameter flow cytometry. Covers EQ bead normalization, signal drift correction, and batch normalization. Use when correcting instrument drift in CyTOF or harmonizing data across batches.
100/100
bio-flow-cytometry-clustering-phenotyping
GPTomics/bioSkills
Unsupervised clustering and cell type identification for flow/mass cytometry. Covers FlowSOM, Phenograph, and CATALYST workflows. Use when discovering cell populations in high-dimensional cytometry data without predefined gates.
100/100
bio-flow-cytometry-compensation-transformation
GPTomics/bioSkills
Spillover compensation and data transformation for flow cytometry. Covers compensation matrix calculation, application, and biexponential/arcsinh transforms. Use when correcting spectral overlap between fluorophores or transforming data for analysis.
100/100
bio-flow-cytometry-cytometry-qc
GPTomics/bioSkills
Comprehensive quality control for flow cytometry and CyTOF data. Covers flow rate stability, signal drift, margin events, dead cell exclusion, and batch QC. Use when assessing acquisition quality or identifying problematic samples before analysis.
100/100
bio-flow-cytometry-differential-analysis
GPTomics/bioSkills
Differential abundance and state analysis for cytometry data. Compare cell populations between conditions using statistical methods. Use when testing for significant changes in cell frequencies or marker expression between groups.
100/100
bio-flow-cytometry-doublet-detection
GPTomics/bioSkills
Detect and remove doublets from flow and mass cytometry data. Covers FSC/SSC gating and computational doublet detection methods. Use when filtering out cell aggregates before clustering or quantitative analysis.
100/100
bio-flow-cytometry-fcs-handling
GPTomics/bioSkills
Read and manipulate Flow Cytometry Standard (FCS) files. Covers loading data, accessing parameters, and basic data exploration. Use when loading and inspecting flow or mass cytometry data before preprocessing.
100/100
bio-flow-cytometry-gating-analysis
GPTomics/bioSkills
Manual and automated gating for defining cell populations in flow cytometry. Covers rectangular, polygon, and data-driven gates. Use when identifying cell populations through hierarchical gating strategies.
100/100
bio-format-conversion
GPTomics/bioSkills
Convert between sequence file formats (FASTA, FASTQ, GenBank, EMBL) using Biopython Bio.SeqIO. Use when changing file formats or preparing data for different tools.
100/100
bio-fragment-analysis
GPTomics/bioSkills
Analyzes cfDNA fragment size distributions and fragmentomics features using FinaleToolkit or Griffin. Extracts nucleosome positioning patterns, fragment ratios, and DELFI-style fragmentation profiles for cancer detection. Use when leveraging fragment patterns for tumor detection or tissue-of-origin analysis.
100/100
bio-gatk-variant-calling
GPTomics/bioSkills
Variant calling with GATK HaplotypeCaller following best practices. Covers germline SNP/indel calling, GVCF workflow for cohorts, joint genotyping, and variant quality score recalibration (VQSR). Use when calling variants with GATK HaplotypeCaller.
100/100
bio-gene-regulatory-networks-coexpression-networks
GPTomics/bioSkills
Build weighted gene co-expression networks to identify modules of co-regulated genes and relate them to phenotypes using WGCNA and CEMiTool. Detects hub genes and module-trait relationships from bulk or single-cell expression data. Use when finding co-expression modules, identifying hub genes, or relating gene networks to clinical or experimental variables.
100/100
bio-gene-regulatory-networks-differential-networks
GPTomics/bioSkills
Compare gene regulatory and co-expression networks between biological conditions to identify rewired regulatory relationships using DiffCorr. Detects gained, lost, and reversed gene-gene correlations between conditions. Use when comparing co-expression networks between disease vs control, treatment conditions, or developmental stages.
100/100
bio-gene-regulatory-networks-multiomics-grn
GPTomics/bioSkills
Build enhancer-driven gene regulatory networks by integrating single-cell RNA-seq and ATAC-seq data using SCENIC+ to identify eRegulons linking transcription factors to enhancers and target genes. Use when analyzing 10x multiome or paired scRNA+scATAC data to infer cis-regulatory GRNs.
100/100
bio-gene-regulatory-networks-perturbation-simulation
GPTomics/bioSkills
Simulate transcription factor perturbation effects on cell state using CellOracle, which integrates GRN inference with in silico knockout and overexpression modeling. Predicts cell identity shifts and differentiation trajectory changes from TF perturbations. Use when predicting the effect of transcription factor knockouts, planning perturbation experiments, or identifying driver TFs for cell fate transitions.
100/100
bio-gene-regulatory-networks-scenic-regulons
GPTomics/bioSkills
Infer gene regulatory networks and identify transcription factor regulons from single-cell RNA-seq data using pySCENIC. Discovers co-expression modules with GRNBoost2, prunes by cis-regulatory motif enrichment, and scores regulon activity per cell with AUCell. Use when identifying transcription factor regulons, scoring TF activity in single cells, or finding master regulators of cell identity.
90/100
bio-genome-annotation-annotation-transfer
GPTomics/bioSkills
Transfer gene annotations between genome assemblies using Liftoff for same-species annotation liftover and miniprot for cross-species protein-to-genome alignment. Enables rapid annotation of new assemblies using existing reference annotations. Use when annotating a new assembly of a species with an existing reference annotation or mapping annotations across related species.
100/100
bio-genome-annotation-eukaryotic-gene-prediction
GPTomics/bioSkills
Predict protein-coding genes in eukaryotic genomes using BRAKER3 for combined RNA-seq and protein evidence, or GALBA for protein-only evidence. Runs Augustus with trained parameters for accurate gene models. Use when annotating a newly assembled eukaryotic genome or improving existing gene predictions.
95/100
bio-genome-annotation-functional-annotation
GPTomics/bioSkills
Assign GO terms, KEGG orthologs, Pfam domains, and EC numbers to predicted proteins using eggNOG-mapper and InterProScan. Produces functional summaries for downstream pathway and enrichment analysis. Use when adding functional annotation to predicted genes or characterizing protein functions in a new genome.
100/100
bio-genome-annotation-ncrna-annotation
GPTomics/bioSkills
Identify non-coding RNAs including tRNAs, rRNAs, snoRNAs, and regulatory RNAs using Infernal covariance model searches against Rfam and tRNAscan-SE for tRNA prediction. Use when performing genome-wide ncRNA annotation with assembly input producing GFF output.
90/100
bio-genome-annotation-prokaryotic-annotation
GPTomics/bioSkills
Annotate bacterial and archaeal genomes with Bakta for comprehensive structural and functional annotation, or Prokka for lightweight annotation. Generates GFF3, GenBank, and FASTA outputs with NCBI-compatible locus tags. Use when annotating a newly assembled prokaryotic genome or preparing annotations for NCBI submission.
100/100
bio-genome-annotation-repeat-annotation
GPTomics/bioSkills
Identify and classify repetitive elements and transposable elements using RepeatModeler for de novo repeat library construction and RepeatMasker for genome-wide repeat annotation. Quantify TE expression from RNA-seq with TEtranscripts. Use when masking repeats before gene prediction or analyzing transposable element activity.
100/100
bio-genome-assembly-assembly-polishing
GPTomics/bioSkills
Polish genome assemblies to reduce errors using short reads (Pilon), long reads (Racon), or ONT-specific tools (medaka). Essential for improving long-read assembly accuracy. Use when improving assembly accuracy with polishing tools.
100/100
bio-genome-assembly-assembly-qc
GPTomics/bioSkills
Assess genome assembly quality using QUAST for contiguity metrics and BUSCO for completeness. Essential for evaluating assembly success and comparing assemblers. Use when evaluating assembly completeness and quality.
100/100
bio-genome-assembly-contamination-detection
GPTomics/bioSkills
Detect contamination and assess genome quality using CheckM, CheckM2, GTDB-Tk, and GUNC for metagenome-assembled genomes and isolate assemblies. Use when checking assemblies for contamination.
100/100
bio-genome-assembly-hifi-assembly
GPTomics/bioSkills
High-quality genome assembly from PacBio HiFi reads using hifiasm with phasing support. Use when building reference-quality diploid assemblies from HiFi data, especially with trio or Hi-C phasing for fully resolved haplotypes.
100/100
bio-genome-assembly-long-read-assembly
GPTomics/bioSkills
De novo genome assembly from Oxford Nanopore or PacBio long reads using Flye and Canu. Produces highly contiguous assemblies suitable for complete bacterial genomes and resolving complex regions. Use when assembling genomes from ONT or PacBio reads.
100/100
bio-genome-assembly-metagenome-assembly
GPTomics/bioSkills
Metagenome assembly using metaFlye for long reads and metaSPAdes for short reads, with binning strategies. Use when reconstructing genomes from microbial communities, recovering metagenome-assembled genomes (MAGs), or resolving strain-level variation in complex samples.
100/100
bio-genome-assembly-scaffolding
GPTomics/bioSkills
Scaffold contigs into chromosome-level assemblies using Hi-C data with YaHS, 3D-DNA, and SALSA2, then validate the result with BUSCO and contact maps. Use when scaffolding contigs to chromosome-level assemblies.
100/100
bio-genome-assembly-short-read-assembly
GPTomics/bioSkills
De novo genome assembly from Illumina short reads using SPAdes. Covers bacterial, fungal, and small eukaryotic genome assembly, as well as metagenome and transcriptome assembly modes. Use when assembling genomes from Illumina reads.
100/100
