bio-clinical-databases-pharmacogenomics
$
npx mdskill add GPTomics/bioSkills/bio-clinical-databases-pharmacogenomicsQuery PharmGKB and CPIC for drug-gene interactions and dosing guidelines.
- Enables agents to predict drug response from genetic variants.
- Integrates with PharmGKB REST API and CPIC guidelines.
- Executes HTTP requests to retrieve clinical annotations.
- Returns structured JSON data for pharmacogenomic analysis.
SKILL.md
.github/skills/bio-clinical-databases-pharmacogenomicsView on GitHub ↗
---
name: bio-clinical-databases-pharmacogenomics
description: Query PharmGKB and CPIC for drug-gene interactions, pharmacogenomic annotations, and dosing guidelines. Use when predicting drug response from genetic variants or implementing clinical pharmacogenomics.
tool_type: python
primary_tool: requests
---
## Version Compatibility
Reference examples tested with: pandas 2.2+
Before using code patterns, verify installed versions match. If versions differ:
- Python: `pip show <package>` then `help(module.function)` to check signatures
If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
# Pharmacogenomics
## PharmGKB REST API
**Goal:** Retrieve drug-gene clinical annotations and dosing guidelines from PharmGKB.
**Approach:** Query PharmGKB REST endpoints by gene symbol or drug name and parse JSON annotation records.
**"Find pharmacogenomic annotations for this gene"** → Query PharmGKB for clinical annotations linking genes to drug response.
- Python: `requests.get()` against PharmGKB API (requests)
### Query Drug-Gene Relationships
```python
import requests
def get_pharmgkb_annotations(gene_symbol):
'''Get PharmGKB clinical annotations for a gene'''
url = f'https://api.pharmgkb.org/v1/data/clinicalAnnotation'
params = {'view': 'base', 'location.genes.symbol': gene_symbol}
response = requests.get(url, params=params)
return response.json()['data']
annotations = get_pharmgkb_annotations('CYP2D6')
for ann in annotations[:5]:
print(f"{ann['location']['genes'][0]['symbol']}: {ann['chemicals'][0]['name']}")
```
### Query by Drug
```python
def get_drug_annotations(drug_name):
'''Get pharmacogenomic annotations for a drug'''
url = 'https://api.pharmgkb.org/v1/data/clinicalAnnotation'
params = {'view': 'base', 'chemicals.name': drug_name}
response = requests.get(url, params=params)
return response.json()['data']
warfarin_annotations = get_drug_annotations('warfarin')
```
### Get Dosing Guidelines
```python
def get_cpic_guidelines(gene_symbol):
'''Get CPIC dosing guidelines for a gene'''
url = 'https://api.pharmgkb.org/v1/data/guideline'
params = {'view': 'base', 'relatedGenes.symbol': gene_symbol, 'source': 'CPIC'}
response = requests.get(url, params=params)
return response.json()['data']
guidelines = get_cpic_guidelines('CYP2C19')
for g in guidelines:
print(f"{g['name']}: {g['chemicals'][0]['name']}")
```
## Star Allele Interpretation
**Goal:** Determine metabolizer phenotype from CYP star allele diplotypes using CPIC activity scores.
**Approach:** Sum per-allele activity scores and classify into PM/IM/NM/UM categories based on CPIC thresholds.
### CYP2D6 Metabolizer Status
```python
# CYP2D6 activity scores for common alleles
# Based on CPIC guidelines
CYP2D6_ACTIVITY = {
'*1': 1.0, # Normal function
'*2': 1.0, # Normal function
'*3': 0.0, # No function
'*4': 0.0, # No function
'*5': 0.0, # Gene deletion
'*6': 0.0, # No function
'*9': 0.5, # Decreased function
'*10': 0.25, # Decreased function (common in East Asian)
'*17': 0.5, # Decreased function
'*41': 0.5, # Decreased function
}
def calculate_activity_score(allele1, allele2):
'''Calculate CYP2D6 activity score from diplotype'''
score1 = CYP2D6_ACTIVITY.get(allele1, 1.0)
score2 = CYP2D6_ACTIVITY.get(allele2, 1.0)
return score1 + score2
def get_metabolizer_status(activity_score):
'''Convert activity score to metabolizer phenotype
CPIC thresholds:
- PM: 0
- IM: 0 < score <= 1.25
- NM: 1.25 < score <= 2.25
- UM: > 2.25 (gene duplications)
'''
if activity_score == 0:
return 'Poor Metabolizer (PM)'
elif activity_score <= 1.25:
return 'Intermediate Metabolizer (IM)'
elif activity_score <= 2.25:
return 'Normal Metabolizer (NM)'
else:
return 'Ultrarapid Metabolizer (UM)'
score = calculate_activity_score('*1', '*4')
status = get_metabolizer_status(score)
print(f'Activity score: {score}, Status: {status}')
```
### CYP2C19 Interpretation
```python
CYP2C19_ACTIVITY = {
'*1': 1.0, # Normal function
'*2': 0.0, # No function (most common loss-of-function)
'*3': 0.0, # No function
'*17': 1.5, # Increased function
}
def cyp2c19_phenotype(allele1, allele2):
'''Determine CYP2C19 metabolizer status'''
score = CYP2C19_ACTIVITY.get(allele1, 1.0) + CYP2C19_ACTIVITY.get(allele2, 1.0)
if score == 0:
return 'Poor Metabolizer'
elif score < 1.5:
return 'Intermediate Metabolizer'
elif score <= 2.0:
return 'Normal Metabolizer'
elif score <= 2.5:
return 'Rapid Metabolizer'
else:
return 'Ultrarapid Metabolizer'
```
## Drug Interaction Lookup
**Goal:** Check whether a specific drug-gene-variant combination has a known pharmacogenomic interaction.
**Approach:** Query PharmGKB variant annotation endpoint filtered by drug and gene, then match to the target variant.
```python
def check_pgx_interaction(drug, gene, variant):
'''Check for pharmacogenomic drug-gene-variant interaction'''
url = 'https://api.pharmgkb.org/v1/data/variantAnnotation'
params = {
'chemicals.name': drug,
'location.genes.symbol': gene
}
response = requests.get(url, params=params)
annotations = response.json().get('data', [])
for ann in annotations:
if variant in str(ann.get('variant', {}).get('name', '')):
return {
'drug': drug,
'gene': gene,
'variant': variant,
'phenotype': ann.get('phenotypes', []),
'evidence': ann.get('evidenceLevel')
}
return None
```
## Common PGx Gene-Drug Pairs
| Gene | Drugs | Clinical Impact |
|------|-------|-----------------|
| CYP2D6 | Codeine, tamoxifen, ondansetron | Efficacy, toxicity |
| CYP2C19 | Clopidogrel, omeprazole, escitalopram | Efficacy, dosing |
| CYP2C9 | Warfarin, phenytoin, NSAIDs | Bleeding risk, dosing |
| VKORC1 | Warfarin | Dosing |
| TPMT | Azathioprine, mercaptopurine | Myelosuppression |
| DPYD | Fluorouracil, capecitabine | Severe toxicity |
| HLA-B*57:01 | Abacavir | Hypersensitivity |
| HLA-B*15:02 | Carbamazepine | SJS/TEN |
| SLCO1B1 | Simvastatin | Myopathy risk |
| UGT1A1 | Irinotecan | Neutropenia |
## Batch PGx Annotation
**Goal:** Annotate a cohort of variants across multiple pharmacogenes with drug interaction data.
**Approach:** Iterate over a list of pharmacogenes, fetch PharmGKB annotations for each, and collect results into a DataFrame.
```python
import pandas as pd
def annotate_pgx_variants(vcf_variants, pgx_genes):
'''Annotate variants in pharmacogenes
Args:
vcf_variants: DataFrame with chrom, pos, ref, alt
pgx_genes: List of pharmacogenes to check
'''
results = []
for gene in pgx_genes:
annotations = get_pharmgkb_annotations(gene)
for ann in annotations:
results.append({
'gene': gene,
'drug': ann['chemicals'][0]['name'] if ann.get('chemicals') else None,
'phenotype': ann.get('phenotypes', []),
'level': ann.get('levelOfEvidence')
})
return pd.DataFrame(results)
pgx_genes = ['CYP2D6', 'CYP2C19', 'CYP2C9', 'VKORC1', 'TPMT']
pgx_df = annotate_pgx_variants(vcf_df, pgx_genes)
```
## Related Skills
- clinical-databases/clinvar-lookup - Pathogenicity classification
- variant-calling/clinical-interpretation - ACMG guidelines
- clinical-databases/variant-prioritization - Clinical filtering
More from GPTomics/bioSkills
- bio-admet-predictionPredicts ADMET properties using ADMETlab 3.0 API or DeepChem models. Estimates bioavailability, CYP inhibition, hERG liability, and 119 toxicity endpoints with uncertainty quantification. Filters for PAINS and other structural alerts. Use when filtering compounds for drug-likeness or prioritizing leads by predicted safety.
- bio-alignment-amplicon-clippingTrim PCR primers from aligned reads in amplicon-panel BAMs using samtools ampliconclip. Use when processing SARS-CoV-2 ARTIC, hereditary cancer panels, ctDNA hot-spot panels, or any amplicon assay where primer-derived bases would falsely confirm reference at primer footprints.
- bio-alignment-filteringFilter alignments by flags, mapping quality, and regions using samtools view and pysam. Use when extracting specific reads, removing low-quality alignments, or subsetting to target regions.
- bio-alignment-indexingCreate and use BAI/CSI indices for BAM/CRAM files using samtools and pysam. Use when enabling random access to alignment files or fetching specific genomic regions.
- bio-alignment-ioRead, write, and convert multiple sequence alignment files using Biopython Bio.AlignIO. Supports Clustal, PHYLIP, Stockholm, FASTA, Nexus, and other alignment formats for phylogenetics and conservation analysis. Use when reading, writing, or converting alignment file formats.
- bio-alignment-msa-parsingParse and analyze multiple sequence alignments using Biopython. Extract sequences, identify conserved regions, analyze gaps, work with annotations, and manipulate alignment data for downstream analysis. Use when parsing or manipulating multiple sequence alignments.
- bio-alignment-msa-statisticsCalculate alignment statistics including sequence identity, conservation scores, substitution matrices, and similarity metrics. Use when comparing alignment quality, measuring sequence divergence, and analyzing evolutionary patterns.
- bio-alignment-multiplePerform multiple sequence alignment using MAFFT, MUSCLE5, ClustalOmega, or T-Coffee. Guides tool and algorithm selection based on dataset size, sequence divergence, and downstream application. Use when aligning three or more homologous sequences for phylogenetics, conservation analysis, or evolutionary studies.
- bio-alignment-pairwisePerform pairwise sequence alignment using Biopython Bio.Align.PairwiseAligner. Use when comparing two sequences, finding optimal alignments, scoring similarity, and identifying local or global matches between DNA, RNA, or protein sequences.
- bio-alignment-sortingSort alignment files by coordinate or read name using samtools and pysam. Use when preparing BAM files for indexing, variant calling, or paired-end analysis.