variant-cross-database-ids
$ npx mdskill add InternScience/scp/variant-cross-database-ids
---
name: variant-cross-database-ids
description: "Query ClinGen Allele Registry to map variant rsID to identifiers in other databases (ClinVar, gnomAD, COSMIC, UniProtKB, OMIM, etc.)."
license: MIT license
metadata:
skill-author: PJLab
---
# ClinGen Allele Registry — Cross-Database ID Mapping
## Usage
### Tool Description
```tex
Query ClinGen Allele Registry by rsID to get cross-database identifiers.
Maps variant to IDs in ClinVar, gnomAD, COSMIC, UniProtKB, OMIM, etc.
API: GET https://reg.genome.network/alleles?dbSNP.rs={rs_id}
Headers: Accept: application/json
Note: May return multiple alleles (multi-allelic sites); filter out synonymous (reference) alleles.
Args:
    rs_id (str): dbSNP rsID (e.g. "rs7412")
Return:
    CA ID (canonical allele), and cross-references to ClinVar (alleleId, variationId, RCVs),
    gnomAD, COSMIC, UniProtKB, OMIM and other databases.
Return Fields Explanation:
    - CA ID: canonical allele identifier assigned by ClinGen (e.g. CA127498)
    - communityStandardTitle: standard HGVS name (e.g. NM_000041.2(APOE):c.526C>T (p.Arg176Cys))
    - ClinVarAlleles.alleleId: ClinVar's internal allele number
    - ClinVarAlleles.preferredName: ClinVar's standard HGVS name (transcript:cDNA change + protein change)
    - ClinVarVariations.variationId: ClinVar variation record number (= VCV number, e.g. 17848 corresponds to VCV000017848)
    - ClinVarVariations.RCV: list of RCV accessions; each RCV is a ClinVar reference record that aggregates the submitted interpretations of this variant for one condition
    - COSMIC: ID in the COSMIC database of somatic mutations in cancer
    - gnomAD_2/3/4: ID in each gnomAD release, in chr-pos-ref-alt format
    - ExAC: variant ID in ExAC (the older population-frequency database)
    - MyVariantInfo_hg19/hg38: genomic HGVS string used by the MyVariant.info API
    - dbSNP.rs: the corresponding dbSNP rsID number
```
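Two of the field formats described above lend themselves to small helpers. The sketch below is illustrative only: the function names `vcv_accession` and `parse_gnomad_id` are hypothetical, and the gnomAD string in the usage line is a made-up placeholder that merely follows the chr-pos-ref-alt pattern.

```python
def vcv_accession(variation_id):
    """Format a ClinVar variationId as a VCV accession (VCV + 9 zero-padded digits)."""
    return f"VCV{int(variation_id):09d}"


def parse_gnomad_id(gnomad_id):
    """Split a 'chr-pos-ref-alt' string into its four parts.

    Assumes exactly four hyphen-separated fields, as described in the
    field list above; any other shape raises ValueError.
    """
    chrom, pos, ref, alt = gnomad_id.split("-")
    return {"chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt}


print(vcv_accession(17848))            # VCV000017848
print(parse_gnomad_id("1-12345-C-T"))  # placeholder coordinates, not a real variant
```

Accepting either an integer or a numeric string in `vcv_accession` is a convenience, since JSON payloads may carry the ID in either form.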
### Query Example
```python
import requests

rs_id = "rs7412"
url = f"https://reg.genome.network/alleles?dbSNP.rs={rs_id}"
resp = requests.get(url, headers={"Accept": "application/json"}, timeout=30)
resp.raise_for_status()  # fail early on HTTP errors instead of parsing an error body
alleles = resp.json()
if not isinstance(alleles, list):
    alleles = [alleles]


def field(record, key):
    """Return record[key] for dict records; otherwise return the record itself."""
    return record.get(key, record) if isinstance(record, dict) else record


print(f"[ClinGen] {rs_id} maps to {len(alleles)} allele(s)")
for i, allele in enumerate(alleles):
    ca_id = allele.get("@id", "").split("/")[-1]  # e.g. CA127498
    titles = allele.get("communityStandardTitle", [])
    # Skip the synonymous (reference) allele: "=" in the title means no change
    if any("=" in t for t in titles):
        print(f"\n── [{i}] CA ID: {ca_id} (synonymous/reference allele, skipped)")
        continue
    print(f"\n── [{i}] CA ID: {ca_id} ──")
    if titles:
        print(f"  Standard name (HGVS): {titles}")
    # Cross-references to external databases
    ext = allele.get("externalRecords", {})
    # ClinVar: alleleId = allele number, preferredName = HGVS name
    for cv in ext.get("ClinVarAlleles", []):
        print(f"  ClinVar Allele ID: {cv.get('alleleId')}, name: {cv.get('preferredName')}")
    # ClinVar: variationId = VCV number, RCV = list of RCV accessions
    for cv in ext.get("ClinVarVariations", []):
        print(f"  ClinVar Variation ID: {cv.get('variationId')}, RCVs: {cv.get('RCV', [])}")
    # COSMIC (somatic mutations in cancer)
    for c in ext.get("COSMIC", []):
        print(f"  COSMIC: {field(c, 'id')}")
    # gnomAD (population frequency, chr-pos-ref-alt format)
    for ver in ["gnomAD_2", "gnomAD_3", "gnomAD_4"]:
        for g in ext.get(ver, []):
            print(f"  {ver}: {field(g, 'id')}")
    # dbSNP
    for d in ext.get("dbSNP", []):
        print(f"  dbSNP: rs{field(d, 'rs')}")
    # MyVariantInfo (genomic HGVS format)
    for ver in ["MyVariantInfo_hg19", "MyVariantInfo_hg38"]:
        for m in ext.get(ver, []):
            print(f"  {ver}: {field(m, 'id')}")
    # ExAC (older population-frequency database)
    for e in ext.get("ExAC", []):
        print(f"  ExAC: {field(e, 'id')}")
```
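When the result feeds another tool rather than a terminal, it may be more convenient to return the cross-references as data. The following is a minimal sketch of that idea, not part of the skill itself: `is_reference_allele`, `summarize_allele`, and `summarize_response` are hypothetical names, and the `sample` payload is hand-written to mirror the fields used in the query example. Only `CA127498`, `17848`, and `rs7412` come from this document; every other value, including the `@id` URL shape, is a placeholder assumption.

```python
def is_reference_allele(allele):
    """True when any title contains '=', i.e. the allele matches the reference."""
    return any("=" in t for t in allele.get("communityStandardTitle", []))


def summarize_allele(allele):
    """Collect the main cross-database IDs of one allele into a flat dict."""
    ext = allele.get("externalRecords", {})
    variations = ext.get("ClinVarVariations", [])
    return {
        "ca_id": allele.get("@id", "").split("/")[-1],
        "titles": allele.get("communityStandardTitle", []),
        "clinvar_allele_ids": [r.get("alleleId") for r in ext.get("ClinVarAlleles", [])],
        "clinvar_variation_ids": [r.get("variationId") for r in variations],
        "rcvs": [rcv for r in variations for rcv in r.get("RCV", [])],
        "dbsnp": [f"rs{r.get('rs')}" for r in ext.get("dbSNP", [])],
    }


def summarize_response(payload):
    """Accept a single allele or a list; drop reference alleles; summarize the rest."""
    alleles = payload if isinstance(payload, list) else [payload]
    return [summarize_allele(a) for a in alleles if not is_reference_allele(a)]


# Hand-written sample, NOT a real API response; placeholder values throughout.
sample = [
    {
        "@id": "http://reg.genome.network/allele/CA0000001",  # placeholder
        "communityStandardTitle": ["NM_000041.2(APOE):c.526= (placeholder)"],
    },
    {
        "@id": "http://reg.genome.network/allele/CA127498",
        "communityStandardTitle": ["NM_000041.2(APOE):c.526C>T (p.Arg176Cys)"],
        "externalRecords": {
            "ClinVarAlleles": [{"alleleId": 111, "preferredName": "placeholder"}],
            "ClinVarVariations": [{"variationId": 17848, "RCV": ["RCV000000001"]}],
            "dbSNP": [{"rs": 7412}],
        },
    },
]

for summary in summarize_response(sample):
    print(summary)
```

Keeping the parsing in pure functions means the filtering logic can be checked offline, with the network call layered on top.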