ngs-dna-germline-variants

Name: ngs-dna-germline-variants
Author: openai/plugins

$ npx mdskill add openai/plugins/ngs-dna-germline-variants

SKILL.md

.github/skills/ngs-dna-germline-variants

---
name: ngs-dna-germline-variants
description: Run or plan deep germline WGS, WES, targeted-panel, cohort, or trio variant-calling workflows with reference-build, known-sites, QC, joint-calling, and annotation checks.
---

# Germline DNA Variants

Use this skill for germline WGS, WES, or inherited-disease panel analysis from FASTQ, BAM, or CRAM. If the request is tumor-only, tumor-normal, or low-frequency molecular-barcode panel calling, use a somatic or UMI-panel skill instead.

## Essential Inputs

Confirm:

- data type: WGS, WES, or targeted panel
- sample model: singleton, cohort, duo, trio, family, or case/control
- input type: FASTQ, BAM, or CRAM
- organism, reference build, FASTA, indexes, and contig naming
- known-sites resources for BQSR, contamination, and annotation
- target BED and bait BED for WES/panel data
- sex/ploidy assumptions and mitochondrial/sex-chromosome requirements
- desired callers, annotation outputs, and final VCF/gVCF expectations
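
Contig naming is one of the inputs above that is easy to verify before any compute is spent. The following is a minimal sketch, assuming the reference has a samtools-style `.fai` index and the targets are a plain BED file; the function names are illustrative and are not part of the plugin's scripts. The `chr`-prefix classification is a coarse heuristic and will report `mixed` for references that include non-prefixed alt or decoy contigs.

```python
from typing import Iterable, List, Set


def read_fai_contigs(fai_text: str) -> Set[str]:
    """Return contig names (column 1) from the text of a .fai index."""
    return {line.split("\t")[0] for line in fai_text.splitlines() if line.strip()}


def read_bed_contigs(bed_text: str) -> Set[str]:
    """Return contig names from BED text, skipping comment and header lines."""
    contigs = set()
    for line in bed_text.splitlines():
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        contigs.add(line.split()[0])
    return contigs


def naming_style(contigs: Iterable[str]) -> str:
    """Classify names as 'chr' (chr1), 'plain' (1), 'mixed', or 'empty'."""
    styles = {"chr" if name.startswith("chr") else "plain" for name in contigs}
    if not styles:
        return "empty"
    return styles.pop() if len(styles) == 1 else "mixed"


def contigs_missing_from_reference(reference: Set[str], other: Set[str]) -> List[str]:
    """List contigs used by a BED or VCF resource that the reference lacks."""
    return sorted(other - reference)


# Hypothetical inputs: a chr-prefixed reference and a BED with one plain name.
example_fai = "chr1\t248956422\t6\t60\t61\nchr2\t242193529\t252513174\t60\t61\n"
example_bed = "chr1\t100\t200\n1\t5\t10\n"
reference_contigs = read_fai_contigs(example_fai)
bed_contigs = read_bed_contigs(example_bed)
print("reference style:", naming_style(reference_contigs))
print("BED style:", naming_style(bed_contigs))
print("missing from reference:", contigs_missing_from_reference(reference_contigs, bed_contigs))
```

Any contig reported as missing usually means the BED or known-sites file was built against a different reference or naming convention and should be resolved before alignment or calling.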

## Route

Prefer `nf-core/sarek` for full FASTQ/BAM-to-VCF workflows. Use direct GATK4, DeepVariant, samtools, or bcftools only for focused tasks or a custom workflow.

Preflight command:

```bash
python plugins/ngs-analysis/scripts/ngs_preflight.py --pipeline dna_germline_variants --emit-install-plan
```

For compact local checks from prepared BAM/CRAM files, use the shared DNA execution package:

```bash
python plugins/ngs-analysis/scripts/run_dna_variant_calling.py \
  --sample-sheet dna_samples.tsv \
  --reference-fasta reference.fa \
  --execute
```

Treat this as a focused samtools/bcftools run envelope, not as a substitute for full cohort, trio, gVCF, BQSR, or annotation workflows.

For a higher-fidelity local germline run that handles BQSR, per-sample gVCFs, and joint genotyping, use the germline-specific runner:

```bash
python plugins/ngs-analysis/scripts/run_dna_germline_variants.py \
  --sample-sheet dna_samples.tsv \
  --reference-fasta reference.fa \
  --known-sites dbsnp.vcf.gz \
  --known-sites mills.vcf.gz \
  --emit-gvcf \
  --joint-call \
  --execute
```

This runner still expects reference-matched resources and an available GATK toolchain. It packages the validation state and generated artifacts even when execution is blocked by missing tools or resources.
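
A toolchain check of this kind can be approximated with the standard library. The sketch below is not the runner's actual logic; it only assumes that the relevant executables are named `gatk`, `samtools`, and `bcftools` and are expected on `PATH`.

```python
import shutil
from typing import Dict, Iterable, List, Optional


def find_tools(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each executable name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the sorted names of executables that could not be found."""
    return sorted(name for name, path in find_tools(names).items() if path is None)


# Assumed tool names; adjust to the environment being checked.
required = ["gatk", "samtools", "bcftools"]
absent = missing_tools(required)
if absent:
    print("execution would be blocked; missing tools:", ", ".join(absent))
else:
    print("all required tools found on PATH")
```

Recording the missing tools instead of raising an error matches the idea of reporting validation state even when the run itself cannot proceed.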

It also writes advisory `resources/resource_plan.json`, `resource_manifest.tsv`, `resource_env.sh`, and `resource_readiness.md` artifacts by default. Add `--genome-build`, `--bundle-root <bundle>=<path>`, and `--require-resource-plan` when complete registered reference and known-sites bundles should be mandatory for readiness.

## Decision Points

- For cohorts or families, decide whether the endpoint is per-sample VCFs, gVCFs for joint genotyping, or a jointly called cohort VCF.
- For WES/panels, carry the target BED through alignment metrics, calling, and coverage reports; do not call off-target regions by accident.
- Use BQSR only when reference-matched known-sites resources exist. Do not mix GRCh37, hg19, GRCh38, or T2T resources.
- Check sample identity, sex concordance, contamination, coverage, duplication, insert size, and the transition/transversion (Ts/Tv) ratio where feasible.
- For trios, preserve pedigree metadata and report Mendelian/QC checks separately from variant interpretation.
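
Two of the QC checks above reduce to small calculations. The sketch below illustrates a Ts/Tv ratio over SNVs and a per-site Mendelian consistency test for a trio. It assumes diploid autosomal genotypes written as VCF `GT` strings; sex chromosomes, mitochondrial sites, and genuine de novo variants need separate handling. The function names are illustrative, and dedicated tools are the better choice for production QC.

```python
import re
from typing import Iterable, Optional, Tuple

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ts_tv_ratio(snvs: Iterable[Tuple[str, str]]) -> Optional[float]:
    """Compute Ts/Tv from (ref, alt) pairs, ignoring anything that is not an SNV."""
    ts = tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if len(ref) != 1 or len(alt) != 1 or ref == alt:
            continue  # skip indels, MNVs, and non-variant records
        if ref not in "ACGT" or alt not in "ACGT":
            continue  # skip ambiguous or symbolic alleles
        if (ref, alt) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    return ts / tv if tv else None


def parse_gt(gt: str) -> Optional[Tuple[int, int]]:
    """Parse a diploid GT such as '0/1' or '0|1'; return None if missing or not diploid."""
    alleles = re.split(r"[/|]", gt)
    if len(alleles) != 2 or "." in alleles:
        return None
    return int(alleles[0]), int(alleles[1])


def mendelian_consistent(child: str, father: str, mother: str) -> Optional[bool]:
    """True if the child can inherit one allele from each parent; None if not evaluable."""
    c, f, m = parse_gt(child), parse_gt(father), parse_gt(mother)
    if c is None or f is None or m is None:
        return None
    a, b = c
    return (a in f and b in m) or (b in f and a in m)


# Hypothetical records for illustration only.
print(ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "C"), ("AT", "A")]))
print(mendelian_consistent("0/1", "0/0", "1/1"))
print(mendelian_consistent("1/1", "0/0", "0/1"))
```

Returning `None` for sites with missing genotypes keeps unevaluable sites out of the violation rate, so the QC figure stays separate from any interpretation of individual variants.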

## Outputs

Produce:

- command or workflow profile and sample sheet
- reference/resource manifest with versions and checksums when available
- QC summary: coverage, duplication, insert size, contamination, sex/relatedness checks when run
- VCF/gVCF path, index path, and annotation path
- limitations: low coverage, missing known-sites, target design gaps, or build mismatches
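
A reference/resource manifest with checksums can be assembled with the standard library alone. The following is a minimal sketch; the three-column layout (`path`, `size_bytes`, `sha256`) and the function names are assumptions for illustration and do not describe the format of the runner's own `resource_manifest.tsv`.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List, Tuple


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large FASTA or VCF files are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_rows(paths: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Build (path, size_bytes, sha256) rows, marking files that are absent."""
    rows = []
    for path in map(Path, paths):
        if path.is_file():
            rows.append((str(path), str(path.stat().st_size), sha256_of(path)))
        else:
            rows.append((str(path), "NA", "MISSING"))
    return rows


def write_manifest(paths: Iterable[str], out_path: str) -> None:
    """Write the rows as a tab-separated file with a header line."""
    lines = ["path\tsize_bytes\tsha256"]
    lines += ["\t".join(row) for row in manifest_rows(paths)]
    Path(out_path).write_text("\n".join(lines) + "\n")
```

A call such as `write_manifest(["reference.fa", "dbsnp.vcf.gz", "mills.vcf.gz"], "manifest.tsv")` would record the files named in the examples above. Missing files appear as `MISSING` rows, which feeds directly into the limitations entry.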

Clinical interpretation, pathogenicity classification, and report signing are out of scope unless the user provides a validated clinical workflow.
