ngs-dna-somatic-variants

Name: ngs-dna-somatic-variants
Author: openai/plugins

$ npx mdskill add openai/plugins/ngs-dna-somatic-variants

---
name: ngs-dna-somatic-variants
description: Run or plan tumor-normal, tumor-only, WGS, WES, or cancer-panel somatic variant workflows with pairing, contamination, panel-of-normals, purity, QC, and annotation checks.
---

# Somatic DNA Variants

Use this skill for tumor-normal or tumor-only somatic SNV/indel calling from FASTQ, BAM, or CRAM. If the request is inherited germline calling or family analysis, use `ngs-dna-germline-variants`.

## Essential Inputs

Confirm:

- tumor-normal, tumor-only, relapse-baseline, or multi-tumor design
- WGS, WES, or panel assay and target BED when applicable
- input type and whether reads are already aligned
- tumor/normal pairing table and sample identifiers
- reference build, known-sites, germline resource, and annotation cache
- panel-of-normals availability and matched-normal availability
- tumor purity, contamination expectations, and minimum allele fraction goals
- desired outputs: raw calls, filtered calls, VEP/SnpEff annotation, MAF, CNV/SV handoff
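
The pairing-table item in the checklist above can be checked mechanically before any caller runs. The following is a minimal sketch, not the plugin's own validator: the column names `pair_id`, `tumor_id`, and `normal_id` are hypothetical, since the sample-sheet schema is not specified here, and an empty `normal_id` is assumed to mean a tumor-only design.

```python
import csv
import io

# Hypothetical column names; the real sample-sheet schema may differ.
REQUIRED_COLUMNS = ("pair_id", "tumor_id", "normal_id")


def validate_pairs(tsv_text):
    """Return (rows, problems) for a tab-separated tumor/normal pairing table."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [], ["missing columns: " + ", ".join(missing)]

    rows, problems, seen_tumors = [], [], set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        tumor = (row["tumor_id"] or "").strip()
        normal = (row["normal_id"] or "").strip()
        if not tumor:
            problems.append(f"line {line_no}: empty tumor_id")
            continue
        if tumor in seen_tumors:
            problems.append(f"line {line_no}: duplicate tumor_id {tumor}")
        seen_tumors.add(tumor)
        if normal == tumor:
            problems.append(f"line {line_no}: tumor and normal are the same sample")
        # Assumption: an empty normal_id means tumor-only, which is not an error.
        mode = "tumor-normal" if normal else "tumor-only"
        rows.append({"pair_id": row["pair_id"], "tumor_id": tumor,
                     "normal_id": normal or None, "mode": mode})
    return rows, problems
```

A shared normal across several tumors is allowed on purpose, so multi-tumor and relapse-baseline designs pass; only repeated tumor identifiers and self-paired samples are reported.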

## Route

Prefer `nf-core/sarek` for an end-to-end public workflow when its supported callers fit the request. Use direct GATK Mutect2 or bcftools/samtools utilities for focused validation or prepared BAMs.

Preflight command:

```bash
python plugins/ngs-analysis/scripts/ngs_preflight.py --pipeline dna_somatic_variants --emit-install-plan
```

For compact local checks from prepared tumor/normal BAM/CRAM files, use the dedicated Mutect2 runner:

```bash
python plugins/ngs-analysis/scripts/run_dna_somatic_variants.py \
  --sample-sheet somatic_pairs.tsv \
  --reference-fasta reference.fa \
  --germline-resource af-only-gnomad.vcf.gz \
  --panel-of-normals pon.vcf.gz \
  --execute
```

This produces a tumor-normal/tumor-only pairing table, Mutect2 command plan, contamination/filtering artifacts, somatic QC summary, `qc/somatic_pair_review.{tsv,json}`, visualization index, and filtered VCF outputs when the local GATK resources are available. For nf-core execution, use `plugins/ngs-analysis/scripts/run_nfcore_pipeline.py --pipeline sarek`.

The direct runner also emits `resources/resource_plan.json`, `resource_manifest.tsv`, `resource_env.sh`, and `resource_readiness.md`. The resource check is advisory by default so custom or reduced references can still be planned; add `--genome-build`, `--bundle-root <bundle>=<path>`, and `--require-resource-plan` when missing registered reference bundles should block readiness.
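
As a rough illustration of the blocking mode described above, the sketch below assembles the runner command with the three resource flags. The flag names come from this document; everything else is an assumption: that `--genome-build` takes the build name as its value, plus the build `GRCh38`, the bundle name `example_bundle`, and the path `/refs/example_bundle`, which are placeholders rather than registered values.

```python
import shlex


def strict_resource_command(sample_sheet, reference_fasta, genome_build, bundle_roots):
    """Assemble the runner command with the resource plan made blocking.

    bundle_roots maps a bundle name to a local path; both are placeholders here.
    """
    argv = [
        "python", "plugins/ngs-analysis/scripts/run_dna_somatic_variants.py",
        "--sample-sheet", sample_sheet,
        "--reference-fasta", reference_fasta,
        # Assumption: --genome-build takes the build name as its value.
        "--genome-build", genome_build,
    ]
    # One --bundle-root <bundle>=<path> argument per bundle, in a stable order.
    for bundle, path in sorted(bundle_roots.items()):
        argv += ["--bundle-root", f"{bundle}={path}"]
    argv.append("--require-resource-plan")
    return argv


# Print a shell-quoted command line for review instead of executing it.
print(shlex.join(strict_resource_command(
    "somatic_pairs.tsv", "reference.fa", "GRCh38",
    {"example_bundle": "/refs/example_bundle"},
)))
```

The sketch leaves out `--execute` and the germline-resource and panel-of-normals arguments for brevity.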

## Decision Points

- Verify tumor-normal pair metadata before execution. A swapped or missing normal changes the biological meaning of the calls.
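- For tumor-only analysis, explicitly state the elevated false-positive risk (without a matched normal, rare germline variants and recurrent artifacts cannot be subtracted directly) and require a germline resource plus careful filtering.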
- Use panel-of-normals when available and reference-matched; do not reuse a PON across incompatible capture kits or genome builds.
- Track contamination, orientation bias, strand artifacts, mapping quality, coverage, tumor purity, and allele-fraction filters.
- Keep germline filtering separate from somatic interpretation; avoid presenting tumor-only calls as confirmed somatic without supporting evidence.
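
To make the link between tumor purity, depth, and allele-fraction goals concrete, here is a back-of-the-envelope sketch under simplifying assumptions: a clonal variant, known copy numbers, no sequencing error, and binomial read sampling. The function names are illustrative and not part of the plugin.

```python
from math import comb


def expected_vaf(purity, tumor_copies=2, mutated_copies=1, normal_copies=2):
    """Expected allele fraction of a clonal variant in a tumor/normal cell mixture."""
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    # Mutant alleles come only from tumor cells; all cells contribute total copies.
    mutant = purity * mutated_copies
    total = purity * tumor_copies + (1.0 - purity) * normal_copies
    return mutant / total


def prob_at_least(alt_reads, depth, vaf):
    """P(X >= alt_reads) for X ~ Binomial(depth, vaf)."""
    below = sum(comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
                for k in range(alt_reads))
    return 1.0 - below
```

For a heterozygous variant at a copy-neutral diploid locus, the expected allele fraction reduces to purity / 2, so a low-purity tumor can push true variants below a fixed allele-fraction cutoff. Calling `prob_at_least(k, depth, expected_vaf(purity))` gives a rough sense of whether the planned depth can support the chosen minimum number of supporting reads.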

## Outputs

Produce:

- validated pairing/sample sheet
- caller/filter settings and reference/resource manifest
- QC summary: tumor/normal depth, contamination, duplication, insert size, on-target rate for panels/WES
- per-pair review table covering matched-normal state, PON/germline-resource availability, contamination-table status, filtered VCF status, and parsed variant counts
- VCF/MAF/annotation paths and a filtered-vs-raw call count summary
- caveats for tumor-only calls, low-purity tumors, low-depth regions, or missing matched normals
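
The filtered-vs-raw count summary in the list above can be sketched with plain text parsing. This is a minimal example, assuming an uncompressed VCF in which retained calls carry `PASS` in the FILTER column (the seventh tab-separated field, as the VCF specification defines); it is not the plugin's own summary code, and the function name is illustrative.

```python
def count_vcf_records(vcf_text):
    """Count data records and PASS records in uncompressed VCF text."""
    total = passed = 0
    for line in vcf_text.splitlines():
        # Skip blank lines, meta-information lines (##) and the header line (#CHROM).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 8:
            continue  # not a well-formed VCF data line
        total += 1
        if fields[6] == "PASS":  # FILTER is the 7th column
            passed += 1
    return {"total": total, "pass": passed, "filtered_out": total - passed}
```

Running it on both the raw and the filtered VCF gives the two rows of the summary. For bgzip-compressed files, the text would first need to be read through `gzip.open(path, "rt")`.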

Clinical actionability and treatment recommendations are out of scope unless the user supplies a validated clinical interpretation workflow.
