sweep

$npx mdskill add anthropics/healthcare/sweep

Scope already happened. Every scoped document gets full-read; workers never skip, never block, never guess. Cost is not a concern; wall-clock is — fan wide.

SKILL.md

.github/skills/sweepView on GitHub ↗
---
name: sweep
description: Fan out over the scoped document set via the saved sweep workflow. Workers full-read their shard and write findings + citations directly via cli.ts. Recall over precision — never skip a scoped doc.
---

# Sweep

Scope already happened. Every scoped document gets full-read; workers never skip, never block, never guess. Cost is not a concern; wall-clock is — fan wide.

## Shard

```
bun <CLI> sql "SELECT sd.doc_id, v.uri, v.family, length(v.content) AS chars
               FROM scope_documents sd
               JOIN v_corpus_documents v ON v.id=sd.doc_id
                AND v.corpus=(SELECT corpus FROM runs WHERE run_id='<RUN_ID>')
               WHERE sd.scope_id=<s> ORDER BY sd.rank"
```

Target **~16 shards** (the Workflow concurrency cap) or one per document if fewer — coarser leaves slots idle, finer just adds overhead. Keep families together (a worker reading a base contract should also read its amendments). Long documents (>~150k chars): mark one shard `hunter:true` (greps for the brief's terms; doesn't read linearly).

## Launch

Call the saved workflow by path — **do not author a script inline**. Everything run-specific goes in `args` (a JSON object **value**, not a string):

```
Workflow({
  scriptPath: "<ROOT>/workflows/sweep.js",
  args: {
    cli:      "<CLI>",
    run_id:   "<RUN_ID>",
    brief:    <brief_id>,
    round:    <round>,
    scope_id: <scope_id>,
    rubric:   "<the brief's rubric, verbatim>",
    rules:    "<content of <ROOT>/workflows/reader.md — Read it verbatim>",
    shards:   [{label:"s00", doc_ids:[1,2,3]}, {label:"s01", doc_ids:[4,5], hunter:true}, …]
  }
})
```

The script handles the reader prompt, schema, and rescue pass. If you need to deviate (different rules, custom rescue logic), Read `<ROOT>/workflows/sweep.js`, edit a copy, and pass that as `script` instead — but default to the saved one.

## After

Workers wrote directly; nothing to merge. Reconcile coverage:

```
bun <CLI> sql "SELECT * FROM v_coverage_gaps WHERE run_id='<RUN_ID>'"
```

Everything's a gap → the workflow died at launch; re-call it. Partial gaps → some workers crashed; launch a small targeted Workflow for those doc_ids. No gaps → proceed to `queue-triage`.

More from anthropics/healthcare

SkillDescription
citationsHow to mint a citation. Every fact FKs to a citations row; citations verify against documents.content (never disk) at insert time and are immutable after.
contractsAnswer a question across a corpus of contract documents with verified citations. Use when the user asks what a contract says, which contracts have a clause, what changed between amendments, or any question that needs reading and citing across a set of contract files. The corpus must be on the local filesystem (see README).
fhir-developer-skill>
icd10-cm-skillExtract billable ICD-10-CM diagnosis codes from a clinical note the way a professional coder builds the claim. Use when users say "code this encounter", "assign ICD-10 codes", "what diagnosis codes apply", "code this chart", or when turning clinical documentation into claim-ready diagnosis codes.
knowledge-harvestAfter a run completes, propose durable facts learned during the run for the knowledge index. A human ratifies — always. Proposals ride the queue.
prior-auth-review-skillAutomate payer review of prior authorization (PA) requests. This skill should be used when users say "Review this PA request", "Process prior authorization for [procedure]", "Assess medical necessity", "Generate PA decision", or when processing clinical documentation for coverage policy validation and authorization decisions.
queue-triageAfter a sweep round, dedupe and triage worker unknowns into the human queue. Self-resolve the obvious (visibly); blocking items end the round.
reformulateTurn the user's question into a versioned BRIEF (rubric, stated assumptions, done-criteria, scope intent). Use before any sweep. Consults the knowledge index and may pre-scan a few documents to sharpen terms.
synthesizeAfter sweep rounds settle, turn findings into a report with cited claims, run a sampled semantic audit, and surface knowledge proposals. Analysis happens via scripts, never mental math.