medcat

$ npx mdskill add mkurman/zorai/medcat

Extract clinical concepts from unstructured text with customizable ontologies.

  • Identifies medical entities like diagnoses and medications in patient notes.
  • Integrates with ICD-10, SNOMED CT, RxNorm, and UMLS standards.
  • Uses active learning to refine accuracy on custom medical datasets.
  • Outputs structured entity lists with confidence scores and CUIs.

SKILL.md

.github/skills/medcat
---
name: medcat
description: "Medical Concept Annotation Toolkit. Trainable NLP for extracting clinical concepts from unstructured text. Supports ICD-10, SNOMED CT, RxNorm, UMLS. Active learning for custom medical ontologies."
tags: [clinical-nlp, medical-entity-extraction, icd10, snomed, umls, healthcare, zorai]
---
## Overview

MedCAT (Medical Concept Annotation Toolkit) trains NLP models that extract clinical concepts from unstructured text. It supports ICD-10, SNOMED CT, RxNorm, UMLS, and custom ontologies, and uses active learning to refine a model on your own data.

## Installation

```bash
uv pip install medcat
```

## Pre-trained Model

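```python
from medcat.cat import CAT

# Load a pre-trained model pack (placeholder path; use the path to your own pack)
cat = CAT.load_model_pack("medcat_model_pack.dat")
text = "Patient with type 2 diabetes and hypertension, prescribed metformin 500mg BID."

# get_entities returns a dict; annotations sit under "entities", keyed by entity id
result = cat.get_entities(text)

for ent in result["entities"].values():
    # "source_value" is the matched text span, "acc" the linking confidence
    print(f"{ent['source_value']:<25} {ent['cui']:<10} confidence={ent['acc']:.2f}")
# Illustrative output; identifiers and scores depend on the loaded model pack:
# type 2 diabetes           D003920    confidence=0.97
# hypertension              D006973    confidence=0.99
```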
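
The extracted entities can be post-processed into a plain structured list. The following is a minimal sketch, assuming the dictionary shape that MedCAT's `cat.get_entities(text)` returns (an `entities` mapping whose values carry `source_value`, `cui`, `acc`, `start`, and `end`). The sample data, the `X000001` identifier, the `filter_entities` helper, and the `min_confidence` threshold are all hypothetical and only there for illustration; the snippet does not call MedCAT itself.

```python
from typing import Any

# Hypothetical sample shaped like the result of cat.get_entities(text);
# identifiers and scores are illustrative, not real model output.
sample_result: dict[str, Any] = {
    "entities": {
        0: {"source_value": "hypertension", "cui": "D006973", "acc": 0.99, "start": 33, "end": 45},
        1: {"source_value": "type 2 diabetes", "cui": "D003920", "acc": 0.97, "start": 13, "end": 28},
        2: {"source_value": "BID", "cui": "X000001", "acc": 0.31, "start": 74, "end": 77},
    }
}


def filter_entities(result, min_confidence=0.5):
    """Return entities at or above min_confidence, sorted by position in the text."""
    kept = [e for e in result["entities"].values() if e["acc"] >= min_confidence]
    return sorted(kept, key=lambda e: e["start"])


confident = filter_entities(sample_result, min_confidence=0.5)
for ent in confident:
    print(f"{ent['source_value']:<25} {ent['cui']:<10} confidence={ent['acc']:.2f}")
```

A fixed threshold is the simplest possible choice; a suitable value depends on the model pack and on how costly false positives are for the task.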

## Active Learning

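```python
# Continues with the `cat` object loaded above. Method names may differ between
# MedCAT releases; verify them against the installed version.
cat.add_cui_to_category("D003920", "Diabetes Mellitus")  # register the concept under a category
cat.train(text="Patient has diabetes", cui="D003920", value="Diabetes Mellitus")  # learn from a corrected example
unmatched = cat.get_unmatched_concepts()  # concepts needing review
```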
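
One way to picture the review step of an active-learning loop is uncertainty sampling: annotations whose confidence falls in an ambiguous band are queued for a human to confirm or correct, most uncertain first. The sketch below is plain Python and does not call MedCAT; the `review_queue` helper, the `low`/`high` band, and the sample entities (including the `X…` identifiers) are hypothetical.

```python
def review_queue(entities, low=0.3, high=0.8):
    """Select entities with low <= acc < high, ordered by closeness to 0.5."""
    ambiguous = [e for e in entities if low <= e["acc"] < high]
    # A score near 0.5 is treated as the least certain, so it is reviewed first.
    return sorted(ambiguous, key=lambda e: abs(e["acc"] - 0.5))


# Hypothetical annotations; scores are illustrative only.
entities = [
    {"source_value": "diabetes", "cui": "D003920", "acc": 0.95},
    {"source_value": "MI", "cui": "X000002", "acc": 0.52},
    {"source_value": "cold", "cui": "X000003", "acc": 0.35},
    {"source_value": "BID", "cui": "X000001", "acc": 0.10},
]

queue = review_queue(entities)
for ent in queue:
    print(f"review: {ent['source_value']} ({ent['cui']}), confidence={ent['acc']:.2f}")
```

High-confidence annotations are accepted as they are and very low ones are dropped, so reviewer time goes to the cases where a correction is most likely to change the model.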

## Workflow

1. Load a pre-trained model pack
2. Annotate clinical text -> extract UMLS CUIs
3. Map concepts to ICD-10/SNOMED/RxNorm
4. Train with active learning: correct errors, add concepts
5. Export and deploy trained model
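
Step 3 of the workflow above can be sketched as a table lookup. This is a minimal illustration, assuming a mapping from concept identifiers to ICD-10 codes is already available; the `CUI_TO_ICD10` table, the `map_to_icd10` helper, and the `X000001` identifier are hypothetical, and the mappings shown are illustrative rather than clinically validated.

```python
# Hypothetical lookup table; a real deployment would take its mappings
# from the ontology release or the model's concept database.
CUI_TO_ICD10 = {
    "D003920": ["E11"],  # diabetes concept from the example -> type 2 diabetes mellitus
    "D006973": ["I10"],  # hypertension -> essential (primary) hypertension
}


def map_to_icd10(cuis, table=CUI_TO_ICD10):
    """Split concept ids into those with ICD-10 codes and those without."""
    mapped, unmapped = {}, []
    for cui in cuis:
        codes = table.get(cui)
        if codes:
            mapped[cui] = codes
        else:
            unmapped.append(cui)
    return mapped, unmapped


mapped, unmapped = map_to_icd10(["D003920", "D006973", "X000001"])
print("mapped:", mapped)
print("needs manual mapping:", unmapped)
```

Returning the unmapped identifiers separately keeps gaps in the mapping visible, which feeds naturally into step 4, where missing concepts are added or corrected.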

More from mkurman/zorai

| Skill | Description |
| --- | --- |
| account-management | |
| agile-scrum | |
| albumentations | Fast image augmentation library (Albumentations). 70+ transforms for classification, segmentation, object detection, keypoints, and pose estimation. Optimized OpenCV-based pipeline with unified API across all CV tasks. Supports images, masks, bounding boxes, and keypoints simultaneously. Note: classic Albumentations (MIT) is no longer maintained; successor AlbumentationsX uses AGPL-3.0. For torchvision-native augmentations, use torchvision.transforms.v2. |
| aml-compliance | Anti-Money Laundering (AML) and Know Your Customer (KYC) compliance workflow. Sanctions screening, PEP detection, transaction monitoring, suspicious activity reporting (SAR), and OFAC compliance. |
| anki-connect | This skill is for interacting with Anki through AnkiConnect, and should be used whenever a user asks to interact with Anki, including to read or modify decks, notes, cards, models, media, or sync operations. |
| approval-checkpoint-long-task | Canonical long-task pack for daemon-managed work with deliberate approval checkpoints, status summaries, rollback notes, and mobile-safe governance-aware updates. |
| auditing-goal-artifacts | Use when reviewing recent zorai goal run outputs, closure markers, ledgers, or evidence bundles to judge whether completion is credible or to identify remaining uncertainty. |
| autogen | AutoGen (Microsoft) — multi-agent conversation framework. Agent-to-agent chat, code generation & execution, tool use, group chat, and human-in-the-loop. Build collaborative AI systems with specialized agents. |
| backtrader | Python backtesting framework for trading strategies. Data feeds, brokers, analyzers, and live trading support. Strategy development with commission models, slippage, and signal-based execution. |
| beautiful-mermaid | Render Mermaid diagrams as SVG and PNG using the Beautiful Mermaid library. Use when the user asks to render a Mermaid diagram. |