sdv
$ npx mdskill add mkurman/zorai/sdv
Generate privacy-preserving synthetic tabular data for testing and analysis.
- Creates realistic datasets that mimic statistical properties without exposing real records.
- Integrates with Python via the sdv package for single, multi, and sequential data.
- Selects models like CTGAN or GaussianCopula based on data structure and complexity.
- Delivers output as synthetic dataframes with privacy evaluation scores.
SKILL.md
.github/skills/sdv
---
name: sdv
description: "Synthetic Data Vault (SDV) — generate synthetic tabular data. Single-table, multi-table, and sequential data synthesis. CTGAN, TVAE, CopulaGAN, GaussianCopula. Privacy metrics and evaluation."
tags: [sdv, synthetic-data, data-generation, privacy, ctgan, tabular-data, zorai]
---
## Overview
The Synthetic Data Vault (SDV) generates synthetic tabular data that preserves the statistical properties of the original data while protecting privacy. It supports single-table, multi-table, and sequential data; its single-table synthesizers include CTGAN, TVAE, CopulaGAN, and GaussianCopula.
## Installation
```bash
uv pip install sdv
```
## Single-Table (CTGAN)
```python
from sdv.single_table import CTGANSynthesizer
from sdv.datasets.demo import download_demo
# Download a demo dataset and its metadata (SDV 1.x API)
data, metadata = download_demo(modality="single_table", dataset_name="census")
synth = CTGANSynthesizer(metadata)
synth.fit(data)
synthetic = synth.sample(num_rows=500)
print(synthetic.head())
print(f"Original shape: {data.shape}, Synthetic shape: {synthetic.shape}")
```
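Synthetic rows that look realistic are not automatically private, so it can be worth checking that the synthesizer has not simply copied real records. The following is a minimal sketch, not an SDV API: `exact_match_rate` is a hypothetical helper, and the toy rows stand in for real and synthetic tables.

```python
def exact_match_rate(real_rows, synthetic_rows):
    """Fraction of synthetic rows that are identical to some real row."""
    # Freeze each row into a hashable form that ignores column order.
    real_set = {frozenset(row.items()) for row in real_rows}
    if not synthetic_rows:
        return 0.0
    copies = sum(frozenset(row.items()) in real_set for row in synthetic_rows)
    return copies / len(synthetic_rows)


# Toy data standing in for real and synthetic tables (hypothetical values).
real = [
    {"age": 39, "job": "clerk"},
    {"age": 50, "job": "manager"},
]
fake = [
    {"age": 41, "job": "clerk"},
    {"age": 50, "job": "manager"},  # identical to a real row
    {"age": 28, "job": "engineer"},
    {"age": 33, "job": "manager"},
]
rate = exact_match_rate(real, fake)
print(f"Exact-match rate: {rate:.2f}")
```

With pandas DataFrames, the rows can be obtained with `data.to_dict("records")` and `synthetic.to_dict("records")`. Rows containing missing values may not compare equal, so treat the result as a rough lower bound on copying rather than a privacy guarantee.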
## Multi-Table
```python
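from sdv.datasets.demo import download_demo
from sdv.multi_table import HMASynthesizer
# Demo multi-table dataset: a dict of DataFrames plus relationship metadata
multi_table_data, multi_table_metadata = download_demo(modality="multi_table", dataset_name="fake_hotels")
synth = HMASynthesizer(multi_table_metadata)
synth.fit(multi_table_data)
# scale=0.5 samples roughly half as many rows as the training tables
synthetic = synth.sample(scale=0.5)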
```
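Multi-table sampling returns one table per name, and the parent-child links between them should still hold in the synthetic output. A minimal sketch of a referential-integrity check follows; `orphaned_keys` is a hypothetical helper, and the table and column names are illustrative assumptions rather than anything defined by SDV.

```python
def orphaned_keys(parent_rows, child_rows, parent_key, foreign_key):
    """Return foreign-key values in child_rows that match no parent row."""
    parent_ids = {row[parent_key] for row in parent_rows}
    child_refs = {row[foreign_key] for row in child_rows}
    return sorted(child_refs - parent_ids)


# Toy parent and child tables (hypothetical values).
hotels = [{"hotel_id": "H1"}, {"hotel_id": "H2"}]
guests = [
    {"guest": "a", "hotel_id": "H1"},
    {"guest": "b", "hotel_id": "H2"},
    {"guest": "c", "hotel_id": "H3"},  # points at a hotel that does not exist
]
orphans = orphaned_keys(hotels, guests, "hotel_id", "hotel_id")
print(orphans)
```

An empty list means every child row references an existing parent. With real output, each table in the sampled dictionary can be converted with `to_dict("records")` before calling the helper.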
## Quality Evaluation
```python
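from sdv.evaluation.single_table import evaluate_quality
# Statistical similarity between real and synthetic data (a quality report, not a privacy guarantee)
report = evaluate_quality(real_data=data, synthetic_data=synthetic, metadata=metadata)
print(f"Overall score: {report.get_score():.3f}")
# get_properties() returns a DataFrame with "Property" and "Score" columns
scores = report.get_properties().set_index("Property")["Score"]
print(f"Column shapes: {scores['Column Shapes']:.3f}")
print(f"Column pairs: {scores['Column Pair Trends']:.3f}")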
```
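To make the `Column Shapes` score less of a black box: for categorical columns, SDMetrics bases it on the complement of the total variation distance between real and synthetic category frequencies (numerical columns use a Kolmogorov-Smirnov based metric instead). The sketch below reimplements that idea by hand for a single column; `tv_complement` is an illustrative helper, not the library's code.

```python
from collections import Counter


def tv_complement(real_values, synthetic_values):
    """1 minus the total variation distance between two category distributions."""
    real_freq = Counter(real_values)
    synth_freq = Counter(synthetic_values)
    n_real, n_synth = len(real_values), len(synthetic_values)
    categories = set(real_freq) | set(synth_freq)
    # Total variation distance: half the sum of absolute frequency differences.
    tvd = 0.5 * sum(
        abs(real_freq[c] / n_real - synth_freq[c] / n_synth) for c in categories
    )
    return 1.0 - tvd


# Toy categorical column (hypothetical values).
real_col = ["a", "a", "b", "b"]
synth_col = ["a", "b", "b", "b"]
score = tv_complement(real_col, synth_col)
print(f"TV complement: {score:.2f}")
```

A score of 1.0 means the category frequencies match exactly, and 0.0 means the two columns share no categories.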
## References
- [SDV docs](https://docs.sdv.dev/sdv/)
- [SDV GitHub](https://github.com/sdv-dev/SDV)