seed-data

Name: seed-data
Author: ClipboardHealth/core-utils
$npx mdskill add ClipboardHealth/core-utils/seed-data
Generate test data for HCPs, facilities, and shifts via GitHub Actions.
Creates realistic worker and workplace records for development and staging.
Executes Bash scripts to trigger the Generate Seed Data workflow.
Selects environment targets based on user requests for seed or setup.
Outputs generated records directly into the specified test environment.
SKILL.md
.github/skills/seed-dataView on GitHub ↗
---
name: seed-data
description: Trigger seed data generation for test environments via GitHub Actions. Use when asked to seed, create test data, or set up HCPs/facilities/shifts.
allowed-tools: Bash
---

# Seed Data Generation

Trigger the `Generate Seed Data` GitHub Actions workflow to create test data in development, staging, or prod-shadow environments.

## Glossary

These terms appear in scenario names and user requests:

| Term       | Meaning                                                                                     | Also known as                |
| ---------- | ------------------------------------------------------------------------------------------- | ---------------------------- |
| **HCP**    | Healthcare Professional — a worker on the supply side of the marketplace (nurse, CNA, etc.) | worker, professional, nurse  |
| **HCF**    | Healthcare Facility — a workplace on the demand side that posts shifts                      | workplace, facility          |
| **LTC**    | Long-Term Care — a type of workplace (nursing homes, skilled nursing facilities)            | nursing home, care home, SNF |
| **CNA**    | Certified Nursing Assistant — a worker qualification                                        | nursing assistant            |
| **RN**     | Registered Nurse — a worker qualification                                                   | nurse                        |
| **LPN**    | Licensed Practical Nurse — a worker qualification                                           | LVN                          |
| **Stripe** | Payment processing account used to pay workers                                              | payment account              |
| **Shift**  | A time slot at a workplace that a worker can book and work                                  |                              |

## Step-by-Step Instructions

1. **Match the user's intent** to a scenario using the lookup table below. If the request is ambiguous, present the top matches and ask the user to pick one.
2. **Determine environment** — default to `development` unless the user specifies otherwise. Valid options: `development`, `staging`, `prod-shadow`.
3. **Determine count** — default to `1` unless the user specifies a different number.
4. **Collect input data** — if the matched scenario requires `input_data` (see Input Data Requirements below), ask the user for the required fields before proceeding. Never fabricate IDs — always ask.
5. **Construct the `gh workflow run` command.** If the scenario requires `input_data` or the user specified a non-default environment/count, show the command and ask for confirmation. Otherwise, execute immediately.
6. **Execute** the command. The `gh workflow run` command prints the run URL directly — capture and display it.
7. **Wait for completion** — poll the run until it finishes, then download and display the results:

   ```bash
   # Get the run ID from the URL (last path segment), then watch it
   gh run watch <RUN_ID> --repo ClipboardHealth/cbh-core --exit-status
   ```

   - If the run **succeeds**, download the logs artifact and display its contents:

     ```bash
     gh run download <RUN_ID> --repo ClipboardHealth/cbh-core --name seed-data-logs --dir /tmp/seed-data-logs
     cat /tmp/seed-data-logs/logs.json
     ```

     Parse the JSON and present the key results to the user (e.g., created entity IDs, names, environment).

   - If the run **fails**, fetch the failed step logs and surface the error:

     ```bash
     gh run view <RUN_ID> --repo ClipboardHealth/cbh-core --log-failed
     ```

## Command Templates

### Without input_data

```bash
gh workflow run "Generate Seed Data" \
  --repo ClipboardHealth/cbh-core \
  -f environment=<ENVIRONMENT> \
  -f numberOfTestData=<COUNT> \
  -f scenario=<SCENARIO_KEY>
```

### With input_data

```bash
gh workflow run "Generate Seed Data" \
  --repo ClipboardHealth/cbh-core \
  -f environment=<ENVIRONMENT> \
  -f numberOfTestData=<COUNT> \
  -f scenario=<SCENARIO_KEY> \
  -f 'input_data=<JSON_STRING>'
```

## Scenario Lookup Table

<!-- cspell:ignore usecase -->

> **Note:** Scenario numbers 13 and 18 do not exist. If a user references them, let them know and present the full list below.

| Key                                                                  | Description                                                                  | Keyword Hints                                                           |
| -------------------------------------------------------------------- | ---------------------------------------------------------------------------- | ----------------------------------------------------------------------- |
| `scenario-1-create-hcp-without-stripe`                               | Create a worker (HCP) without a Stripe payment account                       | hcp, worker, nurse, no stripe, basic hcp, create a nurse                |
| `scenario-2-create-surgery-facility`                                 | Create a surgery center workplace                                            | surgery, surgical, facility, center                                     |
| `scenario-3-create-ltc-facility`                                     | Create a long-term care workplace (nursing home)                             | ltc, long term care, nursing home, care home, facility                  |
| `scenario-4-create-hcp-with-stripe`                                  | Create a worker (HCP) with a Stripe payment account                          | hcp with stripe, worker stripe, paid worker, nurse with payment         |
| `scenario-5-create-hcp-hcf-and-shift`                                | Create a worker + workplace + shift together (full test setup)               | hcp facility shift, full setup, end to end, everything, full test setup |
| `scenario-6-create-hcp-with-multiple-licenses`                       | Create a worker with multiple license types                                  | multiple licenses, multi-license, hcp licenses                          |
| `scenario-7-create-shift-at-clock-in-stage`                          | Create a shift that is ready to clock in                                     | clock in, shift ready, check in, shift about to start                   |
| `scenario-8-create-rate-negotiation-hcf-multiple-hcp-pair`           | Create a rate negotiation scenario with a workplace and multiple workers     | rate negotiation, pricing, negotiation                                  |
| `scenario-9-create-shift-and-perform-until-clock-out`                | Create a shift and run through the full lifecycle to clock out               | clock out, full shift, complete shift                                   |
| `scenario-10-create-shift-and-perform-until-clock-out-for-given-hcf` | Same as scenario 9 but at a specific workplace (requires facility ID)        | clock out given facility, specific facility shift                       |
| `scenario-11-create-hcp-with-stripe-near-given-facility`             | Create a worker with Stripe near a specific workplace (requires facility ID) | hcp near facility, nearby hcp, stripe near, worker near                 |
| `scenario-12-create-shift-at-clock-in-stage-for-given-hcf`           | Create a clock-in-ready shift at a specific workplace (requires facility ID) | clock in specific facility, shift at facility                           |
| `scenario-14-move-shifts-back-or-forth`                              | Move existing shifts forward or backward in time                             | move shifts, reschedule, shift time, back forth                         |
| `scenario-15-clean-old-test-data`                                    | **DESTRUCTIVE** — delete old test data                                       | clean, cleanup, delete test data, purge                                 |
| `scenario-16-create-hcf-hcp-and-shifts-in-past-week`                 | Create workplace + worker + shifts dated in the past week                    | past week, historical, backfill, past shifts                            |
| `scenario-17-recreate-hcf-hcp-and-shifts-sales-demo-usecase`         | Set up a sales demo with workplace, worker, and shifts                       | sales demo, demo, showcase, demo data, sales presentation               |
| `scenario-19-create-shift-ready-to-clock-out`                        | Create a shift that is ready to clock out                                    | ready clock out, pending clock out                                      |
| `scenario-20-seed-workplace-review-data`                             | Seed workplace review / rating data                                          | workplace review, ratings, reviews                                      |
| `scenario-21-seed-attrition-data`                                    | Seed attrition analytics data                                                | attrition, churn, retention                                             |

## Input Data Requirements

Some scenarios accept or require a JSON `input_data` field. Ask the user for these values — **never fabricate IDs**.

### Scenario 10 — Clock out shift at a specific workplace

Required fields:

- `facilityId` (string) — the target workplace's ID
- `hcpWorkerType` (string) — one of `CNA`, `RN`, `LPN`

Example:

```json
{ "facilityId": "<FACILITY_ID>", "hcpWorkerType": "CNA" }
```

### Scenario 11 — Create worker with Stripe near a specific workplace

Required fields:

- `facilityId` (string) — the workplace to create the worker near
- `hcpWorkerType` (string) — one of `CNA`, `RN`, `LPN`

Example:

```json
{ "facilityId": "<FACILITY_ID>", "hcpWorkerType": "RN" }
```

### Scenario 12 — Clock-in-ready shift at a specific workplace

Required fields:

- `facilityId` (string) — the workplace to create the shift at
- `hcpWorkerType` (string) — one of `CNA`, `RN`, `LPN`

Example:

```json
{ "facilityId": "<FACILITY_ID>", "hcpWorkerType": "LPN" }
```

### Scenario 14 — Move shifts

Required fields:

- `shiftIds` (string[]) — array of shift IDs to move

Optional fields:

- `duration` (number) — hours to move shifts by. **Positive = forward in time, negative = backward in time.**

Example:

```json
{ "shiftIds": ["<SHIFT_ID_1>", "<SHIFT_ID_2>"], "duration": 24 }
```

### Scenario 15 — Clean old test data

All fields optional:

- `emailPattern` (string) — pattern to match test emails
- `ageThreshold` (number) — age in days beyond which to clean

Example:

```json
{ "emailPattern": "+test", "ageThreshold": 30 }
```

### Scenario 21 — Seed attrition data

Optional fields:

- `facilityId` (string) — scope to a specific workplace

Example:

```json
{ "facilityId": "<FACILITY_ID>" }
```

## Behavioral Rules

1. **Default environment is `development`** — 87% of actual usage targets development. Only use other environments when explicitly requested.
2. **Default count is `1`** — override only when the user specifies a number.
3. **Never fabricate IDs** — if a scenario requires a `facilityId`, `shiftId`, or other identifier, ask the user to provide it.
4. **Confirm before executing** — show the full `gh workflow run` command and get user approval **only** when the scenario requires `input_data` or the user specified a non-default environment or count. If the scenario needs no `input_data` and defaults are used, execute immediately without confirmation.
5. **Handle ambiguity** — if the user's request matches multiple scenarios, present the top candidates with descriptions and ask them to pick.
6. **Auth errors** — if `gh` returns an authentication error, instruct the user to run `gh auth login` and retry.
7. **No `input_data` when not needed** — omit the `-f input_data=...` flag entirely for scenarios that don't require it.
8. **"Run all scenarios" is not supported via this skill** — if the user asks to run all scenarios at once, explain that this is only available through the [GitHub Actions UI](https://github.com/ClipboardHealth/cbh-core/actions/workflows/generate-seed-data.yml) by leaving the scenario dropdown blank. Do not omit the `-f scenario=...` flag to trigger all scenarios.
9. **Destructive scenario warning** — for scenario 15 (clean old test data), always display an explicit warning that this **deletes data**, confirm the target environment twice, and never run it with all optional fields left empty (require the user to specify at least `emailPattern` or `ageThreshold`).
10. **Failure handling** — always wait for the run to complete (step 7). Never just print the URL and stop.