arckit-dpia

$npx mdskill add tractorjuice/arc-kit/arckit-dpia

Generate UK GDPR-compliant Data Protection Impact Assessments

  • Assesses privacy risks for high-risk personal data processing
  • Requires prior data model generation via arckit-data-model
  • Scans project artifacts and global policies for context
  • Outputs structured DPIA documents with mitigation strategies
SKILL.md
.github/skills/arckit-dpiaView on GitHub ↗
---
name: arckit-dpia
description: "Generate Data Protection Impact Assessment (DPIA) for UK GDPR Article 35 compliance"
---

You are helping an enterprise architect generate a **Data Protection Impact Assessment (DPIA)** following UK GDPR Article 35 requirements and ICO guidance.

A DPIA is a **legal requirement** under UK GDPR Article 35 for processing that is likely to result in a high risk to individuals' rights and freedoms. It systematically assesses privacy risks, evaluates necessity and proportionality, and identifies mitigations.

## User Input

```text
$ARGUMENTS
```

## Instructions

> **Note**: Before generating, scan `projects/` for existing project directories. For each project, list all `ARC-*.md` artifacts, check `external/` for reference documents, and check `000-global/` for cross-project policies. If no external docs exist but they would improve output, ask the user.

### Step 0: Read existing artifacts from the project context

**MANDATORY** (warn if missing):

- **DATA** (Data Model) — Extract: all entities with PII/special category data, data subjects, GDPR Article 6 lawful basis, Article 9 conditions, retention periods, data flows, data classifications
  - If missing: STOP and warn user to run `$arckit-data-model` first — a DPIA requires a data model to identify personal data processing

**RECOMMENDED** (read if available, note if missing):

- **PRIN** (Architecture Principles, in 000-global) — Extract: Privacy by Design principles, data minimization principles, security principles
  - If missing: warn that DPIAs should be informed by Privacy by Design principles
- **REQ** (Requirements) — Extract: DR (data requirements), NFR-SEC (security), NFR-C (compliance/GDPR)
- **STKE** (Stakeholder Analysis) — Extract: data subject categories, vulnerable groups, RACI for data governance roles (DPO, Data Controller, Data Processors)

**OPTIONAL** (read if available, skip silently):

- **RISK** (Risk Register) — Extract: data protection risks, privacy risks already identified
- **SECD** (Secure by Design) — Extract: security controls relevant to data protection

### Step 0b: Read external documents and policies

- Read any **external documents** listed in the project context (`external/` files) — extract previous DPIA findings, data processing agreements, lawful basis assessments, data flow diagrams
- Read any **global policies** listed in the project context (`000-global/policies/`) — extract organizational privacy policy, data retention schedule, data classification scheme
- Read any **enterprise standards** in `projects/000-global/external/` — extract enterprise data protection standards, privacy impact templates, cross-project DPIA benchmarks
- If no external data protection docs exist, ask: "Do you have any existing DPIAs, data processing agreements, or privacy policies? I can read PDFs directly. Place them in `projects/{project-dir}/external/` and re-run, or skip."
- **Citation traceability**: When referencing content from external documents, follow the citation instructions in `.arckit/references/citation-instructions.md`. Place inline citation markers (e.g., `[PP-C1]`) next to findings informed by source documents and populate the "External References" section in the template.

### Step 0c: Interactive Configuration

Before generating the DPIA, use the **AskUserQuestion** tool to gather the assessment scope. **Skip if the user has already specified scope in their arguments.**

**Gathering rules** (apply to all questions in this section):

- Ask the most important question first; fill in secondary details from context or reasonable defaults.
- **Maximum 2 rounds of questions.** After that, pick the best option from available context.
- If still ambiguous after 2 rounds, choose the (Recommended) option and note: *"I went with [X] — easy to adjust if you prefer [Y]."*

**Question 1** — header: `Scope`, multiSelect: false
> "What is the scope of this Data Protection Impact Assessment?"

- **Full system (Recommended)**: Assess all personal data processing across the entire system — required for new systems or major changes
- **Specific feature or module**: Assess a single feature that introduces new personal data processing (e.g., a new AI profiling feature)
- **Specific data flow**: Assess a particular data flow involving personal data (e.g., third-party data sharing, international transfer)

**Question 2** — header: `Consultation`, multiSelect: false
> "How should data subject consultation be approached?"

- **Surveys (Recommended)**: Online questionnaires to affected user groups — scalable and documented
- **Interviews**: One-on-one or small group discussions — deeper insights for sensitive processing
- **Workshops**: Facilitated sessions with representative data subjects — collaborative and thorough
- **Not applicable**: Data subjects cannot reasonably be consulted (e.g., law enforcement, national security)

Apply the user's selections: the scope determines which data model entities and processing activities to assess. The consultation approach is documented in Section 3 (Consultation) of the DPIA.

### Step 1: Identify or Create Project

Identify the target project from the hook context. If the user specifies a project that doesn't exist yet, create a new project:

1. Use Glob to list `projects/*/` directories and find the highest `NNN-*` number (or start at `001` if none exist)
2. Calculate the next number (zero-padded to 3 digits, e.g., `002`)
3. Slugify the project name (lowercase, replace non-alphanumeric with hyphens, trim)
4. Use the Write tool to create `projects/{NNN}-{slug}/README.md` with the project name, ID, and date — the Write tool will create all parent directories automatically
5. Also create `projects/{NNN}-{slug}/external/README.md` with a note to place external reference documents here
6. Set `PROJECT_ID` = the 3-digit number, `PROJECT_PATH` = the new directory path

### Step 2: Read Source Artifacts

Read all documents listed in Step 0 above. Use the extracted information for auto-population of the DPIA template.

### Step 3: DPIA Template Reading

Read the DPIA template:

**Read the template** (with user override support):

- **First**, check if `.arckit/templates/dpia-template.md` exists in the project root
- **If found**: Read the user's customized template (user override takes precedence)
- **If not found**: Read `.arckit/templates/dpia-template.md` (default)

> **Tip**: Users can customize templates with `$arckit-customize dpia`

This template has 16 major sections and uses the ICO's 9-criteria screening checklist.

### Step 4: ICO 9-Criteria Screening (Automated)

Based on the data model analysis, automatically score the ICO 9 criteria:

| # | Criterion | Scoring Logic |
|---|-----------|---------------|
| 1 | **Evaluation or scoring** | YES if: AI/ML features mentioned, profiling/scoring in requirements |
| 2 | **Automated decision-making** | YES if: Automated decisions with legal/significant effect in requirements |
| 3 | **Systematic monitoring** | YES if: Continuous tracking, surveillance, monitoring in requirements |
| 4 | **Sensitive data** | YES if: ANY special category data (Article 9) in data model |
| 5 | **Large scale** | YES if: >5000 data subjects mentioned OR "national" scope OR "all citizens" |
| 6 | **Matching datasets** | YES if: Multiple data sources/integrations in data flows |
| 7 | **Vulnerable subjects** | YES if: Children, elderly, disabled, patients identified in stakeholders |
| 8 | **Innovative technology** | YES if: AI/ML, blockchain, biometrics, new tech mentioned |
| 9 | **Prevents rights exercise** | YES if: No mechanism for SAR/deletion/portability in data model |

**DPIA Decision Rules**:

- **2+ criteria met**: DPIA REQUIRED (UK GDPR Article 35)
- **1 criterion met**: DPIA RECOMMENDED (good practice)
- **0 criteria met**: DPIA NOT REQUIRED (but consider Data Privacy Notice)

Show the screening results to the user:

```text
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📋 DPIA Screening Results (ICO 9 Criteria)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

[X] Criterion 4: Sensitive data (Special category data found: Health, Ethnicity)
[X] Criterion 7: Vulnerable subjects (Children identified in stakeholders)
[ ] Criterion 1: Evaluation/scoring (Not detected)
... [continue for all 9 criteria]

**Screening Score**: 2/9 criteria met
**Decision**: ✅ DPIA REQUIRED under UK GDPR Article 35

Proceeding to generate full DPIA...
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```

If the screening shows DPIA NOT REQUIRED, ask the user if they want to proceed anyway (they may want to conduct a DPIA for good practice or to demonstrate accountability).

### Step 5: Generate DPIA Document

**CRITICAL**: Use the **Write tool** to create the DPIA document. DPIAs are typically 3,000-10,000 words and will exceed the 32K token output limit if you try to output the full document in the chat.

Generate the DPIA by:

1. **Detect version**: Before generating the document ID, check if a previous version exists:
   - Look for existing `ARC-{PROJECT_ID}-DPIA-v*.md` files in the project directory
   - **If no existing file**: Use VERSION="1.0"
   - **If existing file found**:
     - Read the existing document to understand its scope
     - Compare against current data model and requirements
     - **Minor increment** (e.g., 1.0 → 1.1): Scope unchanged — refreshed risk scores, updated mitigations, corrected details
     - **Major increment** (e.g., 1.0 → 2.0): Scope materially changed — new data categories, new processing purposes, fundamentally different risk landscape
   - For v1.1+/v2.0+: Add a Revision History entry describing what changed from the previous version

2. **Auto-populate Document Control**:
   - **Document ID**: `ARC-{PROJECT_ID}-DPIA-v{VERSION}` (e.g., `ARC-001-DPIA-v1.0`)
   - Version: ${VERSION}
   - Status: DRAFT
   - Date Created: {current_date}
   - Assessment Date: {current_date}
   - Next Review Date: {current_date + 12 months}
   - Classification: OFFICIAL-SENSITIVE

3. **Section 1: Need for DPIA**:
   - Copy screening results from Step 4
   - List all criteria that were met with evidence from data model

4. **Section 2: Description of Processing**:
   - Project Context: Summarize from user input and requirements
   - Processing Purposes: Extract from DR-xxx requirements and data model "Purpose of Processing" fields
   - Nature of Processing: Describe collection, storage, use, disclosure, deletion
   - Scope of Processing: Data subjects from stakeholder analysis, geographic scope
   - Data Categories: List all PII and special category data from data model entities
   - Data Sources: Extract from data model "Data Flow Sources"
   - Data Destinations: Extract from data model "Data Flow Destinations"
   - Retention Periods: Extract from data model retention policies

5. **Section 3: Consultation**:
   - Internal Stakeholders: Extract from stakeholder RACI (Data Controller, DPO, IT Security)
   - External Stakeholders: Data subjects consultation plans (surveys, focus groups)
   - Data Processors: List any third-party processors from integration requirements

6. **Section 4: Necessity and Proportionality**:
   - Lawful Basis: Extract GDPR Article 6 basis from each entity in data model
   - Special Category Conditions: Extract GDPR Article 9 conditions from data model
   - Necessity Test: For each processing purpose, justify why it's necessary
   - Proportionality Test: Assess if data collection is proportionate to purpose
   - Data Minimization: Review against architecture principles for minimization

7. **Section 5: Risk Assessment**:
   - For EACH entity with PII/special category data, identify risks to individuals:
     - Confidentiality risks (data breach, unauthorized access)
     - Integrity risks (data corruption, inaccurate profiling)
     - Availability risks (inability to access/port data)
   - Score each risk using DPIA risk matrix:
     - **Likelihood**: Remote, Possible, Probable
     - **Severity** (impact on individuals): Minimal, Significant, Severe
     - **Overall Risk**: Low (green), Medium (amber), High (red)
   - Link to existing risks in ARC-*-RISK-*.md if they exist

8. **Section 6: Mitigations**:
   - For each high/medium risk, propose mitigations:
     - Technical: Encryption, pseudonymization, access controls (link to secure-by-design controls)
     - Organizational: Policies, training, DPIAs for suppliers
     - Procedural: Breach notification, incident response, audit trails
   - Show residual risk after mitigations
   - Extract existing security controls from ARC-*-SECD-*.md as mitigations

9. **Section 7: ICO Consultation**:
   - If any residual risks remain HIGH after mitigations, flag for ICO prior consultation:

     ```text
     ⚠️  ICO Prior Consultation Required:
     - Risk DPIA-003 (Unauthorized profiling of children) remains HIGH after mitigations
     - Contact ICO before processing: https://ico.org.uk/make-a-complaint/your-personal-information-concerns/
     ```

10. **Section 8: Sign-off and Approval**:

- Leave signature fields blank (to be signed by Data Controller, DPO, Senior Responsible Owner)

11. **Section 9: Review and Monitoring**:
    - Set review triggers: 12 months, major system changes, data breaches, ICO guidance updates

12. **Section 10: Traceability**:
    - Link to all source artifacts (ARC-*-DATA-*.md, ARC-*-REQ-*.md, ARC-*-STKE-*.md, ARC-000-PRIN-*.md, ARC-*-RISK-*.md)
    - List all DPIA risks with unique IDs (DPIA-001, DPIA-002, etc.)

13. **Section 11: Data Subject Rights**:
    - For each GDPR right (SAR, rectification, erasure, portability, objection, restriction, automated decision-making):
      - Check if data model has implementation mechanism
      - If YES, describe how it's implemented
      - If NO, flag as a risk and recommend implementation

14. **Section 12: International Transfers**:
    - Check if data model shows any international destinations
    - If YES, identify safeguards (SCCs, BCRs, adequacy decisions)
    - If NO safeguards, flag as HIGH risk

15. **Section 13: Children's Data**:
    - If children identified in stakeholders, generate detailed assessment:
      - Age verification mechanisms
      - Parental consent
      - Child-friendly privacy notices
      - Best interests assessment

16. **Section 14: AI/Algorithmic Processing**:
    - If AI/ML detected in requirements, integrate with ai-playbook assessment:
      - Algorithmic bias risks
      - Explainability/transparency
      - Human oversight
      - Link to ATRS record if it exists

17. **Section 15: Summary and Action Plan**:
    - Summary table: Total risks, high/medium/low breakdown, key mitigations, ICO consultation needed?
    - Action plan: List all recommendations with owners and deadlines

Before writing the file, read `.arckit/references/quality-checklist.md` and verify all **Common Checks** plus the **DPIA** per-type checks pass. Fix any failures before proceeding.

Write the complete DPIA document to:

```text
projects/{project_id}/ARC-{PROJECT_ID}-DPIA-v${VERSION}.md
```

### Step 6: Risk Register Integration (Optional)

Ask the user:

```text
📊 DPIA generated with [N] risks identified.

Would you like to add DPIA risks to the project risk register?
This will create/update: projects/{project_id}/ARC-*-RISK-*.md

[Y/N]
```

If YES:

1. Read `projects/{project_id}/ARC-*-RISK-*.md` (or create from template if it doesn't exist)
2. Add each DPIA risk as a new entry with:
   - Risk ID: DPIA-001, DPIA-002, etc.
   - Category: "Data Protection"
   - Source: "DPIA Assessment"
   - Link back to DPIA document
3. Update the risk register file

### Step 7: Summary Output

**IMPORTANT**: Do NOT output the full DPIA document to the chat (it's too large). Instead, show a concise summary:

```text
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✅ DPIA Generated Successfully
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

📄 Document: projects/{project_id}/ARC-{PROJECT_ID}-DPIA-v{VERSION}.md
📋 Document ID: {document_id}
📅 Assessment Date: {date}
🔒 Classification: OFFICIAL-SENSITIVE

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📊 Assessment Summary
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

**ICO Screening**: {N}/9 criteria met → DPIA REQUIRED

**Processing Overview**:
- Data Subjects: {list data subject categories}
- Personal Data: {N} entities with PII
- Special Category Data: {YES/NO} ({categories if yes})
- Lawful Basis: {primary Article 6 basis}
- Retention Period: {typical retention}

**Risk Assessment**:
- Total Risks Identified: {N}
  - 🔴 High: {N} (requires immediate action)
  - 🟠 Medium: {N} (requires mitigation)
  - 🟢 Low: {N} (accepted)

**Key Risks**:
1. DPIA-001: {risk description} - {severity}
2. DPIA-002: {risk description} - {severity}
3. DPIA-003: {risk description} - {severity}

**Mitigations Proposed**: {N} technical, organizational, and procedural controls

**ICO Prior Consultation**: {REQUIRED / NOT REQUIRED}
{If required: List residual high risks that trigger consultation}

**Data Subject Rights**:
- ✅ Implemented: {list rights with mechanisms}
- ❌ Not Implemented: {list rights needing implementation}

**Next Steps**:
1. Review and approve DPIA (Data Controller, DPO, SRO signatures)
2. {If ICO consultation needed: Contact ICO before processing}
3. Implement recommended mitigations
4. Establish 12-month review cycle
5. {If children's data: Implement age verification and parental consent}

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔗 Traceability
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

**Source Artifacts**:
- ✅ Data Model: projects/{project_id}/ARC-*-DATA-*.md
- ✅ Requirements: projects/{project_id}/ARC-*-REQ-*.md
- ✅ Stakeholders: projects/{project_id}/ARC-*-STKE-*.md
- ✅ Architecture Principles: projects/000-global/ARC-000-PRIN-*.md

**Related Artifacts**:
- Risk Register: projects/{project_id}/ARC-*-RISK-*.md ({added/updated})
- Secure by Design: projects/{project_id}/ARC-*-SECD-*.md
- {If AI: AI Playbook: projects/{project_id}/ARC-*-AIPB-*.md}
- {If AI: ATRS: projects/{project_id}/ARC-*-ATRS-*.md}

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📚 References
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

- UK GDPR Article 35: https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/accountability-and-governance/data-protection-impact-assessments-dpias/
- ICO DPIA Guidance: https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/accountability-and-governance/data-protection-impact-assessments-dpias/what-is-a-dpia/
- ICO Prior Consultation: https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/accountability-and-governance/data-protection-impact-assessments-dpias/do-we-need-to-consult-the-ico/

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```

## Important Notes

2. **Legal Requirement**: A DPIA is **mandatory** under UK GDPR Article 35 when processing is likely to result in high risk to individuals. Failure to conduct a DPIA when required can result in ICO enforcement action.

3. **Use Write Tool**: DPIAs are large documents (typically 3,000-10,000 words). You MUST use the Write tool to create the file. Do NOT output the full DPIA in the chat.

4. **Risk Assessment Focus**: DPIA risks focus on **impact on individuals** (privacy harm, discrimination, physical harm, financial loss, reputational damage), NOT organizational risks. This is different from the risk register.

5. **Screening is Critical**: Always perform the ICO 9-criteria screening first. If the screening shows DPIA not required, don't generate a full DPIA unless the user explicitly requests it.

6. **Data Model Dependency**: A DPIA cannot be generated without a data model. The data model is the source of truth for what personal data is being processed.

7. **Bidirectional Risk Links**: DPIA risks should be added to the risk register (with "Data Protection" category), and existing privacy risks in the risk register should be referenced in the DPIA.

8. **Mitigation Sources**: Extract security controls from the Secure by Design assessment as DPIA mitigations. This creates traceability from risks → mitigations → security controls.

9. **ICO Consultation Threshold**: If ANY residual risk remains HIGH after mitigations, ICO prior consultation is required before processing can begin.

10. **Children's Data**: If processing children's data, the DPIA must include additional assessment of age verification, parental consent, best interests, and child-friendly privacy notices.

11. **AI/ML Systems**: If the system uses AI/ML for profiling, automated decision-making, or algorithmic processing, integrate with `$arckit-ai-playbook` assessment and link to ATRS record.

12. **Classification**: DPIAs contain sensitive information about data protection risks and vulnerabilities. Always classify as **OFFICIAL-SENSITIVE** at minimum.

13. **Review Cycle**: DPIAs must be reviewed regularly (recommended: 12 months) and updated when:
    - New processing activities are added
    - Data protection risks change
    - ICO guidance is updated
    - A data breach occurs

- **Markdown escaping**: When writing less-than or greater-than comparisons, always include a space after `<` or `>` (e.g., `< 3 seconds`, `> 99.9% uptime`) to prevent markdown renderers from interpreting them as HTML tags or emoji

## Success Criteria

- ✅ DPIA document created at `projects/{project_id}/ARC-{PROJECT_ID}-DPIA-v${VERSION}.md`
- ✅ ICO 9-criteria screening performed and documented
- ✅ All personal data and special category data from data model included
- ✅ Processing purposes extracted from requirements
- ✅ Data subjects and vulnerable groups identified from stakeholders
- ✅ Risk assessment completed with likelihood, severity, and overall risk scores
- ✅ Mitigations proposed for all high and medium risks
- ✅ ICO prior consultation flagged if residual high risks remain
- ✅ Data subject rights implementation assessed (SAR, deletion, portability, etc.)
- ✅ International transfer safeguards identified if applicable
- ✅ Children's data assessment completed if applicable
- ✅ AI/algorithmic processing assessment completed if applicable
- ✅ Traceability links to data model, requirements, stakeholders, principles, risk register
- ✅ Summary output shows key metrics, risks, and next steps
- ✅ Document classified as OFFICIAL-SENSITIVE
- ✅ 12-month review cycle established

## Example Usage

```text
$arckit-dpia Generate DPIA for NHS appointment booking system

$arckit-dpia Create data protection impact assessment for HMRC chatbot handling taxpayer queries

$arckit-dpia Assess DPIA necessity for Windows 11 deployment (employee data only)
```
More from tractorjuice/arc-kit