arckit-ai-playbook

$npx mdskill add tractorjuice/arc-kit/arckit-ai-playbook

Audit AI systems against UK Government Playbook standards.

  • Evaluates risk levels for health, safety, and rights impacts.
  • Scans projects for ARC artifacts and global policy documents.
  • Maps decision authority to required oversight protocols.
  • Outputs structured compliance reports with actionable gaps.
SKILL.md
.github/skills/arckit-ai-playbookView on GitHub ↗
---
name: arckit-ai-playbook
description: "Assess UK Government AI Playbook compliance for responsible AI deployment"
---

You are helping a UK government organization assess compliance with the UK Government AI Playbook for responsible AI deployment.

## User Input

```text
$ARGUMENTS
```

## Instructions

> **Note**: Before generating, scan `projects/` for existing project directories. For each project, list all `ARC-*.md` artifacts, check `external/` for reference documents, and check `000-global/` for cross-project policies. If no external docs exist but they would improve output, ask the user.

1. **Identify AI system context**:
   - AI system name and purpose
   - Type of AI (Generative, Predictive, Computer Vision, NLP, etc.)
   - Use case in government operations
   - Users (internal staff, citizens, affected population)
   - Decision authority level

2. **Determine risk level**:

**HIGH-RISK AI** (requires strictest oversight):

- Fully automated decisions affecting:
  - Health and safety
  - Fundamental rights
  - Access to services
  - Legal status
  - Employment
  - Financial circumstances
- Examples: Benefit eligibility, immigration decisions, medical diagnosis, predictive policing

**MEDIUM-RISK AI** (significant impact with human oversight):

- Semi-automated decisions with human review
- Significant resource allocation
- Examples: Case prioritization, fraud detection scoring, resource allocation

**LOW-RISK AI** (productivity/administrative):

- Recommendation systems with human control
- Administrative automation
- Examples: Email categorization, meeting scheduling, document summarization

3. **Read existing artifacts from the project context:**

   **MANDATORY** (warn if missing):
   - **PRIN** (Architecture Principles, in 000-global)
     - Extract: AI/ML governance standards, technology constraints, compliance requirements
     - If missing: warn user to run `$arckit-principles` first
   - **REQ** (Requirements)
     - Extract: AI/ML-related FR requirements, NFR (security, compliance, fairness), DR (data requirements)
     - If missing: warn user to run `$arckit-requirements` first

   **RECOMMENDED** (read if available, note if missing):
   - **DATA** (Data Model)
     - Extract: Training data sources, personal data, special category data, data quality
   - **RISK** (Risk Register)
     - Extract: AI-specific risks, bias risks, security risks, mitigation strategies

   **OPTIONAL** (read if available, skip silently if missing):
   - **STKE** (Stakeholder Analysis)
     - Extract: Affected populations, decision authority, accountability
   - **DPIA** (Data Protection Impact Assessment)
     - Extract: Data protection context, lawful basis, privacy risks

   **Read the template** (with user override support):
   - **First**, check if `.arckit/templates/uk-gov-ai-playbook-template.md` exists in the project root
   - **If found**: Read the user's customized template (user override takes precedence)
   - **If not found**: Read `.arckit/templates/uk-gov-ai-playbook-template.md` (default)

   > **Tip**: Users can customize templates with `$arckit-customize ai-playbook`

4. **Read external documents and policies**:
   - Read any **external documents** listed in the project context (`external/` files) — extract AI ethics policies, model cards, algorithmic impact assessments, bias testing results
   - Read any **global policies** listed in the project context (`000-global/policies/`) — extract AI governance framework, approved AI/ML platforms, responsible AI guidelines
   - Read any **enterprise standards** in `projects/000-global/external/` — extract enterprise AI strategy, responsible AI frameworks, cross-project AI maturity assessments
   - If no external docs exist but they would improve the output, ask: "Do you have any AI governance policies, model cards, or ethical AI assessments? I can read PDFs directly. Place them in `projects/{project-dir}/external/` and re-run, or skip."
   - **Citation traceability**: When referencing content from external documents, follow the citation instructions in `.arckit/references/citation-instructions.md`. Place inline citation markers (e.g., `[PP-C1]`) next to findings informed by source documents and populate the "External References" section in the template.

5. **Assess the 10 Core Principles**:

### Principle 1: Understanding AI

- Team understands AI limitations (no reasoning, contextual awareness)
- Realistic expectations (hallucinations, biases, edge cases)
- Appropriate use case for AI capabilities

### Principle 2: Lawful and Ethical Use

- **CRITICAL**: DPIA, EqIA, Human Rights assessment completed
- UK GDPR compliance
- Equality Act 2010 compliance
- Data Ethics Framework applied
- Legal/compliance team engaged early

### Principle 3: Security

- Cyber security assessment (NCSC guidance)
- AI-specific threats assessed:
  - Prompt injection
  - Data poisoning
  - Model theft
  - Adversarial attacks
  - Model inversion
- Security controls implemented
- Red teaming conducted (for high-risk)

### Principle 4: Human Control

- **CRITICAL for HIGH-RISK**: Human-in-the-loop required
- Human override capability
- Escalation process documented
- Staff trained on AI limitations
- Clear responsibilities assigned

**Human Oversight Models**:

- **Human-in-the-loop**: Review EVERY decision (required for high-risk)
- **Human-on-the-loop**: Periodic/random review
- **Human-in-command**: Can override at any time
- **Fully automated**: AI acts autonomously (HIGH-RISK - justify!)

### Principle 5: Lifecycle Management

- Lifecycle plan documented (selection → decommissioning)
- Model versioning and change management
- Monitoring and performance tracking
- Model drift detection
- Retraining schedule
- Decommissioning plan

### Principle 6: Right Tool Selection

- Problem clearly defined
- Alternatives considered (non-AI, simpler solutions)
- Cost-benefit analysis
- AI adds genuine value
- Success metrics defined
- NOT using AI just because it's trendy

### Principle 7: Collaboration

- Cross-government collaboration (GDS, CDDO, AI Standards Hub)
- Academia, industry, civil society engagement
- Knowledge sharing
- Contributing to government AI community

### Principle 8: Commercial Partnership

- Procurement team engaged early
- Contract includes AI-specific terms:
  - Performance metrics and SLAs
  - Explainability requirements
  - Bias audits
  - Data rights and ownership
  - Exit strategy (data portability)
  - Liability for AI failures

### Principle 9: Skills and Expertise

- Team composition verified:
  - AI/ML technical expertise
  - Data science
  - Ethical AI expertise
  - Domain expertise
  - User research
  - Legal/compliance
  - Cyber security
- Training provided on AI fundamentals, ethics, bias

### Principle 10: Organizational Alignment

- AI Governance Board approval
- AI strategy alignment
- Senior Responsible Owner (SRO) assigned
- Assurance team engaged
- Risk management process followed

6. **Assess the 6 Ethical Themes**:

### Theme 1: Safety, Security, and Robustness

- Safety testing (no harmful outputs)
- Robustness testing (edge cases)
- Fail-safe mechanisms
- Incident response plan

### Theme 2: Transparency and Explainability

- **MANDATORY**: Algorithmic Transparency Recording Standard (ATRS) published
- System documented publicly (where appropriate)
- Decision explanations available to affected persons
- Model card/factsheet published

### Theme 3: Fairness, Bias, and Discrimination

- Bias assessment completed
- Training data reviewed for bias
- Fairness metrics calculated across protected characteristics:
  - Gender
  - Ethnicity
  - Age
  - Disability
  - Religion
  - Sexual orientation
- Bias mitigation techniques applied
- Ongoing monitoring for bias drift

### Theme 4: Accountability and Responsibility

- Clear ownership (SRO, Product Owner)
- Decision-making process documented
- Audit trail of all AI decisions
- Incident response procedures
- Accountability for errors defined

### Theme 5: Contestability and Redress

- Right to contest AI decisions enabled
- Human review process for contested decisions
- Appeal mechanism documented
- Redress process for those harmed
- Response times defined (e.g., 28 days)

### Theme 6: Societal Wellbeing and Public Good

- Positive societal impact assessment
- Environmental impact considered (carbon footprint)
- Benefits distributed fairly
- Negative impacts mitigated
- Alignment with public values

7. **Generate comprehensive assessment**:

Create detailed report with:

**Executive Summary**:

- Overall score (X/160 points, Y%)
- Risk level (High/Medium/Low)
- Compliance status (Excellent/Good/Adequate/Poor)
- Critical issues
- Go/No-Go decision

**10 Principles Assessment** (each 0-10):

- Compliance status (✅/⚠️/❌)
- Evidence gathered
- Findings
- Gaps
- Score

**6 Ethical Themes Assessment** (each 0-10):

- Compliance status
- Evidence
- Findings
- Gaps
- Score

**Risk-Based Decision**:

- **HIGH-RISK**: MUST score ≥90%, ALL principles met, human-in-the-loop REQUIRED
- **MEDIUM-RISK**: SHOULD score ≥75%, critical principles met
- **LOW-RISK**: SHOULD score ≥60%, basic safeguards in place

**Mandatory Documentation Checklist**:

- [ ] ATRS (Algorithmic Transparency Recording Standard)
- [ ] DPIA (Data Protection Impact Assessment)
- [ ] EqIA (Equality Impact Assessment)
- [ ] Human Rights Assessment
- [ ] Security Risk Assessment
- [ ] Bias Audit Report
- [ ] User Research Report

**Action Plan**:

- High priority (before deployment)
- Medium priority (within 3 months)
- Low priority (continuous improvement)

8. **Map to existing ArcKit artifacts**:

**Link to Requirements**:

- Principle 2 (Lawful) → NFR-C-xxx (GDPR compliance requirements)
- Principle 3 (Security) → NFR-S-xxx (security requirements)
- Principle 4 (Human Control) → FR-xxx (human review features)
- Theme 3 (Fairness) → NFR-E-xxx (equity/fairness requirements)

**Link to Design Reviews**:

- Check HLD addresses AI Playbook principles
- Verify DLD includes human oversight mechanisms
- Ensure security controls for AI-specific threats

**Link to TCoP**:

- AI Playbook complements TCoP
- TCoP Point 6 (Secure) aligns with Principle 3
- TCoP Point 7 (Privacy) aligns with Principle 2

9. **Provide risk-appropriate guidance**:

**For HIGH-RISK AI systems**:

- **STOP**: Do NOT deploy without meeting ALL principles
- Human-in-the-loop MANDATORY (review every decision)
- ATRS publication MANDATORY
- DPIA, EqIA, Human Rights assessments MANDATORY
- Quarterly audits REQUIRED
- AI Governance Board approval REQUIRED
- Senior leadership sign-off REQUIRED

**For MEDIUM-RISK AI**:

- Strong human oversight required
- Critical principles must be met (2, 3, 4)
- ATRS recommended
- DPIA likely required
- Annual audits

**For LOW-RISK AI**:

- Basic safeguards sufficient
- Human oversight recommended
- Periodic review (annual)
- Continuous improvement mindset

10. **Highlight mandatory requirements**:

**ATRS (Algorithmic Transparency Recording Standard)**:

- MANDATORY for central government departments
- MANDATORY for arm's length bodies
- Publish on department website
- Update when system changes significantly

**DPIAs (Data Protection Impact Assessments)**:

- MANDATORY for AI processing personal data
- Must be completed BEFORE deployment
- Must be reviewed and updated regularly

**Equality Impact Assessments (EqIA)**:

- MANDATORY to assess impact on protected characteristics
- Must document how discrimination is prevented

**Human Rights Assessments**:

- MANDATORY for decisions affecting rights
- Must consider ECHR (European Convention on Human Rights)
- Document how rights are protected

---

**CRITICAL - Auto-Populate Document Control Fields**:

Before completing the document, populate ALL document control fields in the header:

**Construct Document ID**:

- **Document ID**: `ARC-{PROJECT_ID}-AIPB-v{VERSION}` (e.g., `ARC-001-AIPB-v1.0`)

**Populate Required Fields**:

*Auto-populated fields* (populate these automatically):

- `[PROJECT_ID]` → Extract from project path (e.g., "001" from "projects/001-project-name")
- `[VERSION]` → "1.0" (or increment if previous version exists)
- `[DATE]` / `[YYYY-MM-DD]` → Current date in YYYY-MM-DD format
- `[DOCUMENT_TYPE_NAME]` → "UK Government AI Playbook Assessment"
- `ARC-[PROJECT_ID]-AIPB-v[VERSION]` → Construct using format above
- `[COMMAND]` → "arckit.ai-playbook"

*User-provided fields* (extract from project metadata or user input):

- `[PROJECT_NAME]` → Full project name from project metadata or user input
- `[OWNER_NAME_AND_ROLE]` → Document owner (prompt user if not in metadata)
- `[CLASSIFICATION]` → Default to "OFFICIAL" for UK Gov, "PUBLIC" otherwise (or prompt user)

*Calculated fields*:

- `[YYYY-MM-DD]` for Review Date → Current date + 30 days

*Pending fields* (leave as [PENDING] until manually updated):

- `[REVIEWER_NAME]` → [PENDING]
- `[APPROVER_NAME]` → [PENDING]
- `[DISTRIBUTION_LIST]` → Default to "Project Team, Architecture Team" or [PENDING]

**Populate Revision History**:

```markdown
| 1.0 | {DATE} | ArcKit AI | Initial creation from `$arckit-ai-playbook` command | [PENDING] | [PENDING] |
```

**Populate Generation Metadata Footer**:

The footer should be populated with:

```markdown
**Generated by**: ArcKit `$arckit-ai-playbook` command
**Generated on**: {DATE} {TIME} GMT
**ArcKit Version**: {ARCKIT_VERSION}
**Project**: {PROJECT_NAME} (Project {PROJECT_ID})
**AI Model**: [Use actual model name, e.g., "claude-sonnet-4-5-20250929"]
**Generation Context**: [Brief note about source documents used]
```

---

Before writing the file, read `.arckit/references/quality-checklist.md` and verify all **Common Checks** plus the **AIPB** per-type checks pass. Fix any failures before proceeding.

11. **Write comprehensive output**:

Output location: `projects/{project-dir}/ARC-{PROJECT_ID}-AIPB-v1.0.md`

Use template structure from `uk-gov-ai-playbook-template.md`

12. **Provide next steps**:

After assessment:

- Summary of compliance level
- Critical blocking issues
- Recommended actions with priorities
- Timeline for remediation
- Next review date

## Example Usage

User: `$arckit-ai-playbook Assess AI Playbook compliance for benefits eligibility chatbot using GPT-4`

You should:

- Identify system: Benefits eligibility chatbot, Generative AI (LLM)
- Determine risk: **HIGH-RISK** (affects access to benefits - fundamental right)
- Assess 10 principles:
  - 1. Understanding AI: ⚠️ PARTIAL - team aware of hallucinations, but risk of false advice
  - 2. Lawful/Ethical: ❌ NON-COMPLIANT - DPIA not yet completed (BLOCKING)
  - 3. Security: ✅ COMPLIANT - prompt injection defenses, content filtering
  - 4. Human Control: ❌ NON-COMPLIANT - fully automated advice (BLOCKING for high-risk!)
  - 5. Lifecycle: ✅ COMPLIANT - monitoring, retraining schedule defined
  - 6. Right Tool: ⚠️ PARTIAL - AI appropriate but alternatives not fully explored
  - 7. Collaboration: ✅ COMPLIANT - engaged with GDS, DWP
  - 8. Commercial: ✅ COMPLIANT - OpenAI contract includes audit rights
  - 9. Skills: ✅ COMPLIANT - multidisciplinary team
  - 10. Organizational: ✅ COMPLIANT - SRO assigned, governance in place
- Assess 6 ethical themes:
  - 1. Safety: ⚠️ PARTIAL - content filtering but some harmful outputs in testing
  - 2. Transparency: ❌ NON-COMPLIANT - ATRS not yet published (MANDATORY)
  - 3. Fairness: ⚠️ PARTIAL - bias testing started, gaps in demographic coverage
  - 4. Accountability: ✅ COMPLIANT - clear ownership, audit trail
  - 5. Contestability: ❌ NON-COMPLIANT - no human review process (BLOCKING)
  - 6. Societal: ✅ COMPLIANT - improves access to benefits advice
- Calculate score: 92/160 (58%) - **POOR, NON-COMPLIANT**
- **CRITICAL ISSUES**:
  - **BLOCKING-01**: No DPIA completed (legal requirement)
  - **BLOCKING-02**: Fully automated advice (high-risk requires human-in-the-loop)
  - **BLOCKING-03**: No ATRS published (mandatory for central government)
  - **BLOCKING-04**: No contestability mechanism (right to human review)
- **DECISION**: ❌ **REJECTED - DO NOT DEPLOY**
- **Remediation required**:
  1. Complete DPIA immediately
  2. Implement human-in-the-loop (review all advice before shown to citizens)
  3. Publish ATRS
  4. Create contestability process
  5. Re-assess after remediation
- Write to `projects/NNN-benefits-chatbot/ARC-NNN-AIPB-v1.0.md`
- **Summary**: "HIGH-RISK AI system with 4 blocking issues. Cannot deploy until ALL principles met."

## Important Notes

- AI Playbook is **MANDATORY** guidance for all UK government AI systems
- HIGH-RISK AI cannot deploy without meeting ALL principles
- ATRS publication is MANDATORY for central government
- DPIAs are MANDATORY for AI processing personal data
- Human oversight is REQUIRED for high-risk decisions
- Non-compliance can result in legal challenges, ICO fines, public backlash
- "Move fast and break things" does NOT apply to government AI
- When in doubt, err on side of caution (add more safeguards)

- **Markdown escaping**: When writing less-than or greater-than comparisons, always include a space after `<` or `>` (e.g., `< 3 seconds`, `> 99.9% uptime`) to prevent markdown renderers from interpreting them as HTML tags or emoji

## Related Frameworks

- **Technology Code of Practice** (TCoP) - broader technology governance
- **Data Ethics Framework** - responsible data use
- **Service Standard** - service design and delivery
- **NCSC Guidance** - cyber security for AI systems
- **ICO AI Guidance** - data protection and AI

## Resources

- AI Playbook: https://www.gov.uk/government/publications/ai-playbook-for-the-uk-government
- ATRS: https://www.gov.uk/government/publications/guidance-for-organisations-using-the-algorithmic-transparency-recording-standard
- Data Ethics Framework: https://www.gov.uk/government/publications/data-ethics-framework
- ICO AI Guidance: https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/
More from tractorjuice/arc-kit