generate-thesaurus

$ npx mdskill add dandye/ai-runbooks/generate-thesaurus

Build structured vocabularies from content domains instantly.

  • Extracts preferred terms and hierarchical relationships from text.
  • Depends on file system access to scan documentation directories.
  • Uses context analysis to map broader, narrower, and related concepts.
  • Delivers results in markdown, YAML, JSON, or CSV formats.

SKILL.md

.github/skills/generate-thesaurus
---
name: generate-thesaurus
description: Generate a controlled-vocabulary thesaurus for a content domain. Creates a thesaurus with preferred terms and broader, narrower, and related terms.
required_roles:
  scribe: roles/scribe.editor
personas: [information-architect, technical-writer, content-strategist]
---

# Generate Thesaurus Skill

Generate a controlled vocabulary thesaurus for a specified content domain or directory. This skill analyzes content to identify key terms and structures them into a thesaurus with relationships (broader, narrower, related terms).

## Inputs

- `PATH` - The directory or file path to analyze (e.g., "/docs/security")
- `RECURSIVE` - (Optional) Boolean, whether to include subdirectories (default: true)
- `OUTPUT_FORMAT` - (Optional) Format of the output: "markdown", "yaml", "json", "csv" (default: "markdown")
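
As a minimal sketch, the inputs above could be modeled as follows. The `ThesaurusInputs` dataclass and `parse_inputs` helper are hypothetical names used for illustration, not part of the skill; they only encode the documented defaults and the four listed format values.

```python
from dataclasses import dataclass

# Output formats listed in the Inputs section.
ALLOWED_FORMATS = ("markdown", "yaml", "json", "csv")


@dataclass(frozen=True)
class ThesaurusInputs:
    """Hypothetical container for the skill's three inputs."""
    path: str
    recursive: bool = True            # documented default
    output_format: str = "markdown"   # documented default


def parse_inputs(raw: dict) -> ThesaurusInputs:
    """Apply the documented defaults and reject unsupported output formats."""
    if not raw.get("PATH"):
        raise ValueError("PATH is required")
    fmt = str(raw.get("OUTPUT_FORMAT", "markdown")).lower()
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported OUTPUT_FORMAT: {fmt!r}")
    return ThesaurusInputs(
        path=raw["PATH"],
        # Assumes RECURSIVE arrives as a real boolean, not a string.
        recursive=bool(raw.get("RECURSIVE", True)),
        output_format=fmt,
    )
```

Calling `parse_inputs({"PATH": "/docs/security"})` would yield a recursive scan with Markdown output, matching the defaults listed above.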

## Workflow

### Step 1: Content Analysis

Analyze the content at `PATH` to identify frequently used terms, concepts, and entities. This involves scanning documentation files (Markdown, Text, etc.) to extract potential vocabulary candidates.
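
The scan described in this step might look roughly like the sketch below. It is an assumption-laden illustration: the skill does not specify an extraction method, so the file suffixes, the capitalized-phrase regex, and the `min_count` threshold are all hypothetical choices, as are the function names.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed documentation file types; the skill only says "Markdown, Text, etc."
DOC_SUFFIXES = {".md", ".txt"}
# Crude heuristic: runs of capitalized words such as "Access Control".
TERM_PATTERN = re.compile(r"\b[A-Z][a-zA-Z-]+(?: [A-Z][a-zA-Z-]+)*\b")


def iter_doc_files(path: str, recursive: bool = True):
    """Yield documentation files at `path`, descending into subdirectories if asked."""
    root = Path(path)
    if root.is_file():
        yield root
        return
    pattern = "**/*" if recursive else "*"
    for candidate in sorted(root.glob(pattern)):
        if candidate.is_file() and candidate.suffix.lower() in DOC_SUFFIXES:
            yield candidate


def candidate_terms(path: str, recursive: bool = True, min_count: int = 2) -> Counter:
    """Count capitalized phrases and keep those seen at least `min_count` times."""
    counts = Counter()
    for doc in iter_doc_files(path, recursive):
        text = doc.read_text(encoding="utf-8", errors="ignore")
        counts.update(TERM_PATTERN.findall(text))
    return Counter({term: n for term, n in counts.items() if n >= min_count})
```

A frequency count like this only surfaces candidates. Sentence-initial words and proper nouns will slip through, so the result still needs the relationship mapping and review described in the next step.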

### Step 2: Term Extraction & Relationship Mapping

Identify relationships between terms based on context and standard taxonomies:
- **Preferred Terms**: The standard term to be used (e.g., "Multi-Factor Authentication" instead of "MFA").
- **Broader Terms**: More general concepts (e.g., "Access Control" is broader than "Authentication").
- **Narrower Terms**: More specific sub-concepts (e.g., "Biometrics" is narrower than "Authentication").
- **Related Terms**: Associative relationships (e.g., "Identity Management" is related to "Authentication").
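
The four relationship types above can be held in a small data structure. The sketch below is one possible shape, not the skill's own model: `Term` and `Thesaurus` are hypothetical names, and the choice to store reciprocal links automatically (a broader-term link also records the matching narrower-term link) is an assumption based on common thesaurus practice.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One preferred term and its relationships (hypothetical structure)."""
    preferred: str
    scope_note: str = ""
    use_for: set = field(default_factory=set)    # non-preferred synonyms, e.g. "MFA"
    broader: set = field(default_factory=set)
    narrower: set = field(default_factory=set)
    related: set = field(default_factory=set)


class Thesaurus:
    def __init__(self):
        self.terms = {}

    def term(self, name: str) -> Term:
        """Return the entry for `name`, creating it on first use."""
        return self.terms.setdefault(name, Term(preferred=name))

    def add_broader(self, narrow: str, broad: str) -> None:
        """Record `broad` as a broader term of `narrow`, plus the reciprocal link."""
        self.term(narrow).broader.add(broad)
        self.term(broad).narrower.add(narrow)

    def add_related(self, a: str, b: str) -> None:
        """Related-term links are stored symmetrically."""
        self.term(a).related.add(b)
        self.term(b).related.add(a)

    def add_synonym(self, preferred: str, synonym: str) -> None:
        """Point a non-preferred synonym at its preferred term."""
        self.term(preferred).use_for.add(synonym)


# The examples from the list above, expressed with this structure.
thesaurus = Thesaurus()
thesaurus.add_broader("Authentication", "Access Control")
thesaurus.add_broader("Biometrics", "Authentication")
thesaurus.add_related("Authentication", "Identity Management")
thesaurus.add_synonym("Multi-Factor Authentication", "MFA")
```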

### Step 3: Thesaurus Generation

Format the collected terms and relationships into the requested `OUTPUT_FORMAT`.

**Example Output (Markdown):**
```markdown
# Security Thesaurus

## Authentication
*   **Scope Note**: verification of the identity of a user, process, or device
*   **Broader Term**: Access Control
*   **Narrower Terms**: Multi-Factor Authentication, Single Sign-On
*   **Related Terms**: Authorization, Identity Management
```
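
To illustrate how one set of entries could be rendered into several of the requested formats, here is a sketch using only the Python standard library. The entry dictionary shape, the function names, and the `; ` separator for multi-valued CSV cells are assumptions. YAML is left out because the standard library has no YAML writer, so this sketch raises an error for it.

```python
import csv
import io
import json

# Hypothetical in-memory shape: one dict per preferred term.
ENTRIES = [
    {
        "term": "Authentication",
        "scope_note": "verification of the identity of a user, process, or device",
        "broader": ["Access Control"],
        "narrower": ["Multi-Factor Authentication", "Single Sign-On"],
        "related": ["Authorization", "Identity Management"],
    }
]


def to_markdown(title, entries):
    """Mirror the layout of the example Markdown output."""
    lines = [f"# {title}", ""]
    for e in entries:
        lines += [
            f"## {e['term']}",
            f"*   **Scope Note**: {e['scope_note']}",
            f"*   **Broader Term**: {', '.join(e['broader'])}",
            f"*   **Narrower Terms**: {', '.join(e['narrower'])}",
            f"*   **Related Terms**: {', '.join(e['related'])}",
            "",
        ]
    return "\n".join(lines).rstrip() + "\n"


def to_json(entries):
    return json.dumps(entries, indent=2)


def to_csv(entries):
    """One row per term; multi-valued fields are joined with '; '."""
    fields = ["term", "scope_note", "broader", "narrower", "related"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for e in entries:
        row = {k: "; ".join(v) if isinstance(v, list) else v for k, v in e.items()}
        writer.writerow(row)
    return buf.getvalue()


def render(entries, output_format="markdown", title="Security Thesaurus"):
    """Dispatch on OUTPUT_FORMAT; only three formats are covered in this sketch."""
    if output_format == "markdown":
        return to_markdown(title, entries)
    if output_format == "json":
        return to_json(entries)
    if output_format == "csv":
        return to_csv(entries)
    raise ValueError(f"no renderer in this sketch for {output_format!r}")
```

Keeping the entries format-neutral and converting only at the end means the same extraction result can be emitted in whichever `OUTPUT_FORMAT` is requested.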

## Required Outputs

A `THESAURUS_DOCUMENT` in the specified `OUTPUT_FORMAT` containing:
- List of terms
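- Relationships: broader term (BT), narrower term (NT), and related term (RT)
- Scope notes (definitions)
- Synonyms, recorded as "Use For" (UF) entries that point from non-preferred terms to the preferred term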
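
A generated thesaurus can be sanity-checked against this list before it is delivered. The sketch below is a hypothetical checker, not something the skill defines: it assumes entries keyed by preferred term with `scope_note`, `BT`, `NT`, and `RT` fields, and it treats reciprocal BT/NT links and symmetric RT links as expected, which is common thesaurus practice but is not stated in the skill itself.

```python
def validate(entries: dict) -> list:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    for name, e in entries.items():
        if not e.get("scope_note"):
            problems.append(f"{name}: missing scope note")
        for bt in e.get("BT", []):
            if name not in entries.get(bt, {}).get("NT", []):
                problems.append(f"{name}: BT {bt} lacks reciprocal NT")
        for nt in e.get("NT", []):
            if name not in entries.get(nt, {}).get("BT", []):
                problems.append(f"{name}: NT {nt} lacks reciprocal BT")
        for rt in e.get("RT", []):
            if name not in entries.get(rt, {}).get("RT", []):
                problems.append(f"{name}: RT {rt} is not symmetric")
    return problems


# Illustrative data only: the second entry is missing its scope note and NT link.
sample = {
    "Authentication": {
        "scope_note": "verification of the identity of a user, process, or device",
        "BT": ["Access Control"],
    },
    "Access Control": {"scope_note": "", "NT": []},
}
issues = validate(sample)
```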

## Quick Reference

- **Purpose**: Ensure consistent terminology and improve content findability.
- **Best Practice**: Include scope notes for ambiguous terms.
- **Standards**: Follows ISO 25964 (Thesauri and interoperability with other vocabularies) where applicable.
