graphrag-system-design

Name: graphrag-system-design
Author: lyndonkl/claude

$ npx mdskill add lyndonkl/claude/graphrag-system-design

Designs production-ready GraphRAG systems with custom technology stacks.

Solves complex retrieval problems requiring multi-hop reasoning and explainability.
Integrates Neo4j, LangChain, LlamaIndex, and vector stores.
Selects patterns and pipelines based on domain requirements and latency needs.
Delivers detailed specifications covering architecture, stack choices, and deployment.

SKILL.md

.github/skills/graphrag-system-design

---
name: graphrag-system-design
description: Designs complete GraphRAG systems integrating graph databases, vector stores, orchestration frameworks, and LLM reasoning. Guides through pattern selection, technology stack decisions, integration pipeline design, and domain-specific customizations. Use when designing GraphRAG systems, choosing technology stacks for graph-augmented retrieval, combining Neo4j with an LLM, using LangChain/LlamaIndex knowledge graphs, applying community detection for RAG, building hybrid symbol-vector pipelines, or deploying production or domain-specific GraphRAG.
---

## Table of Contents
- [Workflow](#workflow)
- [GraphRAG Pattern Selection](#graphrag-pattern-selection)
- [Integration Architecture](#integration-architecture)
- [Output Template](#output-template)

# GraphRAG System Design

## Workflow

**Copy this checklist** and work through each step:

```
GraphRAG System Design Progress:
- [ ] Step 1: Analyze domain requirements
- [ ] Step 2: Select GraphRAG pattern
- [ ] Step 3: Choose technology stack
- [ ] Step 4: Design integration pipeline
- [ ] Step 5: Apply domain customizations
- [ ] Step 6: Define deployment strategy
- [ ] Step 7: Produce specification
```

**Step 1: Analyze domain requirements**

Characterize the retrieval problem: query complexity (single-hop vs multi-hop), data volume and update frequency, compliance constraints, latency requirements, and explainability needs. Determine whether graph structure adds value over flat retrieval -- multi-hop reasoning, entity disambiguation, and relationship-aware context assembly are strong signals for GraphRAG. Define the user personas and query patterns the system must serve.
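
One way to make the outcome of this step explicit is to record the findings in a small structure and list which of the three signals named above are present. The sketch below is only illustrative: the `DomainRequirements` fields and the "any one signal is enough" rule are assumptions, not part of this skill.

```python
from dataclasses import dataclass


@dataclass
class DomainRequirements:
    """Hypothetical record of Step 1 findings; field names are illustrative."""
    multi_hop_queries: bool           # questions chain across several entities
    needs_disambiguation: bool        # one surface name can map to several entities
    relationship_aware_context: bool  # answers depend on how entities are linked
    explainability_required: bool     # answers must show where they came from
    p95_latency_ms: int               # latency budget for a single query


def graph_signals(req: DomainRequirements) -> list[str]:
    """Return the Step 1 signals that argue for graph structure."""
    signals = []
    if req.multi_hop_queries:
        signals.append("multi-hop reasoning")
    if req.needs_disambiguation:
        signals.append("entity disambiguation")
    if req.relationship_aware_context:
        signals.append("relationship-aware context assembly")
    return signals


def graph_adds_value(req: DomainRequirements) -> bool:
    """Assumed rule of thumb: any one signal justifies evaluating GraphRAG."""
    return len(graph_signals(req)) > 0
```

If `graph_signals` comes back empty, that is a hint that flat retrieval may be enough and the remaining steps may not be needed.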

**Step 2: Select GraphRAG pattern**

Choose the core retrieval architecture using the [GraphRAG Pattern Selection](#graphrag-pattern-selection) guide. Match your query patterns to the appropriate pattern: Hybrid Symbol-Vector for mixed structured/unstructured queries, Subgraph-on-Demand for focused context assembly, or Community-Based Global Summarization for broad thematic queries. For detailed pattern descriptions, see [Methodology Reference](./resources/methodology.md).

**Step 3: Choose technology stack**

Select components for each architectural layer: graph database, vector database, orchestration framework, and LLM provider. Use the [Technology Stacks Reference](./resources/technology-stacks.md) for component-by-component comparison. Key decisions: single-system vs multi-system hybrid, managed vs self-hosted, framework-based vs custom pipeline. Consider team expertise, budget constraints, and existing infrastructure.

**Step 4: Design integration pipeline**

Define the end-to-end data flow from ingestion through generation. The core pipeline stages: Ingest (raw data) -> Extract (entities and relations) -> Build KG (populate graph) -> Index (vector embeddings + graph indices) -> Retrieve (hybrid graph+vector search) -> Generate (LLM with graph-grounded context) -> Cite (provenance from graph paths). Design the query routing logic that determines when to use graph traversal, vector search, or both. See [Methodology Reference](./resources/methodology.md) for pipeline design considerations.
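
The stages above can be sketched end to end with in-memory stand-ins. This is a toy under stated assumptions, not a reference implementation: the corpus, the `subject relation object` text format, and every function name are hypothetical, and the Index (embeddings) and Generate (LLM call) stages are left out so the example stays self-contained.

```python
from collections import defaultdict, deque

# Toy corpus of (doc_id, text). Real ingestion would chunk documents first.
DOCS = [
    ("doc1", "Alice works_at Acme"),
    ("doc2", "Acme acquired Beta"),
    ("doc3", "Beta located_in Berlin"),
]


def extract(doc_id, text):
    """Stand-in extractor: assumes each text is exactly 'subject relation object'."""
    subj, rel, obj = text.split()
    return (subj, rel, obj, doc_id)


def build_kg(triples):
    """Adjacency list keyed by subject; each edge keeps its source document."""
    kg = defaultdict(list)
    for subj, rel, obj, doc_id in triples:
        kg[subj].append((rel, obj, doc_id))
    return kg


def retrieve_paths(kg, start, max_hops):
    """Breadth-first traversal returning every path of up to max_hops edges."""
    paths = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if len(path) == max_hops:
            continue  # depth limit reached, do not expand further
        for rel, obj, doc_id in kg.get(node, []):
            new_path = path + [(node, rel, obj, doc_id)]
            paths.append(new_path)
            queue.append((obj, new_path))
    return paths


def assemble_context(paths):
    """Turn graph paths into prompt text plus the list of cited documents."""
    lines, citations = [], []
    for path in paths:
        lines.append(" -> ".join(f"{s} {r} {o}" for s, r, o, _ in path))
        for _, _, _, doc_id in path:
            if doc_id not in citations:
                citations.append(doc_id)
    return "\n".join(lines), citations


kg = build_kg(extract(d, t) for d, t in DOCS)
context, citations = assemble_context(retrieve_paths(kg, "Alice", max_hops=3))
```

Keeping the source document on every edge is what makes the Cite stage cheap: provenance falls out of the traversal rather than being reconstructed afterwards.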

**Step 5: Apply domain customizations**

Adapt the generic architecture to domain-specific requirements: ontology selection (UMLS for healthcare, FIBO for finance), compliance patterns (HIPAA access control, regulatory audit trails), and domain retrieval patterns (temporal graphs for finance, layered patient graphs for clinical). See [Domain Patterns Reference](./resources/domain-patterns.md) for domain-specific guidance.
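
As one illustration of a domain retrieval pattern, a temporal graph can be modelled by giving each edge a validity interval and filtering on an as-of date before traversal. The sketch assumes half-open intervals and uses hypothetical names (`TemporalEdge`, `edges_as_of`); the domain patterns reference may model this differently.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TemporalEdge:
    source: str
    relation: str
    target: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the relationship is still in force


def edges_as_of(edges, as_of):
    """Keep edges whose validity interval [valid_from, valid_to) contains as_of."""
    return [
        e for e in edges
        if e.valid_from <= as_of and (e.valid_to is None or as_of < e.valid_to)
    ]


# Hypothetical holdings data for illustration only.
EDGES = [
    TemporalEdge("FundA", "holds", "BondX", date(2020, 1, 1), date(2022, 7, 1)),
    TemporalEdge("FundA", "holds", "BondY", date(2022, 7, 1)),
]
```

Filtering before traversal keeps answers consistent with the date the question is about, which matters when an audit trail has to show what was known at that time.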

**Step 6: Define deployment strategy**

Specify the deployment architecture: graph database sizing and clustering, vector index configuration, caching strategy, batch vs real-time ingestion, monitoring and observability, and scaling plan. Define performance SLAs for query latency, throughput, and freshness. Plan for graph maintenance: incremental updates, schema evolution, and data quality monitoring.
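
Latency SLAs are easier to enforce when the percentile definition is written down. The sketch below uses the nearest-rank method, which is one of several common definitions and is an assumption here; the function names and sample numbers are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def check_sla(latencies_ms, targets_ms):
    """Compare observed percentiles with budgets; targets_ms maps percentile -> budget in ms."""
    report = {}
    for pct, budget in targets_ms.items():
        observed = percentile(latencies_ms, pct)
        report[pct] = {"observed": observed, "budget": budget, "ok": observed <= budget}
    return report


# Hypothetical measurements from a load test, in milliseconds.
latencies = [120, 150, 180, 200, 220, 250, 300, 400, 900, 1500]
report = check_sla(latencies, {50: 300, 95: 1000, 99: 2000})
```

With these ten samples the p50 budget is met while the p95 budget is not, which shows why a median target alone says little about tail latency.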

**Step 7: Produce specification**

Compile the complete system design specification using the [Output Template](#output-template). Validate against the quality rubric at [System Design Rubric](./resources/evaluators/rubric_system_design.json). Ensure all components are connected end-to-end with clear data flows, error handling, and fallback strategies.

## GraphRAG Pattern Selection

| Pattern | Query Type | Mechanism | Best For | Trade-offs |
|---------|-----------|-----------|----------|------------|
| **Hybrid Symbol-Vector** | Mixed structured + semantic | Pre-filter by graph type/constraint then rank by embedding similarity; or broad vector search then graph-guided expansion | Systems needing both precise structural queries and fuzzy semantic search; enterprise QA with entity disambiguation | Higher complexity; requires synchronized graph + vector indices; latency depends on filter-then-rank vs expand strategy |
| **Subgraph-on-Demand** | Focused multi-hop | Build temporary query-specific subgraphs rather than querying one monolithic graph; extract relevant neighborhood, embed, retrieve | Real-time applications needing focused context; systems with frequent updates; cost-sensitive deployments | Cold-start latency for subgraph construction; requires efficient subgraph extraction; context may miss distant but relevant nodes |
| **Community-Based Global Summarization** | Broad thematic / global | Detect communities/clusters in graph, embed summaries of each community, retrieve relevant community then drill into entity details; Microsoft GraphRAG pattern | Broad "what is X about?" queries; corpus-level summarization; thematic exploration across large knowledge bases | Requires periodic community detection (batch); summaries may lose detail; community boundaries can split related concepts |
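
The filter-then-rank variant of Hybrid Symbol-Vector can be sketched in a few lines. This is a minimal illustration with toy three-dimensional embeddings and hypothetical node data; a real system would run the filter in the graph database and the ranking in a vector index.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical nodes: a graph label plus a toy embedding.
NODES = [
    {"id": "n1", "label": "Drug", "embedding": [0.9, 0.1, 0.0]},
    {"id": "n2", "label": "Drug", "embedding": [0.2, 0.9, 0.1]},
    {"id": "n3", "label": "Disease", "embedding": [0.95, 0.05, 0.0]},
]


def filter_then_rank(nodes, label, query_embedding, top_k):
    """Symbolic pre-filter by label, then rank the survivors by cosine similarity."""
    candidates = [n for n in nodes if n["label"] == label]
    scored = [(cosine(n["embedding"], query_embedding), n["id"]) for n in candidates]
    scored.sort(reverse=True)
    return [node_id for _, node_id in scored[:top_k]]


top = filter_then_rank(NODES, "Drug", [1.0, 0.0, 0.0], top_k=2)
```

Node `n3` is the closest match by embedding alone, but the label filter excludes it. That is the point of the pattern: the structural constraint is enforced exactly, and similarity only orders what remains.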

### Choosing a Pattern

- **If queries need both "find entities of type X" and "find semantically similar content"** -> Hybrid Symbol-Vector
- **If queries are focused and context window cost matters** -> Subgraph-on-Demand
- **If queries are broad, thematic, or corpus-spanning** -> Community-Based Global Summarization
- **If requirements span multiple patterns** -> Combine patterns with a query router that dispatches to the appropriate retrieval path
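
The rules above, including the router from the last bullet, can be sketched as follows. Everything here is illustrative: the parameter names, the keyword cues, and the default path are assumptions, and a production router would more likely use a classifier or an LLM call than substring matching.

```python
from enum import Enum


class Pattern(Enum):
    HYBRID_SYMBOL_VECTOR = "Hybrid Symbol-Vector"
    SUBGRAPH_ON_DEMAND = "Subgraph-on-Demand"
    COMMUNITY_SUMMARIZATION = "Community-Based Global Summarization"


def select_patterns(needs_type_filter, needs_semantic_match,
                    focused_and_cost_sensitive, broad_or_thematic):
    """Map the decision rules onto patterns; more than one may apply."""
    selected = []
    if needs_type_filter and needs_semantic_match:
        selected.append(Pattern.HYBRID_SYMBOL_VECTOR)
    if focused_and_cost_sensitive:
        selected.append(Pattern.SUBGRAPH_ON_DEMAND)
    if broad_or_thematic:
        selected.append(Pattern.COMMUNITY_SUMMARIZATION)
    return selected


# Hypothetical cues that mark a query as broad or thematic.
BROAD_CUES = ("overall", "themes", "summarize", "across")


def route(query, retrievers, default=Pattern.SUBGRAPH_ON_DEMAND):
    """Dispatch a query to one retrieval path using naive keyword cues."""
    text = query.lower()
    if any(cue in text for cue in BROAD_CUES):
        pattern = Pattern.COMMUNITY_SUMMARIZATION
    else:
        pattern = default
    return pattern, retrievers[pattern](query)
```

`retrievers` is a plain mapping from `Pattern` to a callable, so each retrieval path can be developed and tested separately from the routing decision.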

## Integration Architecture

A complete GraphRAG system integrates four core component layers:

```
+------------------+     +------------------+     +----------------------+     +-----------+
|  Graph Database  |     |  Vector Database |     |  Orchestration       |     |    LLM    |
|  (Structure)     |<--->|  (Semantics)     |<--->|  Framework           |<--->| (Reason)  |
|                  |     |                  |     |                      |     |           |
|  Neo4j /         |     |  Pinecone /      |     |  LangChain /         |     |  GPT-4 /  |
|  TigerGraph /    |     |  Weaviate /      |     |  LlamaIndex /        |     |  Claude / |
|  Neptune/GraphDB |     |  Qdrant / pgvec  |     |  LangGraph / Custom  |     |  Llama    |
+------------------+     +------------------+     +----------------------+     +-----------+
        |                        |                          |
        v                        v                          v
  Graph Traversal          Embedding Search           Pipeline Logic
  Multi-hop Paths          Semantic Ranking           Query Routing
  Schema Filtering         Similarity Scores          Context Assembly
  Provenance Chains        Hybrid Re-ranking          Citation Generation
```

**Key integration decisions:**
- **Graph-first vs text-first pipeline**: Does the query first hit the graph (structured filter) or the vector store (semantic search)?
- **Single-system vs multi-system**: Keep graph and vectors in one store (for example, Neo4j's vector index) or run separate stores (for example, Neo4j + Pinecone)?
- **Framework vs custom**: LangChain/LlamaIndex for rapid development, or custom pipeline for maximum control?
- **Synchronization**: How are graph and vector indices kept consistent during updates?
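
For the synchronization question, one possible approach is to tag every write to both stores with the same version and run a periodic drift check. The sketch below uses in-memory stand-ins; `InMemoryStore`, `write_entity`, and `find_drift` are hypothetical names, and the graph-first write order is an assumption rather than a recommendation from this skill.

```python
class InMemoryStore:
    """Stand-in for either index; maps entity id -> (version, payload)."""

    def __init__(self):
        self.records = {}

    def upsert(self, entity_id, version, payload):
        self.records[entity_id] = (version, payload)


def write_entity(graph, vectors, entity_id, version, properties, embedding):
    """Write the graph first, then the vector index, tagging both with one version."""
    graph.upsert(entity_id, version, properties)
    vectors.upsert(entity_id, version, embedding)


def find_drift(graph, vectors):
    """Return ids whose versions differ or that exist in only one store."""
    drifted = []
    for entity_id in sorted(set(graph.records) | set(vectors.records)):
        g = graph.records.get(entity_id)
        v = vectors.records.get(entity_id)
        if g is None or v is None or g[0] != v[0]:
            drifted.append(entity_id)
    return drifted
```

A drift check like this does not prevent inconsistency, it only detects it. In a multi-system deployment the ids it returns would feed a re-embedding or re-indexing job.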

## Output Template

```
GRAPHRAG SYSTEM DESIGN SPECIFICATION
======================================

Project: [Project name]
Domain: [Target domain]
Date: [Date]
Author: [Author]

1. DOMAIN REQUIREMENTS
   Query Patterns: [Single-hop / Multi-hop / Thematic / Mixed]
   Data Volume: [Document count, entity count estimates]
   Update Frequency: [Real-time / Daily / Weekly / Batch]
   Latency Requirements: [p50, p95, p99 targets]
   Compliance: [HIPAA / GDPR / SOX / None]
   Explainability: [Required / Nice-to-have / Not needed]

2. GRAPHRAG PATTERN
   Primary Pattern: [Hybrid Symbol-Vector / Subgraph-on-Demand / Community-Based]
   Secondary Pattern: [If combining patterns, specify]
   Query Router: [How queries are dispatched to retrieval paths]
   Rationale: [Why this pattern fits the requirements]

3. TECHNOLOGY STACK
   Graph Database: [Product, version, deployment mode]
     Justification: [Why this choice]
   Vector Database: [Product, version, deployment mode]
     Justification: [Why this choice]
   Orchestration: [Framework or custom pipeline]
     Justification: [Why this choice]
   LLM Provider: [Model, API or self-hosted]
     Justification: [Why this choice]
   Supporting Infrastructure: [Cache, queue, monitoring tools]

4. INTEGRATION PIPELINE
   Ingestion:
     - Source types: [Documents, APIs, databases]
     - Processing: [Chunking strategy, metadata extraction]
   Extraction:
     - Entity extraction: [Method, model, confidence threshold]
     - Relation extraction: [Method, schema enforcement]
   Knowledge Graph Build:
     - Schema: [Node types, edge types, properties]
     - Population: [Batch / streaming, deduplication strategy]
   Indexing:
     - Graph indices: [Index types, query optimization]
     - Vector indices: [Embedding model, dimension, index type]
   Retrieval:
     - Graph retrieval: [Traversal strategy, depth limits]
     - Vector retrieval: [Top-k, similarity threshold]
     - Hybrid fusion: [How graph and vector results combine]
   Generation:
     - Context assembly: [How retrieved data becomes LLM context]
     - Prompt template: [Structure for graph-grounded generation]
   Citation:
     - Provenance: [How sources are tracked and surfaced]

5. DOMAIN CUSTOMIZATIONS
   Ontology: [Domain ontology or taxonomy used]
   Compliance Controls: [Access control, audit, encryption]
   Domain-Specific Patterns: [Temporal graphs, layered architecture, etc.]

6. DEPLOYMENT STRATEGY
   Infrastructure: [Cloud provider, regions, HA configuration]
   Scaling Plan: [Graph DB scaling, vector DB scaling, LLM scaling]
   Monitoring: [Metrics, alerts, dashboards]
   Maintenance: [Graph update strategy, schema evolution plan]

7. PERFORMANCE TARGETS
   Query Latency: [p50, p95, p99]
   Throughput: [Queries per second]
   Freshness: [Time from data change to queryable]
   Accuracy: [Retrieval precision/recall targets]

8. RISK AND MITIGATION
   - [Risk 1]: [Mitigation strategy]
   - [Risk 2]: [Mitigation strategy]
   - [Risk 3]: [Mitigation strategy]

NEXT STEPS:
- Build proof-of-concept with sample data
- Benchmark retrieval quality against baseline RAG
- Load test with production-scale data
- Iterate schema and retrieval strategy based on evaluation
```
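
Before a filled-in specification goes to the rubric review, a quick mechanical check can catch sections that are missing or still contain bracketed placeholders. This sketch only looks at the eight numbered section titles from the template above; `review_spec` is a hypothetical helper and does not reproduce the rubric.

```python
import re

REQUIRED_SECTIONS = [
    "DOMAIN REQUIREMENTS",
    "GRAPHRAG PATTERN",
    "TECHNOLOGY STACK",
    "INTEGRATION PIPELINE",
    "DOMAIN CUSTOMIZATIONS",
    "DEPLOYMENT STRATEGY",
    "PERFORMANCE TARGETS",
    "RISK AND MITIGATION",
]


def review_spec(spec_text):
    """Report missing sections and sections that still contain [placeholder] text."""
    missing, unfilled = [], []
    for number, title in enumerate(REQUIRED_SECTIONS, start=1):
        header = f"{number}. {title}"
        start = spec_text.find(header)
        if start == -1:
            missing.append(title)
            continue
        body_start = start + len(header)
        # The section runs until the next numbered, upper-case header line.
        next_header = re.search(r"^\d+\. [A-Z ]+$", spec_text[body_start:], flags=re.MULTILINE)
        end = body_start + next_header.start() if next_header else len(spec_text)
        if re.search(r"\[[^\]]+\]", spec_text[start:end]):
            unfilled.append(title)
    return missing, unfilled
```

The check treats any remaining `[...]` as an unfilled placeholder, so a specification that legitimately uses square brackets in its content would need a stricter pattern.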
