decompose-split-by-entity

Name: decompose-split-by-entity
Author: Memento-Teams/Memento-Widesearch

$ npx mdskill add Memento-Teams/Memento-Widesearch/decompose-split-by-entity

Decomposes queries into parallel deep-dive research tasks for independent entities, ideal for generating comprehensive comparison tables.

Helps with extracting uniform high-density attributes for a discrete list of subjects, such as products or organizations.
Integrates with research tools for attribute extraction; no specific APIs are named.
Works by identifying entities, standardizing attributes, partitioning the entities into batches, and assigning each batch to a worker for targeted research.
Presents results as a unified tabular output, aggregating independent rows into a single comprehensive table.

SKILL.md

.github/skills/decompose-split-by-entity

---
name: decompose-split-by-entity
description: Specialized decomposition strategy for split-by-entity tasks requiring deep attribute extraction for a discrete list of subjects.
---

## When to Use
Use this strategy when a query identifies a specific set of independent subjects (entities) and requests a uniform set of high-density attributes for each. This pattern is ideal when:
- The entities belong to a clear category (e.g., products, organizations, creative works).
- Each entity requires "deep-dive" research across multiple technical, financial, or historical dimensions.
- The data for one entity does not depend on or overlap with the data of another.
- The output is expected to be a comprehensive, multi-column comparison table.

## Decomposition Template
1. **Entity Enumeration:** Identify the full list of primary subjects. If the query provides a range (e.g., "all models in series X"), the first subtask must be to generate an exhaustive list of these entities.
2. **Attribute Definition:** Standardize the required data points (metrics, dates, specifications) to ensure consistency across all workers.
3. **Horizontal Partitioning:** Divide the list of entities into small batches. Assign each batch to a separate worker.
4. **Deep-Dive Extraction:** Each worker performs targeted research for their assigned entities only, focusing on filling every required attribute column.
5. **Vertical Synthesis:** A final pass aggregates the independent rows into a single unified table, ensuring formatting (units, date formats) is synchronized.
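
The batching and aggregation steps above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `partition`, `run_decomposition`, and `fake_research` are hypothetical names, and `fake_research` stands in for whatever research tool calls a real worker would make.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def partition(entities: List[str], batch_size: int) -> List[List[str]]:
    """Horizontal partitioning: split the entity list into fixed-size batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [entities[i:i + batch_size] for i in range(0, len(entities), batch_size)]


def run_decomposition(
    entities: List[str],
    attributes: List[str],
    research: Callable[[str, List[str]], Row],
    batch_size: int = 3,
) -> List[Row]:
    """Run steps 3-5: batch the entities, research each batch, merge the rows."""
    table: List[Row] = []
    for batch in partition(entities, batch_size):
        # Each batch stands in for one worker; a real system would run these in parallel.
        for entity in batch:
            found = research(entity, attributes)
            # Vertical synthesis: every row gets the same columns in the same order.
            row = {"entity": entity}
            row.update({attr: found.get(attr, "") for attr in attributes})
            table.append(row)
    return table


def fake_research(entity: str, attributes: List[str]) -> Row:
    """Hypothetical stand-in for a worker's research tool calls."""
    return {attr: f"{attr} of {entity}" for attr in attributes}


batches = partition(["A", "B", "C", "D", "E", "F", "G"], 3)
table = run_decomposition(["A", "B", "C", "D"], ["release_date", "price"], fake_research)
```

Fixing the attribute list before any worker starts is what lets the final pass merge rows without reconciling column names.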

## Worker Assignment Rules
- **Batch Size:** Assign 3–5 complex entities per worker. For simpler entities (e.g., single-attribute lists), this can increase to 10. **Always prefer more workers with smaller batches:** each worker has a limited tool-call budget, so a narrower scope yields higher completeness.
- **Specialization:** If the entities span different sub-categories or eras, group them by similarity to allow the worker to maintain context.
- **Verification:** For high-precision tasks (e.g., financial data or technical specs), assign a "Cross-Check" worker to verify 20% of the data points against primary sources.
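
As a rough sketch of how these rules might translate into numbers, the helpers below pick a batch size, count the workers needed, and draw a 20% sample of cells for a Cross-Check worker. All function names are hypothetical, and the "one attribute means simple" threshold is an assumption; the rules above do not define where "simple" ends.

```python
import math
import random
from typing import List, Tuple


def batch_size_for(attribute_count: int) -> int:
    """Pick a batch size within the ranges given in the rules.

    Assumption: an entity is "simple" when only one attribute is requested.
    For complex entities the low end of 3-5 is used, following the
    preference for smaller batches.
    """
    return 10 if attribute_count <= 1 else 3


def worker_count(entity_count: int, batch_size: int) -> int:
    """Number of workers needed when each takes at most batch_size entities."""
    return math.ceil(entity_count / batch_size)


def cross_check_sample(
    cells: List[Tuple[str, str]], fraction: float = 0.2, seed: int = 0
) -> List[Tuple[str, str]]:
    """Choose a reproducible random sample of (entity, attribute) cells to verify."""
    if not cells:
        return []
    k = max(1, math.ceil(len(cells) * fraction))
    return random.Random(seed).sample(cells, k)


# Five entities with two attributes each give ten cells; 20% of them is two.
cells = [(e, a) for e in ["A", "B", "C", "D", "E"] for a in ["price", "launch_date"]]
to_verify = cross_check_sample(cells)
```

Seeding the sampler keeps the verification set stable across reruns, which makes discrepancies easier to trace back to a specific worker.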

## Required Columns Checklist
- **Primary Identifiers:** Official names, unique IDs, or parent organizations.
- **Temporal Metadata:** Launch/release dates, sunset/discontinuation dates, or specific "as of" timestamps.
- **Quantitative Metrics:** Technical specifications, financial figures, or performance scores (always include units).
- **Categorical Classifiers:** Type, status, or classification tags that allow for sorting/filtering.
- **Relational Data:** Associated people (e.g., leadership, creators) or secondary entities (e.g., locations, subsidiaries).
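
One way to make the checklist enforceable is to give every worker the same row schema and check it for empty fields before synthesis. The sketch below assumes one illustrative column per checklist category; the field names and the sample row are hypothetical and would be replaced by the attributes the query actually asks for.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class EntityRow:
    # Primary identifiers
    official_name: str
    parent_organization: str
    # Temporal metadata
    launch_date: str
    as_of: str
    # Quantitative metric, always paired with its unit
    metric_value: str
    metric_unit: str
    # Categorical classifier
    status: str
    # Relational data
    associated_people: str


def missing_columns(row: EntityRow) -> List[str]:
    """Return the names of columns a worker left empty or whitespace-only."""
    return [f.name for f in fields(row) if not getattr(row, f.name).strip()]


# Hypothetical row in which the worker forgot the unit for its metric.
row = EntityRow(
    official_name="Example Widget",
    parent_organization="Example Corp",
    launch_date="2021-03-15",
    as_of="2024-01-01",
    metric_value="42",
    metric_unit="",
    status="active",
    associated_people="N/A",
)
gaps = missing_columns(row)
```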

## Anti-Patterns
- **The "Breadth-First" Failure:** Attempting to find one specific attribute for *all* entities at once. This leads to high tool-call volume and frequent timeouts. Always research all attributes for a *subset* of entities.
- **Scope Creep:** Including "limited edition" or "variant" data when the query specifies the "standard" or "core" range.
- **Attribute Omission:** Failing to capture secondary details (like "credits" or "requirements") because the worker focused only on the primary name and date.
- **Unit Inconsistency:** Mixing different measurement systems (e.g., metric vs. imperial) or date formats across different workers.
- **Missing "Zero" Values:** Leaving cells blank instead of explicitly stating "None" or "N/A" when an attribute is confirmed to be non-existent for a specific entity.
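
The last two anti-patterns can be guarded against mechanically during the synthesis pass. The following sketch normalizes dates to ISO 8601, converts lengths to centimetres, and writes an explicit "N/A" into empty cells. The helper names, the list of accepted date formats, and the choice of centimetres as the target unit are all assumptions for illustration; `finalize_row` also assumes the caller has already confirmed that an empty cell means the attribute does not exist.

```python
from datetime import datetime
from typing import Dict, List, Optional

# Assumed input formats; extend to match what workers actually return.
DATE_FORMATS = ["%Y-%m-%d", "%Y/%m/%d", "%B %d, %Y"]

# Conversion factors to centimetres (1 inch is exactly 2.54 cm).
CM_PER_UNIT = {"cm": 1.0, "mm": 0.1, "in": 2.54}


def normalize_date(value: str) -> str:
    """Convert a date in a known format to YYYY-MM-DD; leave other values unchanged."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return value


def to_centimetres(value: float, unit: str) -> float:
    """Convert a length to centimetres, rounded to two decimals."""
    return round(value * CM_PER_UNIT[unit], 2)


def finalize_row(row: Dict[str, Optional[str]], columns: List[str]) -> Dict[str, str]:
    """Emit every expected column, writing "N/A" where the value is missing or blank."""
    out: Dict[str, str] = {}
    for col in columns:
        value = row.get(col)
        out[col] = value.strip() if value and value.strip() else "N/A"
    return out


iso = normalize_date("March 5, 2021")
width_cm = to_centimetres(10, "in")
final = finalize_row({"name": "Example Widget", "credits": ""}, ["name", "credits", "sunset_date"])
```

Values in an unrecognized date format pass through unchanged rather than being guessed at, so they remain visible for manual review.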
