financials-normalizer

$ npx mdskill add openai/role-specific-plugins/financials-normalizer

Normalize public company financials from filings and transcripts.

  • Converts raw filings into standardized Excel workbooks.
  • Depends on shared workflow source resolution and intake policies.
  • Executes preflight scripts to determine context and next actions.
  • Delivers structured XLSX artifacts unless chat output is requested.

SKILL.md

.github/skills/financials-normalizer
---
name: financials-normalizer
description: Use when normalizing public-company financials from source materials. Do not use for private data rooms or non-financial cleanup.
---

# Financials Normalizer

## Skill Configuration

### User Context Preflight

Before searching connectors, retrieving evidence, or drafting output, run `python3 skills/user-context/scripts/user_context_preflight.py` with the shell working directory set to this plugin's root, and follow the returned `saved_context`, `source_category_plan`, and `next_action`. Set the working directory before the first attempt; do not probe alternate relative paths. Missing context must not block the requested workflow. Do not initialize state or run onboarding during ordinary workflow work.

If `next_action.id = "offer_orientation"` and the parent router has not already handled it, complete the requested work first, then append the one-line optional setup offer exactly once.
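The offer rule above can be sketched as a small predicate. This is a minimal illustration, assuming the preflight result has been parsed into a dictionary. The helper name `should_append_setup_offer`, the `parent_handled` flag, and the shape of the example payload are hypothetical; the real script's output format is not shown in this skill.

```python
def should_append_setup_offer(preflight: dict, parent_handled: bool) -> bool:
    """Return True when the one-line setup offer should follow the finished work."""
    # A missing or empty next_action never blocks the workflow; it just means no offer.
    next_action = preflight.get("next_action") or {}
    return next_action.get("id") == "offer_orientation" and not parent_handled


# Hypothetical preflight result, used only to exercise the predicate.
example = {
    "saved_context": {},
    "source_category_plan": ["company_filings_ir"],
    "next_action": {"id": "offer_orientation"},
}

print(should_append_setup_offer(example, parent_handled=False))  # True
print(should_append_setup_offer(example, parent_handled=True))   # False
```

The predicate only decides whether to append the offer; the requested work still runs first in either case.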

### Source Resolution

Load `../../shared/workflow-source-resolution.md`. Use `source_category_plan` lazily and attempt only the categories needed for this workflow: `company_filings_ir`, `earnings_transcripts_presentations`, `internal_research`, `portfolio_models_trackers`, and `market_data_estimates`.

## Deliverable Intake

Apply the presentation-surface precedence in `../../shared/deliverable-intake-policy.md`. This workflow's natural artifact is an XLSX normalization workbook. Do not choose chat-only output unless the user explicitly requests a lightweight response.

When invoked as support for an owning workflow, inherit its resolved deliverable preferences and do not re-prompt. Run the intake preflight only when this skill independently owns a new standalone, reader-facing normalization deliverable. In that case, before source gathering, analysis, or rendering, load `../../shared/deliverable-intake-policy.md` and perform its adaptive `request_user_input` preflight for materially unresolved preferences.

## Purpose

Load `shared/equity-research-support-standard.md` and `shared/support-layer-routing-contract.md` before substantial source, data, QA, or style work.


Turn messy public-company source financials into auditable, model-ready normalized statements, KPI schedules, consensus/guidance inputs, segment schedules, share-count support, net-debt and capital-allocation support, source citations, assumptions, conflicts, and QA flags for downstream Public Equity Investing workflows.

Boundary: shipped scripts create `Source_Index.csv`, `Normalized_Financials_Long.csv`, and `Normalization_Issues.csv`. Wide statements, KPI schedules, adjustment logs, conflict logs, assumption registers, and workbook/deck-ready tabs are instruction-led unless explicitly built from staging data.

For a standalone request for **model-ready normalized financials**, do not treat a long-form staging CSV alone as the analyst deliverable. Create a model-loading package from the staged rows: full-scope wide schedules relevant to the supplied financials, a disclosure/comparability bridge when definitions or presentation changed, material QA flags, and logged validation checks. Use XLSX when the user requests a workbook or will load/review the output in a workbook; otherwise a clearly organized CSV package plus a concise review summary is appropriate. This is a data-first skill; do not force an HTML artifact unless the user requests one.
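As a rough illustration of building a wide schedule from staged rows, the sketch below pivots long-form rows into a line-item by period layout. The column names `line_item`, `period`, and `value` are assumptions for this example, not the schema in `references/normalization-schema.md`. Cells with more than one distinct value are kept out of the wide schedule and returned as conflicts rather than resolved silently.

```python
def pivot_long_to_wide(rows):
    """Pivot long staging rows into {line_item: {period: value}} plus a conflict list."""
    seen = {}
    for row in rows:
        key = (row["line_item"], row["period"])
        values = seen.setdefault(key, [])
        if row["value"] not in values:
            values.append(row["value"])

    wide, conflicts = {}, []
    for (item, period), values in seen.items():
        if len(values) == 1:
            wide.setdefault(item, {})[period] = values[0]
        else:
            # Retain every competing value; do not pick a winner here.
            conflicts.append({"line_item": item, "period": period, "values": values})
    return wide, conflicts


# Hypothetical staged rows for illustration only.
rows = [
    {"line_item": "revenue", "period": "FY2023", "value": 100.0},
    {"line_item": "revenue", "period": "FY2024", "value": 120.0},
    {"line_item": "net_income", "period": "FY2024", "value": 15.0},
    {"line_item": "net_income", "period": "FY2024", "value": 14.0},
]
wide, conflicts = pivot_long_to_wide(rows)
print(wide)
print(conflicts)
```

Here `revenue` lands in the wide schedule for both periods, while the two competing `net_income` values for FY2024 appear only in the conflict list.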

## Embedded Support Routing

This is an embedded service under the owning workflow unless the user explicitly asks for standalone normalization. Preserve the `owning_workflow` internally, such as `equity-model-update`, `dcf-model-builder`, `three-statement-model-builder`, `comps-valuation`, `earnings-preview`, `earnings-deep-dive`, `memo-builder`, `thesis-tracker`, `scenario-sensitivity-generator`, `portfolio-risk-management`, or `dashboard-builder`.

For substantial embedded work, preserve `decision_impact`, `readiness_effect`, `artifact_role`, and `hidden_unless_requested` in internal context or support artifacts. Do not print those internal field names in the owning workflow's user-facing artifact. This skill does not own the valuation, memo, earnings analysis, or recommendation; instead, state in natural language how normalization issues change estimate confidence, valuation support, target support, sizing, model readiness, or circulation readiness. `Source_Index.csv`, `Normalized_Financials_Long.csv`, `Normalization_Issues.csv`, run logs, manifests, and support notes are secondary/support artifacts when invoked by an owning workflow.

## Non-Negotiables

- Preserve raw/source materials.
- Prefer sources in this order: user files/context; callable runtime apps/connectors when actually available; primary public sources; user-provided provider exports; then labeled assumptions. Never imply live provider access when it is unavailable.
- Never invent missing financials; mark unavailable values as `missing_required_source`.
- Keep normalized values traceable to source ID, source name/location, retrieved-at date, period, units, currency, and evidence label.
- Missing `source_id` must remain visible as `SRC-UNSPECIFIED`, produce a QA flag, and block decision-grade handoff.
- Retain conflicts rather than silently choosing values.
- Flag stale, preliminary, unaudited, OCR-derived, or low-confidence data.
- Do not infer fiscal period-end dates from quarter labels alone; use an explicit source date or mark the date missing and flag it.
- Keep issuer outlook or guidance in `kpi_schedule` with `issuer_management_claim`; reserve `consensus_estimate` with `estimate_consensus` for externally sourced consensus estimates.
- When a segment, KPI, non-GAAP definition, or balance-sheet presentation changes, preserve both bases and create a comparability bridge before calling any series model-loadable.
- Apply comparability status at the affected series or line-item level. A changed cash presentation does not recast unrelated balance-sheet rows.
- Use `comparable_rounded` when an unchanged reported series is comparable across periods but only available in rounded narrative units; disclose that it is unsuitable for exact tie-out.
- Do not backsolve an undisclosed comparable value from rounded amounts or percentage growth for model loading. A labeled directional calculation may appear separately only when useful.
- Material open exceptions must be surfaced in `QA_Flags` and in the readiness summary; an empty technical `Normalization_Issues` file does not mean the financials are clear for downstream use.
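Three of the rules above (visible `SRC-UNSPECIFIED`, no invented values, no inferred period-end dates) can be sketched as a per-row pass. This is a minimal illustration under stated assumptions: the row keys (`source_id`, `value`, `period_end_date`, `evidence_label`), the flag names, and the `blocks_handoff` field are hypothetical and are not taken from the shipped scripts.

```python
def apply_source_rules(row):
    """Return (cleaned_row, qa_flags) for one staging row without inventing data."""
    cleaned = dict(row)  # never mutate the raw/source row
    flags = []

    # A missing source_id stays visible instead of being dropped or guessed.
    if not (cleaned.get("source_id") or "").strip():
        cleaned["source_id"] = "SRC-UNSPECIFIED"
        flags.append("missing_source_id")

    # A missing value is labeled, not filled in.
    if cleaned.get("value") is None:
        cleaned["evidence_label"] = "missing_required_source"
        flags.append("missing_value")

    # The quarter label alone is not used to derive a period-end date.
    if not cleaned.get("period_end_date"):
        cleaned["period_end_date"] = None
        flags.append("missing_period_end_date")

    cleaned["blocks_handoff"] = "missing_source_id" in flags
    return cleaned, flags


# Hypothetical row: a quarter label with no source, value, or explicit date.
row = {"line_item": "revenue", "period": "Q2 FY2024", "value": None, "source_id": ""}
cleaned, flags = apply_source_rules(row)
print(cleaned["source_id"], flags)
```

The flags returned here would feed `QA_Flags` and the readiness summary, so an otherwise empty issues file cannot hide them.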

## Workflow

1. **Classify the job.** Identify it as one of: public-equity issuer financials, earnings/model update, consensus/provider export, ETF/index constituent support, portfolio/market-data support, or equity-risk debt/liquidity context.
2. **Build source index.** Capture source ID, name/type, owner/provider, period, as-of date, retrieved-at date, location, source rank, freshness, and notes.
3. **Extract long-form staging.** Use `Normalized_Financials_Long` before wide statements; preserve original line labels beside canonical labels.
4. **Normalize periods, scale, currency, signs, and labels.** Keep reported, adjusted, pro forma, provider-standardized, estimated, and analyst-adjusted values separate.
5. **Reconcile disclosure changes.** Identify renamed, regrouped, newly introduced, discontinued, recast, or definition-changed segments/KPIs/non-GAAP lines; preserve legacy and new bases; create a disclosure/comparability bridge with model treatment.
6. **Reconcile and QA.** Check subtotals, roll-forwards, balance sheet balance, cash flow bridge, units/currency, duplicate periods, missing sources, stale/conflicting values, signs, unsupported KPIs, and completeness of the intended model-loading scope. Log performed checks and results; do not claim a check count that is not preserved in an output.
7. **Produce package.** For standalone model-ready work, produce loadable wide schedules plus `QA_Flags` and `Validation_Checks` from the audited staging layer; include a disclosure/comparability bridge whenever presentation changed. For narrower extraction/support work, return only the deterministic CSV outputs or instruction-led tabs actually created.
8. **Hand off.** State what is loadable, what is audit-only, and what remains partial or blocked before routing to models, earnings, comps, memo, thesis, scenario, risk, ETF/index, or deck/report skills. Route covenant/recovery/debt-security normalization to Credit Markets.
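One of the step 6 checks, the balance sheet balance test, might be logged as follows. This is a sketch only: the function name, the record fields, and the `tolerance` default for rounding differences are assumptions, and the figures are made up. Because every check appends a record, any check count reported later can be read from the preserved log rather than asserted from memory.

```python
def check_balance_sheet(period, total_assets, total_liabilities, total_equity, tolerance=1):
    """Return one validation record for assets = liabilities + equity."""
    difference = total_assets - (total_liabilities + total_equity)
    return {
        "check": "balance_sheet_balances",
        "period": period,
        "difference": difference,
        "result": "pass" if abs(difference) <= tolerance else "fail",
    }


# Hypothetical totals in reported units.
validation_log = [
    check_balance_sheet("FY2023", 1000, 600, 400),
    check_balance_sheet("FY2024", 1000, 600, 395),
]

for record in validation_log:
    print(record["period"], record["result"], record["difference"])

# The reported check count comes from the log itself.
print(len(validation_log), "checks performed")
```

The FY2024 record fails with a difference of 5, which is the kind of item that belongs in both `Validation_Checks` and `QA_Flags`.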

## Sub-agent Decomposition

For complex medium/large requests, use sub-agents where available; otherwise emulate the split as named workstreams. Suggested lanes: source inventory, line-item mapping, period/unit normalization, conflict log, and QA. Keep this skill as the lead: reconcile conflicts, source labels, assumptions, open items, final QA, and the user-facing answer.

When embedded in a broader workflow, "lead" means lead for normalization only; the owning workflow remains the investment-artifact owner.


## Evidence Labels

Use exact labels from `references/normalization-schema.md`, including `fact_source_reported`, `fact_provider_standardized`, `derived_calculation`, `issuer_management_claim`, `management_adjusted`, `analyst_adjusted`, `analyst_interpretation`, `assumption_user_provided`, `assumption_inferred`, `estimate_consensus`, `stale_source`, `contradicted_source`, `missing_required_source`, and `unknown`.

Confidence labels are `high`, `medium`, or `low`.
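A simple guard against misspelled labels could look like the sketch below. The evidence labels are the ones listed above; since that list is introduced with "including", `references/normalization-schema.md` may define more, so treat this set as an assumption. The row keys `evidence_label` and `confidence` and the helper name are hypothetical.

```python
# Labels named in this section; the schema reference remains the authority.
EVIDENCE_LABELS = frozenset({
    "fact_source_reported", "fact_provider_standardized", "derived_calculation",
    "issuer_management_claim", "management_adjusted", "analyst_adjusted",
    "analyst_interpretation", "assumption_user_provided", "assumption_inferred",
    "estimate_consensus", "stale_source", "contradicted_source",
    "missing_required_source", "unknown",
})
CONFIDENCE_LABELS = frozenset({"high", "medium", "low"})


def label_problems(row):
    """Return a list of problems with a row's evidence and confidence labels."""
    problems = []
    if row.get("evidence_label") not in EVIDENCE_LABELS:
        problems.append(f"unrecognized evidence_label: {row.get('evidence_label')!r}")
    if row.get("confidence") not in CONFIDENCE_LABELS:
        problems.append(f"unrecognized confidence: {row.get('confidence')!r}")
    return problems


print(label_problems({"evidence_label": "fact_source_reported", "confidence": "high"}))
print(label_problems({"evidence_label": "reported", "confidence": "High"}))
```

Matching is exact and case-sensitive, so `High` is rejected rather than quietly normalized.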

## Scripts

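```bash
# Normalize an extracted table into staging CSVs under output/
python scripts/normalize_extracted_financials.py path/to/input.csv --output-dir output
# Validate the long-form staging file
python scripts/validate_normalized_financials.py output/Normalized_Financials_Long.csv
```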

For workbook inputs, first extract the relevant tab/range with spreadsheet tools into a table/CSV; scripts must not destructively modify workbooks.

## Final Response

Return:

1. what was normalized: entity, sources, periods, units, currency, scope;
2. outputs created;
3. what can be loaded into a model and what remains audit-only, partial, or blocked;
4. material QA findings and disclosure/comparability breaks;
5. fact versus assumption summary and validation checks actually performed;
6. recommended next step or missing source.

## Reference Map

- `references/source-protocol.md`: hierarchy, stale data, citations, conflicts.
- `references/normalization-schema.md`: output schema, signs, scales, labels.
- `references/line-item-taxonomy.md`: statement/KPI mappings.
- `references/qa-rules.md`: reconciliation tests and red flags.
- `references/integration-guide.md`: downstream Public Equity Investing handoffs.
