product-metrics

Name: product-metrics
Author: vm0-ai/vm0-skills

$ npx mdskill add vm0-ai/vm0-skills/product-metrics

Define, instrument, and analyze product metrics across key lifecycle stages.

Helps set OKRs, design dashboards, and diagnose metric movements.
Depends on data pipelines and analytics platforms for real-time insights.
Selects KPIs using value-reflective, forward-looking, and influenceable criteria.
Delivers structured frameworks linking North Star metrics to L1 indicators.

SKILL.md

.github/skills/product-metrics

---
name: product-metrics
description: Define, instrument, and analyze product metrics across acquisition, activation, engagement, retention, and monetization. Activate when setting OKRs, designing a metrics dashboard, running a weekly or monthly metrics review, diagnosing metric movements, choosing KPIs for a product area, building a metrics framework, or evaluating product health.
---

## Metrics Architecture

Structure product measurement into three layers, each serving a distinct purpose.

### North Star Metric

A single indicator that captures the fundamental value the product delivers. Selection criteria:

- **Value-reflective**: Increases when users extract more benefit from the product
- **Forward-looking**: Reliably predicts sustained business outcomes like revenue and retention
- **Influenceable**: The product team's work can demonstrably move it
- **Broadly understood**: Anyone in the organization can grasp its meaning and significance

Illustrative North Star choices by product category:
- Team collaboration platform: Weekly active teams where three or more members contribute
- Two-sided marketplace: Weekly completed transactions
- Enterprise SaaS: Weekly active users who execute the core workflow
- Media or content product: Weekly minutes of engaged consumption
- Developer tooling: Weekly production deployments facilitated by the tool

### L1 Indicators (Product Health)

Five to seven metrics that collectively represent the full user lifecycle. Choose them from the candidates below, which are organized by lifecycle phase:

**Acquisition** -- Are new users discovering the product?
- Volume of new registrations or trial initiations and their trajectory
- Visitor-to-registration conversion rate
- Distribution across acquisition channels
- Per-channel acquisition cost (for paid efforts)

**Activation** -- Are newcomers reaching the value threshold?
- Activation rate: fraction of new users who perform the action most predictive of retention
- Time-to-activation: elapsed duration from registration to activation
- Onboarding completion rate: fraction who finish the guided setup sequence
- First value moment: the point at which users first experience the product's core promise

**Engagement** -- Are active users deriving ongoing value?
- Active user counts at daily, weekly, and monthly granularity (DAU, WAU, MAU)
- Stickiness ratio (DAU divided by MAU): how habitual the product is
- Core action frequency: how often users perform the most meaningful operation
- Depth per session: volume of activity within a single visit
- Feature penetration: share of users who adopt specific capabilities

**Retention** -- Are users returning over time?
- Cohort retention at standard intervals: day 1, day 7, day 30, day 90
- Retention curves by signup cohort showing decay and stabilization
- Churn rate: fraction of users or revenue lost per period
- Reactivation rate: fraction of previously lapsed users who return

**Monetization** -- Is user value converting to revenue?
- Free-to-paid conversion rate (for freemium models)
- Monthly and annual recurring revenue (MRR / ARR)
- Average revenue per user or account (ARPU / ARPA)
- Expansion revenue: growth generated by existing customers
- Net revenue retention: combined effect of expansion, contraction, and churn

**Satisfaction** -- How do users perceive the experience?
- Net Promoter Score (NPS)
- Customer Satisfaction Score (CSAT)
- Support ticket volume and mean resolution time
- App store ratings and review sentiment analysis
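
Net revenue retention, listed under Monetization above, is usually computed on the recurring revenue of customers who were already paying at the start of the period. The sketch below assumes that convention. The function and parameter names are illustrative, and revenue from newly acquired customers is deliberately excluded.

```python
def net_revenue_retention(starting_mrr: float, expansion: float,
                          contraction: float, churned: float) -> float:
    """NRR for one period, measured only on customers present at period start.

    All four inputs are recurring-revenue amounts in the same currency.
    Revenue from customers acquired during the period is excluded.
    """
    if starting_mrr <= 0:
        raise ValueError("starting_mrr must be positive")
    return (starting_mrr + expansion - contraction - churned) / starting_mrr


# Hypothetical month: 100k starting MRR, 8k upgrades, 2k downgrades, 5k cancelled.
nrr = net_revenue_retention(100_000, expansion=8_000, contraction=2_000, churned=5_000)
print(f"NRR: {nrr:.1%}")
```

A result above 100% means expansion from existing customers outweighed contraction and churn for the period.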

### L2 Indicators (Diagnostic Detail)

Granular metrics used to investigate why L1 indicators move:

- Step-by-step funnel conversion rates
- Per-feature usage and adoption measurements
- Segment-level breakdowns: by plan tier, company size, geography, user role
- Technical performance: page load latency, error rates, API response times
- Content or feature-level engagement analysis: which surfaces drive the most activity

## Key Metric Deep Dives

### Active Users (DAU / WAU / MAU)

**Definition**: Unique users who perform a qualifying action within a day, week, or month.

**Critical design choices**:
- Define "active" precisely. Logging in, loading a page, and executing a core action tell fundamentally different stories.
- Match the timeframe to natural usage cadence. DAU for daily-use products (chat, email). WAU for weekly-use products (project tracking). MAU for episodic products (tax filing, travel booking).

**Interpretation guidance**:
- Stickiness (DAU/MAU) above 0.5 signals daily-habit status. Below 0.2 suggests sporadic engagement.
- Trajectory matters more than absolute level. Watch for growth, plateau, or decline.
- Segment by user archetype. Power users and occasional visitors exhibit vastly different patterns.
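
As a rough illustration of the definitions above, the following sketch computes active users and stickiness from a flat event log. It assumes each event is a `(user_id, day)` pair that already represents a qualifying action, and it uses one common convention for stickiness: average DAU over a trailing window divided by MAU for the same window. The names `active_users` and `stickiness` are hypothetical.

```python
from datetime import date, timedelta

# Assumed event shape: (user_id, day on which a qualifying action occurred).
Event = tuple[str, date]


def active_users(events: list[Event], start: date, end: date) -> set[str]:
    """Unique users with at least one qualifying action in [start, end]."""
    return {user for user, day in events if start <= day <= end}


def stickiness(events: list[Event], as_of: date, window_days: int = 30) -> float:
    """Average DAU over the trailing window divided by MAU for that window."""
    start = as_of - timedelta(days=window_days - 1)
    mau = len(active_users(events, start, as_of))
    if mau == 0:
        return 0.0
    daily_counts = []
    for offset in range(window_days):
        day = start + timedelta(days=offset)
        daily_counts.append(len(active_users(events, day, day)))
    return (sum(daily_counts) / window_days) / mau
```

Changing what counts as a qualifying action changes every number this produces, which is why the definition of "active" has to be fixed before trends are compared.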

### Retention

**Definition**: Of users who arrived in cohort X, what percentage remain active in period Y?

**Standard measurement windows**:
- Day 1: Was the initial experience compelling enough to prompt a return?
- Day 7: Has the user begun forming a usage habit?
- Day 30: Is the user retained at a meaningful horizon?
- Day 90: Has the user become durably embedded?

**Analytical approaches**:
- Chart retention curves by cohort. Steep initial falloff signals an activation gap. Steady ongoing decline points to an engagement deficit. A flattening curve indicates a healthy stable base.
- Compare cohorts chronologically. Improving retention in newer cohorts confirms product improvements are landing.
- Segment by onboarding completion or feature adoption to isolate what behaviors predict lasting retention.
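
The measurement windows above can be sketched as a small cohort calculation. This version assumes "day N retention" means the user was active on exactly the Nth day after signup; some teams instead count activity on or after day N, which yields higher numbers. The data shapes and function names are illustrative.

```python
from datetime import date, timedelta
from typing import Optional


def day_n_retention(signups: dict[str, date],
                    activity: set[tuple[str, date]],
                    cohort_day: date, n: int) -> Optional[float]:
    """Share of the cohort that signed up on `cohort_day` and was active on day N.

    Returns None when nobody signed up on `cohort_day`.
    """
    cohort = {user for user, day in signups.items() if day == cohort_day}
    if not cohort:
        return None
    target_day = cohort_day + timedelta(days=n)
    returned = {user for user in cohort if (user, target_day) in activity}
    return len(returned) / len(cohort)


def retention_curve(signups: dict[str, date],
                    activity: set[tuple[str, date]],
                    cohort_day: date,
                    days: tuple[int, ...] = (1, 7, 30, 90)) -> dict[int, Optional[float]]:
    """Retention at each standard interval for a single signup cohort."""
    return {n: day_n_retention(signups, activity, cohort_day, n) for n in days}
```

Running `retention_curve` for several `cohort_day` values and comparing the results side by side is the chronological cohort comparison described above.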

### Funnel Conversion

**Definition**: The percentage of users who advance from one lifecycle stage to the next.

**Typical funnels to instrument**:
- Visitor to registration
- Registration to activation (first value moment)
- Free user to paying customer
- Trial to subscription
- Monthly plan to annual plan

**Analytical approaches**:
- Map the entire funnel and measure conversion at every transition
- Locate the steepest drop-offs -- these represent the highest-leverage optimization targets
- Segment conversion by traffic source, plan type, and user profile; different populations convert at very different rates
- Monitor conversion trends over time to gauge whether iterative improvements are working
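
A minimal sketch of the first two steps, assuming the funnel is available as an ordered list of `(stage_name, user_count)` pairs. The stage names and counts in the example are invented for illustration.

```python
def funnel_conversion(stage_counts: list[tuple[str, int]]) -> list[tuple[str, float]]:
    """Conversion rate for every transition between consecutive stages."""
    steps = []
    for (prev_name, prev_count), (name, count) in zip(stage_counts, stage_counts[1:]):
        rate = count / prev_count if prev_count else 0.0
        steps.append((f"{prev_name} -> {name}", rate))
    return steps


def steepest_drop(stage_counts: list[tuple[str, int]]) -> tuple[str, float]:
    """Transition with the lowest conversion rate (needs at least two stages)."""
    return min(funnel_conversion(stage_counts), key=lambda step: step[1])


# Hypothetical weekly funnel.
funnel = [("visitor", 10_000), ("registration", 800), ("activation", 400), ("paid", 100)]
for transition, rate in funnel_conversion(funnel):
    print(f"{transition}: {rate:.1%}")
```

To segment, run the same functions once per traffic source or plan type and compare the per-transition rates.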

### Activation Rate

**Definition**: The fraction of new users who reach the experience where they first realize the product's core value.

**Identifying the activation event**:
- Compare behavioral data for retained users versus churned users. What actions distinguish the two groups?
- The activation event should strongly predict long-term retention
- It should be reachable within the first session or first few days
- Examples: created a first project, invited a collaborator, completed the primary workflow, connected an external integration

**Operational use**:
- Track activation rate for every registration cohort
- Measure time-to-activation; shorter intervals typically correlate with better outcomes
- Design onboarding sequences that steer users toward the activation moment
- When testing onboarding changes, evaluate impact on downstream retention, not just activation rate in isolation
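
The first two operational items can be sketched for a single registration cohort as follows. The 7-day activation window is an assumption chosen for illustration, not a value from this document, and the function name and input shapes are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


def activation_summary(signup_at: dict[str, datetime],
                       activated_at: dict[str, datetime],
                       window: timedelta = timedelta(days=7),
                       ) -> tuple[Optional[float], Optional[float]]:
    """Return (activation rate, median hours to activation) for one cohort.

    A user counts as activated only if the activation event happened
    within `window` of signup. Users missing from `activated_at` never activated.
    """
    if not signup_at:
        return None, None
    hours = []
    for user, signed_up in signup_at.items():
        activated = activated_at.get(user)
        if activated is None:
            continue
        elapsed = activated - signed_up
        if timedelta(0) <= elapsed <= window:
            hours.append(elapsed.total_seconds() / 3600)
    rate = len(hours) / len(signup_at)
    return rate, (median(hours) if hours else None)
```

Because the rate depends on the chosen window, keep the window fixed when comparing cohorts or onboarding variants.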

## Goal-Setting Methodology

### OKR Framework (Objectives and Key Results)

**Objectives**: Qualitative, motivating statements of what the team aims to accomplish.
- Memorable and directionally inspiring
- Bounded to a time period (quarter or half)
- Focused on outcomes, not feature lists

**Key Results**: Quantitative evidence that the objective has been met.
- Specific, measurable, and time-bound
- Framed as outcomes rather than outputs
- Two to four Key Results per Objective

**Worked example**:
```
Objective: Become an essential part of our users' daily routine

Key Results:
- Raise DAU/MAU stickiness from 0.35 to 0.50
- Improve 30-day retention for new cohorts from 40% to 55%
- Achieve >80% task completion rate across three primary workflows
```

### OKR Operating Principles

- Aim for ambitious-but-plausible targets. Achieving roughly 70% of a stretch OKR signals proper calibration.
- Key Results measure user and business outcomes, not team output like features shipped or story points completed.
- Constrain scope: limiting the set to two or three Objectives, each with two to four Key Results, prevents dilution.
- If the team is confident of hitting every Key Result, ambition is too low.
- Conduct a mid-period checkpoint. Reallocate effort toward off-track Key Results if warranted.
- Score honestly at period's end: 0.0-0.3 = missed, 0.4-0.6 = partial progress, 0.7-1.0 = delivered.
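
One way to apply the scoring bands is to treat a Key Result score as linear progress from baseline to target. That linear rule is an assumption, not something this framework prescribes, and so is rounding to one decimal before looking up the band. Function names are illustrative.

```python
def score_key_result(baseline: float, target: float, actual: float) -> float:
    """Linear progress from baseline to target, clamped to the range [0, 1].

    Works for metrics that should fall as well as rise, because progress
    is measured in the direction of the target.
    """
    if target == baseline:
        raise ValueError("target must differ from baseline")
    progress = (actual - baseline) / (target - baseline)
    return max(0.0, min(1.0, progress))


def grade(score: float) -> str:
    """Map a score to the bands above after rounding to one decimal place."""
    rounded = round(score, 1)
    if rounded <= 0.3:
        return "missed"
    if rounded <= 0.6:
        return "partial progress"
    return "delivered"


# Using the worked example: stickiness target 0.35 -> 0.50, hypothetical actual 0.46.
score = score_key_result(baseline=0.35, target=0.50, actual=0.46)
print(f"{score:.2f} -> {grade(score)}")
```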

### Calibrating Metric Targets

- **Baseline**: Establish the current value with reliable measurement before committing to a target.
- **External benchmarks**: Reference what comparable products or industry reports indicate is achievable.
- **Existing trajectory**: If the metric already grows 5% per month, a target of 6% monthly growth barely exceeds the current trend and is not ambitious.
- **Planned investment**: Larger bets justify bolder targets.
- **Confidence bands**: Set a "commit" level (high confidence) and a "stretch" level (aspirational).
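
To make the trajectory check concrete, the sketch below compares a target against the value the metric would reach if its existing trend simply continued. It assumes the trend is a compounding month-over-month growth rate; the function names and the baseline of 1,000 are illustrative.

```python
def projected_value(baseline: float, monthly_growth: float, months: int) -> float:
    """Value after `months` of compounding growth at `monthly_growth`."""
    return baseline * (1 + monthly_growth) ** months


def uplift_over_trend(trend_growth: float, target_growth: float, months: int) -> float:
    """Relative gap between the target and the do-nothing projection."""
    return ((1 + target_growth) / (1 + trend_growth)) ** months - 1


# Hypothetical quarter: metric at 1,000 today, already growing 5% per month.
trend = projected_value(1_000, 0.05, 3)
target = projected_value(1_000, 0.06, 3)
gap = uplift_over_trend(0.05, 0.06, 3)
print(f"trend {trend:.0f}, target {target:.0f}, uplift over trend {gap:.1%}")
```

A small uplift over trend suggests the proposed value suits the "commit" level better than the "stretch" level.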

## Review Cadences

### Weekly Health Check

**Objective**: Detect anomalies early, monitor active experiments, maintain situational awareness.
**Duration**: 15-30 minutes.
**Participants**: Product manager, optionally the engineering lead.

**Agenda**:
- North Star metric: current value and week-over-week delta
- L1 indicators: flag any notable movements
- Active experiments: interim results and statistical power
- Anomaly scan: unexpected spikes, drops, or pattern breaks
- Triggered alerts: anything that crossed a monitoring threshold

**Outcome**: If something is off, open an investigation. Otherwise, log observations and proceed.

### Monthly Deep Dive

**Objective**: Assess trends in context, measure progress toward quarterly targets, identify strategic implications.
**Duration**: 30-60 minutes.
**Participants**: Product team and key stakeholders.

**Agenda**:
- Full L1 scorecard with month-over-month trends
- OKR progress: are Key Results on trajectory?
- Cohort health: are more recent cohorts outperforming earlier ones?
- Launch performance: how are recently shipped features tracking?
- Segment divergence: are any user segments behaving differently than expected?

**Outcome**: Identify one to three areas warranting deeper investigation or adjusted investment. Update priorities if metrics surface new insights.

### Quarterly Strategic Review

**Objective**: Evaluate the quarter holistically, set direction for the next period.
**Duration**: 60-90 minutes.
**Participants**: Product, engineering, design, and leadership.

**Agenda**:
- OKR final scoring for the quarter
- L1 trend analysis spanning the full quarter
- Year-over-year comparisons for context
- Competitive and market backdrop: relevant shifts and competitor moves
- Retrospective: what delivered expected results and what did not

**Outcome**: Set OKRs for the upcoming quarter. Recalibrate product strategy based on accumulated evidence.

## Dashboard Design

### Guiding Principles

A well-constructed dashboard answers "how is the product performing?" at a glance.

1. **Design from the decision backward**. Identify which decisions the dashboard informs before selecting metrics.

2. **Enforce visual hierarchy**. The highest-stakes metric gets the most prominent placement. North Star at the top, L1 indicators below, L2 detail accessible through drill-down.

3. **Always provide context**. A raw number in isolation conveys nothing. Pair every metric with: prior-period comparison, target value, and trend direction.

4. **Favor signal density over metric count**. Five to ten carefully chosen indicators outperform fifty superficial ones. Relegate the rest to a supplementary report.

5. **Standardize time windows**. Display all metrics over the same period. Mixing daily and monthly granularity on one screen breeds confusion.

6. **Use color for instant status**:
   - Green: on track or trending favorably
   - Yellow: warrants attention or trending flat
   - Red: off track or declining

7. **Every metric must be actionable**. If the team cannot influence a measurement, it does not earn a place on the product dashboard.
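
The color rule in principle 6 leaves room for interpretation, so it helps to write it down explicitly. The following is one possible encoding, not a prescribed one: any decline beyond a tolerance is red, a metric that is at target or improving is green, and everything else is yellow. The 1% tolerance and the function name are assumptions.

```python
def status_color(current: float, previous: float, target: float,
                 higher_is_better: bool = True,
                 flat_tolerance: float = 0.01) -> str:
    """Classify a metric as green, yellow, or red from its trend and target.

    `flat_tolerance` is the relative period-over-period change treated as flat.
    """
    sign = 1 if higher_is_better else -1
    on_target = sign * (current - target) >= 0
    # Relative change, oriented so that positive always means "improving".
    change = sign * (current - previous) / abs(previous) if previous else 0.0
    if change < -flat_tolerance:
        return "red"      # declining
    if on_target or change > flat_tolerance:
        return "green"    # on track or trending favorably
    return "yellow"       # flat and short of target
```

Setting `higher_is_better=False` covers metrics such as error rate or churn, where a falling value is the favorable direction.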

### Recommended Layout

**Row 1**: North Star metric with trend line and target overlay.

**Row 2**: L1 health scorecard -- current value, period change, target, and status indicator for each metric.

**Row 3**: Key funnels -- visual conversion funnel with drop-off rates at each stage.

**Row 4**: Experiment and launch tracker -- active tests with preliminary results, recent releases with early performance data.

**Drill-down layer**: L2 diagnostic metrics, segment breakdowns, and extended time-series charts for investigation.

### Dashboard Pitfalls

- **Vanity metrics**: Cumulative totals that only climb (all-time signups, lifetime page views) without indicating health
- **Metric overload**: Dashboards that require scrolling. If the dashboard does not fit on a single screen, trim the metric set.
- **Missing baselines**: Numbers shown without prior-period comparison or target reference
- **Abandoned dashboards**: Metrics that have not been reviewed or refreshed in months
- **Activity metrics masquerading as outcomes**: Measuring internal throughput (tickets closed, pull requests merged) instead of user and business results
- **One-size-fits-all views**: Executives, product managers, and engineers need different dashboards. A single view serves none of them well.

### Alerting Strategy

Configure automated alerts for metrics that demand prompt response:

- **Threshold alerts**: A metric breaches a predefined boundary (error rate exceeds 1%, conversion falls below 5%)
- **Trend alerts**: A metric shows sustained decline across multiple consecutive periods
- **Anomaly alerts**: A metric deviates significantly from its expected range
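
The three alert types can be sketched as simple predicates over a metric's recent history. The defaults here (three consecutive declines, a z-score cutoff of 3) are assumptions for illustration, as are the function names. Production systems often need seasonality-aware baselines instead of a plain mean and standard deviation.

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def threshold_alert(value: float, lower: Optional[float] = None,
                    upper: Optional[float] = None) -> bool:
    """True when the value falls below `lower` or rises above `upper`."""
    below = lower is not None and value < lower
    above = upper is not None and value > upper
    return below or above


def trend_alert(series: Sequence[float], periods: int = 3) -> bool:
    """True when the last `periods` period-over-period changes were all declines."""
    if len(series) < periods + 1:
        return False
    recent = series[-(periods + 1):]
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))


def anomaly_alert(history: Sequence[float], value: float,
                  z_threshold: float = 3.0) -> bool:
    """True when `value` is more than `z_threshold` standard deviations from the mean."""
    if len(history) < 2:
        return False
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold
```

For the examples in the list above, `threshold_alert(error_rate, upper=0.01)` and `threshold_alert(conversion, lower=0.05)` would express the two boundaries.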

**Alert hygiene practices**:
- Every alert must have a corresponding action plan. If nothing can be done, remove the alert.
- Review and recalibrate alerts periodically. Excessive false positives train teams to ignore all signals.
- Assign a designated responder for each alert category.
- Differentiate severity tiers. Not every alert warrants an emergency response.