ai-product-canvas
$ npx mdskill add mohitagw15856/pm-claude-skills/ai-product-canvas

Define AI products with the same rigour as any product decision, but with additional layers for data, model, evaluation, and responsible AI. This canvas prevents the most common AI product failure: building a technically impressive feature that doesn't solve a real problem.
SKILL.md
.github/skills/ai-product-canvas
---
name: ai-product-canvas
description: "Structure AI and ML product decisions with the rigour of any product decision. Use when building AI-powered features, evaluating LLM integrations, designing AI products, or assessing AI readiness. Produces a complete AI product canvas covering problem definition, model approach, data requirements, evaluation framework, UX design, responsible AI checklist, and launch monitoring plan."
---

# AI Product Canvas Skill

Define AI products with the same rigour as any product decision, but with additional layers for data, model, evaluation, and responsible AI. This canvas prevents the most common AI product failure: building a technically impressive feature that doesn't solve a real problem.

## AI Product Anti-Patterns to Check First

Before building, flag if any of these apply:

- ❌ "We should add AI to [existing feature]", with no user problem defined
- ❌ Accuracy target undefined before build begins
- ❌ No plan for what happens when the model is wrong
- ❌ User-facing AI output with no human review or fallback
- ❌ Training data not audited for bias or quality
- ❌ No evaluation metric: "we'll know it when we see it"

---

## AI Product Canvas Output Format

### AI Product Canvas - [Feature Name] - [Date]

**PM Owner:** [Name]
**ML/AI Lead:** [Name]
**Status:** Discovery / Design / Build / Evaluation / Live

---

#### 1. Problem Definition

**User problem being solved:**
> [What specific situation is the user in? What job are they trying to get done?]

**Why AI?**
> [What makes this problem require AI vs a deterministic solution? If the answer is "because we can," stop here.]

**Success for the user looks like:**
> [What outcome does the user experience when the AI feature is working well?]

---

#### 2. AI Approach

**Task type:**

- [ ] Classification
- [ ] Generation (text, image, code)
- [ ] Summarisation / extraction
- [ ] Recommendation
- [ ] Search / retrieval
- [ ] Prediction / forecasting
- [ ] Conversation / agent

**Model approach:**

- [ ] LLM API (GPT-4, Claude, Gemini, etc.); specify: [Model name + version]
- [ ] Fine-tuned model on own data
- [ ] Custom model trained from scratch
- [ ] RAG (retrieval-augmented generation)
- [ ] Embedding + vector search

**Rationale for chosen approach:** [Why this, not alternatives]

---

#### 3. Data Requirements

| Data Type | Source | Volume | Quality Status | Bias Risk |
|---|---|---|---|---|
| [Training data] | [Where it comes from] | [Volume] | [Audit status] | H/M/L |
| [Evaluation data] | [Where it comes from] | [Volume] | [Audit status] | H/M/L |

**Data gaps:** [What's missing and plan to get it]

**Privacy considerations:** [Any PII in training or inference data]

**Data ownership:** [Do we own this data? Can we use it for training?]

---

#### 4. Evaluation Framework

**Primary metric:** [The number that defines success: accuracy, F1, BLEU, user rating, task completion rate]

**Minimum acceptable threshold:** [Below X, the feature does not ship]

**Human evaluation plan:** [How will humans review model outputs? Sampling rate? Review panel?]

| Evaluation Type | Method | Cadence | Owner |
|---|---|---|---|
| Offline (pre-launch) | [Test set, benchmark] | Pre-launch | ML Lead |
| Online (post-launch) | [A/B test, user feedback] | Weekly | PM + ML |
| Adversarial | [Red-team, edge cases] | Pre-launch | Safety reviewer |

---

#### 5. User Experience Design

**How is AI output presented?**

- [ ] Direct output shown to user (high trust required)
- [ ] AI-assisted with user confirmation
- [ ] Suggestion user can accept/reject
- [ ] Background action with audit log

**Confidence and uncertainty handling:**

- What happens when confidence is low? [Show alternative, ask for clarification, fallback to manual]
- How is uncertainty communicated to the user? [UI pattern]

**Fallback plan:**

- If the model fails or returns an error: [Specific fallback behaviour]
- If accuracy degrades below threshold: [Kill switch or graceful degradation plan]

---

#### 6. Responsible AI Checklist

- [ ] Bias audit completed on training data
- [ ] Demographic fairness evaluated (does performance differ by user group?)
- [ ] Hallucination / confabulation risk assessed and mitigated
- [ ] User can see and correct AI output
- [ ] Opt-out mechanism exists (can user disable the AI feature?)
- [ ] Output provenance visible when relevant (does user know AI generated this?)
- [ ] PII not used in ways user didn't consent to
- [ ] Regulatory review completed (GDPR, AI Act, sector-specific)
- [ ] Model cards / documentation completed

---

#### 7. Launch & Monitoring Plan

**Rollout:** [% of users, with staged expansion criteria]

**Monitoring metrics:**

- Model performance: [Metric + alert threshold]
- User engagement with AI output: [Acceptance rate, override rate, feedback score]
- Error rate: [% of failed inferences]
- Latency: [P95 target]

**Model refresh cadence:** [How often is the model retrained or updated?]

**Drift detection:** [How will you know when model performance degrades in production?]

---

## Guidelines

- Never skip the "Why AI?" section; it's the most important question in AI product development
- The fallback UX is not optional: what happens when AI fails defines your product's trustworthiness
- Responsible AI checklist must be completed before launch, not after
- Include latency in success metrics: a 5-second AI response is often worse than no AI at all
- Recommend starting with a human-in-the-loop design and automating only when accuracy is proven

## Required Inputs

Ask the user for these if not provided:

- **Feature or product description** (what the AI is intended to do)
- **User problem** (what problem the AI is solving for users)
- **Available data** (what training/inference data exists)
- **ML/AI lead** (who owns the technical implementation)

## Quality Checks

- [ ] "Why AI?" is answered clearly (not "because we can")
- [ ] Minimum acceptable accuracy threshold is defined before build begins
- [ ] Fallback UX is specified for model failures or low-confidence outputs
- [ ] Responsible AI checklist is completed (not deferred to post-launch)
- [ ] Monitoring plan includes both model performance and user engagement metrics
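
## Example: Turning Canvas Thresholds into Checks

The ship threshold (section 4), the low-confidence fallback (section 5), and the monitoring metrics (section 7) are easier to hold a team to once they exist as code. The Python below is a minimal sketch and not part of the skill itself. The names `LaunchGate`, `route_output`, `acceptance_rate`, and `p95_latency_ms`, the 0.7 confidence cutoff, the three routing labels, and the nearest-rank percentile method are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LaunchGate:
    """Section 4: primary metric plus the threshold below which the feature does not ship."""

    primary_metric: str
    min_threshold: float

    def passes(self, offline_score: float) -> bool:
        # Ship only if the offline evaluation meets or exceeds the agreed threshold.
        return offline_score >= self.min_threshold


def route_output(confidence: Optional[float], low_confidence_cutoff: float = 0.7) -> str:
    """Section 5: choose how to present a model output.

    `confidence=None` stands for a failed inference (error or timeout).
    The cutoff and the returned labels are hypothetical.
    """
    if confidence is None:
        return "fallback_to_manual"
    if confidence < low_confidence_cutoff:
        return "ask_user_to_confirm"
    return "show_suggestion"


def acceptance_rate(accepted: int, overridden: int) -> float:
    """Section 7: share of AI outputs that users accepted rather than overrode."""
    total = accepted + overridden
    if total == 0:
        raise ValueError("no reviewed outputs yet")
    return accepted / total


def p95_latency_ms(samples: Sequence[float]) -> float:
    """Section 7: P95 latency using the nearest-rank method (an assumed convention)."""
    if not samples:
        raise ValueError("no latency samples")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


if __name__ == "__main__":
    gate = LaunchGate(primary_metric="f1", min_threshold=0.85)
    print(gate.passes(0.88))                    # True
    print(route_output(0.55))                   # ask_user_to_confirm
    print(route_output(None))                   # fallback_to_manual
    print(acceptance_rate(80, 20))              # 0.8
    print(p95_latency_ms(list(range(1, 101))))  # 95
```

Keeping the threshold in one `LaunchGate` value means it has to be written down before build begins, which is what the Quality Checks ask for. The routing function starts from the human-in-the-loop default recommended in the Guidelines: only high-confidence outputs are shown as suggestions, and a failed inference always falls back to the manual path.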