ab-test-stats
$ npx mdskill add guia-matthieu/clawfu-skills/ab-test-stats
Calculates statistical significance for A/B tests to support data-driven decisions on results and planning.
- Helps determine if test outcomes are significant and plan sample sizes or durations for experiments.
- Integrates with Python libraries like scipy and numpy for statistical computations.
- Uses structured analysis frameworks to identify patterns and calculate statistical measures from input data.
- Presents results through command-line outputs and visualization templates for clear interpretation.
SKILL.md
.github/skills/ab-test-stats
---
name: ab-test-stats
description: "Calculate A/B test statistical significance. Use when: determining if test results are significant; calculating required sample size; estimating test duration; analyzing conversion experiments; making data-driven decisions"
license: MIT
metadata:
  author: ClawFu
  version: 1.0.0
  mcp-server: "@clawfu/mcp-skills"
---

# A/B Test Statistics Calculator

> Calculate statistical significance for A/B tests - know when your results are real, not random chance.

## When to Use This Skill

- **Test analysis** - Determine if results are statistically significant
- **Sample planning** - Calculate required sample size before testing
- **Duration estimation** - Know how long to run experiments
- **Power analysis** - Ensure tests can detect meaningful differences

## What Claude Does vs What You Decide

| Claude Does | You Decide |
|-------------|------------|
| Structures analysis frameworks | Metric definitions |
| Identifies patterns in data | Business interpretation |
| Creates visualization templates | Dashboard design |
| Suggests optimization areas | Action priorities |
| Calculates statistical measures | Decision thresholds |

## Dependencies

```bash
pip install scipy numpy click
```

## Commands

### Check Significance

```bash
python scripts/main.py significance --control 1000,50 --variant 1000,65
python scripts/main.py significance --control 5000,250 --variant 5000,300 --confidence 0.99
```

### Calculate Sample Size

```bash
python scripts/main.py sample-size --baseline 0.05 --mde 0.02
python scripts/main.py sample-size --baseline 0.10 --mde 0.01 --power 0.90
```

### Estimate Duration

```bash
python scripts/main.py duration --traffic 1000 --baseline 0.05 --mde 0.02
```

## Examples

### Example 1: Analyze Test Results

```bash
# Control: 1000 visitors, 50 conversions (5%)
# Variant: 1000 visitors, 65 conversions (6.5%)
python scripts/main.py significance --control 1000,50 --variant 1000,65

# Output:
# A/B Test Results
# ─────────────────────────
# Control: 5.00% (50/1000)
# Variant: 6.50% (65/1000)
# Lift: +30.0%
#
# Statistical Analysis
# ─────────────────────────
# p-value: 0.089
# Confidence: 91.1%
# Result: NOT SIGNIFICANT (need 95%)
#
# Recommendation: Continue test for more data
```

### Example 2: Plan Sample Size

```bash
# Baseline 5% conversion, want to detect 20% relative lift (1% absolute)
python scripts/main.py sample-size --baseline 0.05 --mde 0.01

# Output:
# Sample Size Calculator
# ──────────────────────────────
# Baseline conversion: 5.0%
# Minimum detectable effect: 1.0% (20% relative)
# Target conversion: 6.0%
#
# Required per variant: 3,842 visitors
# Total required: 7,684 visitors
#
# At 1000 daily visitors: ~8 days
```

## Key Concepts

| Term | Definition |
|------|------------|
| **p-value** | Probability of seeing a difference at least this large if control and variant truly performed the same |
| **Confidence** | 1 - p-value (usually want 95%+) |
| **Power** | Probability of detecting a real effect (usually 80%) |
| **MDE** | Minimum Detectable Effect - smallest lift worth detecting |
| **Lift** | Relative improvement: (variant - control) / control |

## When Results Are Significant

| p-value | Confidence | Verdict |
|---------|------------|---------|
| < 0.01 | > 99% | Highly Significant ✓ |
| < 0.05 | > 95% | Significant ✓ |
| < 0.10 | > 90% | Marginally Significant |
| ≥ 0.10 | < 90% | Not Significant ✗ |

## Skill Boundaries

### What This Skill Does Well

- Structuring data analysis
- Identifying patterns and trends
- Creating visualization frameworks
- Calculating statistical measures

### What This Skill Cannot Do

- Access your actual data
- Replace statistical expertise
- Make business decisions
- Guarantee prediction accuracy

## Related Skills

- [cohort-analysis](../cohort-analysis/) - Analyze user cohorts
- [funnel-analyzer](../funnel-analyzer/) - Analyze conversion funnels

## Skill Metadata

- **Mode**: centaur

```yaml
category: analytics
subcategory: statistics
dependencies: [scipy, numpy]
difficulty: intermediate
time_saved: 3+ hours/week
```
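
As a rough illustration of the calculations behind the `significance`, `sample-size`, and `duration` commands, the sketch below implements a textbook two-sided pooled two-proportion z-test and the usual normal-approximation sample-size formula. It is not taken from `scripts/main.py`: the function names (`two_proportion_z_test`, `sample_size_per_variant`, `duration_days`) are hypothetical, and it relies on the Python standard library rather than scipy/numpy so that it runs without extra dependencies. Since the script's exact method is not documented here, its figures may differ from the sample outputs in the examples.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Standard normal distribution, used for both p-values and critical values.
_STD_NORMAL = NormalDist()


def two_proportion_z_test(control, variant):
    """Two-sided pooled z-test for a difference in conversion rates.

    `control` and `variant` are (visitors, conversions) tuples, mirroring the
    --control / --variant arguments. Returns (relative_lift, z, p_value).
    Assumes at least one conversion and one non-conversion overall.
    """
    n_c, x_c = control
    n_v, x_v = variant
    p_c, p_v = x_c / n_c, x_v / n_v
    # Pooled rate under the null hypothesis of no difference.
    pooled = (x_c + x_v) / (n_c + n_v)
    se = sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_v))
    z = (p_v - p_c) / se
    p_value = 2 * (1 - _STD_NORMAL.cdf(abs(z)))
    lift = (p_v - p_c) / p_c  # (variant - control) / control
    return lift, z, p_value


def sample_size_per_variant(baseline, mde, alpha=0.05, power=0.80):
    """Visitors per variant needed to detect an absolute lift of `mde`."""
    target = baseline + mde
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = _STD_NORMAL.inv_cdf(power)
    variance = baseline * (1 - baseline) + target * (1 - target)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)


def duration_days(daily_traffic, n_per_variant, variants=2):
    """Whole days needed to collect the total sample at the given traffic."""
    return ceil(n_per_variant * variants / daily_traffic)


if __name__ == "__main__":
    lift, z, p = two_proportion_z_test((1000, 50), (1000, 65))
    print(f"lift={lift:+.1%}  z={z:.3f}  p-value={p:.3f}")
    print("per variant:", sample_size_per_variant(0.05, 0.01))
    print("days:", duration_days(1000, 3842))
```

Pooling the two rates for the standard error is one common choice for a significance test; an unpooled standard error or a chi-square test are equally reasonable alternatives and give slightly different p-values.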
More from guia-matthieu/clawfu-skills
- aarrr-metrics: Measure and optimize growth using the AARRR (Pirate Metrics) framework with stage-specific KPIs and funnel analysis
- account-health: Assess customer account health using product usage, support sentiment, payment status, and relationship signals
- ad-spend-optimizer: "Analyze paid advertising performance across channels and recommend budget reallocation to maximize ROAS and minimize CAC. Use when: planning quarterly ad budget allocation, diagnosing underperforming ad channels, deciding whether to scale spend on a channel, calculating marginal ROI across Google Ads, Meta, LinkedIn, or TikTok, rebalancing media mix after performance shifts, or setting up a test-and-scale framework for new channels."
- ai-bot-log-audit: Use when analyzing server logs to understand how AI crawlers (GPTBot, ClaudeBot, PerplexityBot) interact with your site. Use when optimizing content placement for LLM retrieval, diagnosing why AI search isn't citing your content, or auditing crawl patterns to find optimization gaps.
- ai-storyboard-2x2: "Create visually consistent storyboards using PJ Ace's 2x2 Grid Shots technique, ensuring uniform lighting, characters, and sets across shots. Use when: **After finalizing a video script** - Turn the concept into visuals; **Need for visual consistency** - Consistent characters and lighting across shots; **Preparing assets for animation** - Frames ready for Veo, Runway, Kling; **Presenting a storyboard to a client** - Visualization before production;..."
- ai-video-concept: "Develop a creative idea and structure a video script optimized for AI generation, following PJ Ace's 8-second scene method. Use when: **Starting an AI video ad** - Turn a raw idea into a structured script; **Creating video content for social media** - TikTok, Reels, YouTube Shorts; **Developing a campaign concept** - Before moving on to the storyboard; **Pitching a video idea** - Present a concept to a client or a team; **Adapting a messag..."
- ai-video-prompting: "Generate optimized prompts for each AI video generation model (Veo 3, Runway Gen-3, Kling 2.6, Pika), playing to their specific strengths. Use when: **Animating storyboard frames** - Turn still images into video; **Choosing the right model** - Select Veo, Runway, Kling, or Pika depending on the need; **Optimizing generation quality** - Structured prompts for better results; **Creating smooth transitions** - Scene extension, first/last frame; **Using the mo..."
- ai-video-qa: "Validate the quality of your AI videos before publishing with a complete checklist covering technical, creative, and brand positioning. Use when: **Before publishing** - Final validation before going live; **Client review** - Prepare anticipated feedback points; **Quality iteration** - Identify issues to fix; **Go/No-Go decision** - Decide whether the video is ready; **Post-mortem** - Analyze why a video did (or did not) perform"
- ai-voice-design: "Design and generate AI voices for your videos using ElevenLabs or Qwen3-TTS, with voice cloning, design by description, and lip-sync. Use when: **Creating a brand voice** - Define the vocal tone for a campaign; **Cloning an existing voice** - Reproduce a voice with permission; **Designing an original voice** - Create a voice from a description; **Multi-character** - Manage several voices in a single video; **AI video lip-sync** - Synchronize voice..."
- audience-research: "Discover where your audience actually pays attention online using Rand Fishkin's behavioral intelligence methodology, going beyond demographics to actionable media affinity data. Use when: **Find where to reach your audience** beyond Google and Facebook ads; **Discover podcasts, YouTube channels, and publications** your audience follows; **Identify influencers and accounts** with real audience overlap; **Plan PR and media outreach** with data-backed target lists; **Improve ad targeting** on YouTube,..."