integration-e2e-testing
$
npx mdskill add shinpr/claude-code-workflows/integration-e2e-testingDesigns integration and E2E tests by applying behavior-first principles and ROI calculations for quality reviews.
- Helps with selecting high-ROI test candidates and defining test limits per feature.
- Integrates with Playwright for UI spec-driven E2E test architecture.
- Decides recommendations based on user-observable behavior and redirects details to other test types.
- Presents results through structured test skeletons and review criteria for implementation.
SKILL.md
.github/skills/integration-e2e-testingView on GitHub ↗
---
name: integration-e2e-testing
description: Integration and E2E test design principles, ROI calculation, test skeleton specification, and review criteria. Use when designing integration tests, E2E tests, or reviewing test quality.
---
# Integration and E2E Testing Principles
## References
**E2E test design with Playwright**: See [references/e2e-design.md](references/e2e-design.md) for UI Spec-driven E2E test candidate selection and Playwright test architecture.
## Test Type Definition and Limits
| Test Type | Purpose | Scope | Limit per Feature | Implementation Timing |
|-----------|---------|-------|-------------------|----------------------|
| Integration | Verify component interactions | Partial system integration | MAX 3 | Created alongside implementation |
| E2E | Verify critical user journeys | Full system | MAX 1-2 | Executed in final phase only |
## Behavior-First Principle
### Include (High ROI)
- Business logic correctness (calculations, state transitions, data transformations)
- Data integrity and persistence behavior
- User-visible functionality completeness
- Error handling behavior (what user sees/experiences)
### Redirect to Other Test Types
- External service connections → Verify via contract/interface tests
- Performance metrics → Verify via dedicated load testing
- Implementation details → Verify observable behavior instead
- UI layout specifics → Verify information availability instead
**Principle**: Test = User-observable behavior verifiable in isolated CI environment
## ROI Calculation
ROI is used to **rank candidates within the same test type** (integration candidates against each other, E2E candidates against each other). Cross-type comparison is unnecessary because integration and E2E budgets are selected independently.
```
ROI Score = Business Value × User Frequency + Legal Requirement × 10 + Defect Detection
(range: 0–120)
```
Higher ROI Score = higher priority within its test type. No normalization or capping is applied — the raw score is used directly for ranking. Deduplication is a separate step that removes candidates entirely; it does not modify scores.
### ROI Threshold for E2E
E2E tests have high ownership cost (creation, execution, and maintenance are each 3-10× higher than integration tests). To justify creation, an E2E candidate (beyond the must-keep reserved slot) requires **ROI Score > 50**.
### ROI Calculation Examples
| Scenario | BV | Freq | Legal | Defect | ROI Score | Test Type | Selection Outcome |
|----------|----|------|-------|--------|-----------|-----------|-------------------|
| Core checkout flow | 10 | 9 | true | 9 | 109 | E2E | Selected (reserved slot: user-facing multi-step journey) |
| Payment error handling | 8 | 3 | false | 7 | 31 | E2E | Below threshold (31 < 50), not selected |
| Profile save flow | 7 | 6 | false | 6 | 48 | E2E | Below threshold (48 < 50), not selected |
| DB persistence check | 8 | 8 | false | 8 | 72 | Integration | Selected (rank 1 of 3) |
| Error message display | 5 | 3 | false | 4 | 19 | Integration | Selected (rank 2 of 3) |
| Optional filter toggle | 3 | 4 | false | 2 | 14 | Integration | Not selected (rank 4, budget full) |
## Multi-Step User Journey Definition
A feature qualifies as containing a **multi-step user journey** when ALL of the following are true:
1. **2+ distinct interaction boundaries** are traversed in sequence to complete a user goal. What counts as a boundary depends on the system type:
- Web: distinct routes/pages
- Mobile native: distinct screens/views
- CLI: distinct command invocations or interactive prompts
- API: distinct API calls forming a transaction (e.g., create → confirm → finalize)
2. **State carries across steps** — data produced or actions taken in one step affect what the next step accepts or displays
3. **The journey has a completion point** — a final state the user or caller reaches (e.g., confirmation page, saved record, API success response, completed workflow)
### User-Facing vs Service-Internal Journeys
Multi-step journeys are further classified for E2E budget decisions:
| Classification | Condition | E2E Reserved Slot | Example |
|---|---|---|---|
| **User-facing** | A human user directly triggers and observes the steps (via UI, CLI, or direct API interaction) | Eligible | Web checkout flow, CLI setup wizard, mobile onboarding |
| **Service-internal** | Steps are triggered by backend services without direct user interaction | Not eligible (use integration tests) | Async job pipeline, service-to-service saga, scheduled batch processing |
This classification applies only to the reserved E2E slot and the E2E Gap Check. Service-internal journeys are still valid E2E candidates through the normal ROI > 50 path if they warrant full-system verification.
Use this definition when evaluating E2E test candidates and E2E gap detection.
## Test Skeleton Specification
### Required Comment Patterns
Each test MUST include the following annotations:
```
AC: [Original acceptance criteria text]
Behavior: [Trigger] → [Process] → [Observable Result]
@category: core-functionality | integration | edge-case | e2e
@dependency: none | [component names] | full-system
@complexity: low | medium | high
ROI: [score]
```
Use the project's comment syntax to wrap these annotations (e.g., `//` for C-family, `#` for Python/Ruby/Shell).
### Verification Items (Optional)
When verification points need explicit enumeration:
```
Verification items:
- [Item 1]
- [Item 2]
```
## EARS Format Mapping
| EARS Keyword | Test Type | Generation Approach |
|--------------|-----------|---------------------|
| **When** | Event-driven | Trigger event → verify outcome |
| **While** | State condition | Setup state → verify behavior |
| **If-then** | Branch coverage | Both condition paths verified |
| (none) | Basic functionality | Direct invocation → verify result |
## Test File Naming Convention
- Integration tests: `*.int.test.*` or `*.integration.test.*`
- E2E tests: `*.e2e.test.*`
The test runner or framework in the project determines the appropriate file extension.
## Review Criteria
### Skeleton and Implementation Consistency
| Check | Failure Condition |
|-------|-------------------|
| Behavior Verification | No assertion for "observable result" in skeleton |
| Verification Item Coverage | Listed items not all covered by assertions |
| Mock Boundary | Internal components mocked in integration test |
### Implementation Quality
| Check | Failure Condition |
|-------|-------------------|
| AAA Structure | Arrange/Act/Assert separation unclear |
| Independence | State sharing between tests, order dependency |
| Reproducibility | Date/random dependency, varying results |
| Readability | Test name doesn't match verification content |
## Quality Standards
### Required
- Each test verifies one behavior
- Clear AAA (Arrange-Act-Assert) structure
- No test interdependencies
- Deterministic execution
More from shinpr/claude-code-workflows
- ai-development-guideTechnical decision criteria, anti-pattern detection, debugging techniques, and quality check workflow. Use when making technical decisions, detecting code smells, or performing quality assurance.
- coding-principlesLanguage-agnostic coding principles for maintainability, readability, and quality. Use when implementing features, refactoring code, or reviewing code quality.
- documentation-criteriaDocumentation creation criteria including PRD, ADR, Design Doc, and Work Plan requirements with templates. Use when creating or reviewing technical documents, or determining which documents are required.
- frontend-ai-guideFrontend-specific technical decision criteria, anti-patterns, debugging techniques, and quality check workflow. Use when making frontend technical decisions or performing quality assurance.
- implementation-approachImplementation strategy selection framework. Use when planning implementation strategy, selecting development approach, or defining verification criteria.
- recipe-add-integration-testsAdd integration/E2E tests to existing codebase using Design Docs
- recipe-buildExecute decomposed tasks in autonomous execution mode
- recipe-designExecute from requirement analysis to design document creation
- recipe-diagnoseInvestigate problem, verify findings, and derive solutions
- recipe-front-buildExecute frontend implementation in autonomous execution mode