ai-test
$ npx mdskill add arcasilesgroup/ai-engineering/ai-test
Enforce TDD and generate tests across multiple programming languages.
- Executes test-first development cycles for new features.
- Analyzes coverage gaps and missing edge cases in codebases.
- Determines test strategy based on project language and requirements.
- Delivers executable test files with immediate execution results.
SKILL.md
.github/skills/ai-test
---
name: ai-test
description: Writes tests, enforces TDD (RED-GREEN-REFACTOR), analyzes coverage gaps, defines test strategy across Python, TypeScript, .NET, Rust, Go. Trigger for 'add tests for', 'write a test', 'I need 80 percent coverage', 'plan my test approach', 'TDD this'. Not for failing tests where the fix is unclear; use /ai-debug instead. Not for AI reliability over time; use /ai-reliability-eval instead.
effort: mid
argument-hint: "plan|run|gap|tdd [target]"
mode: agent
model_tier: sonnet
mirror_family: copilot-skills
generated_by: ai-eng sync
canonical_source: .claude/skills/ai-test/SKILL.md
edit_policy: generated-do-not-edit
---

# Test

## Purpose

TDD enforcement and testing skill. Tests are executable specifications -- they define what the system does before the system does it. Maximum confidence per minute of developer time.

## When to Use

- `tdd`: driving new features test-first (RED-GREEN-REFACTOR)
- `run`: writing and executing tests for existing code
- `gap`: analyzing coverage gaps and missing edge cases
- `plan`: designing test strategy before writing tests

## Process

Step 0 (load contexts): read `.ai-engineering/manifest.yml` `providers.stacks`; load `.ai-engineering/overrides/<stack>/conventions.md` for each stack and `.ai-engineering/overrides/_shared/conventions.md`; load `.ai-engineering/team/*.md` for team conventions.

### Mode: tdd (RED-GREEN-REFACTOR)

For TDD mode, follow `handlers/tdd.md` for the full RED-GREEN-REFACTOR flow.

### Mode: run

1. Detect test framework from project files
2. Follow existing conventions (directory structure, naming, fixtures)
3. Write tests using AAA pattern with descriptive names
4. Run with stack-appropriate command
5. Report results: pass/fail count, coverage delta

### Mode: gap

1. Run coverage tool with branch coverage enabled
2. Identify untested critical paths (business logic > glue code)
3. Check for missing edge cases: null, empty, boundary, error paths
4. Produce gap report with prioritized recommendations

### Mode: plan

1. Map the testing surface (modules, public APIs, critical paths)
2. Assign test categories: unit, integration, e2e
3. Define coverage targets per module
4. Identify infrastructure needs (test containers, fixtures, fakes)

## Stack Commands

| Stack | Runner | Coverage | Async |
|-------|--------|----------|-------|
| Python | `uv run pytest` | `pytest-cov` (branch=true) | `asyncio_mode = "auto"` |
| TypeScript | `vitest` or `jest` | `c8` / `istanbul` | `async/await` |
| .NET | `dotnet test` + xUnit | `coverlet` | `async Task` |
| Rust | `cargo test` | `cargo tarpaulin` | `#[tokio::test]` |
| Go | `go test ./...` | `go test -cover` | goroutine tests |

## Testing Rules

**Fakes over mocks**. Mocks test implementation details. Fakes implement the same interface. Mocks are acceptable ONLY for:

1. Verifying something was NOT called
2. Simulating transient errors for retry logic
3. Third-party libraries (but wrap in your own adapter first)

**AAA pattern** (non-negotiable):

```python
# Arrange -- set up inputs and dependencies
# Act -- call the function under test
# Assert -- verify the outcome
```

**Name pattern**: `test_<unit>_<scenario>_<expected_outcome>`

- Good: `test_parse_email_rejects_missing_at_symbol`
- Bad: `test_parse_email`, `test_1`, `test_it_works`

## Anti-Patterns (Reject These)

| Anti-Pattern | Why It Fails |
|-------------|-------------|
| Testing the mock | Proves the mock works, not the code |
| No-op test (assert True) | Tests nothing, inflates coverage |
| Testing implementation | Breaks on refactor, proves nothing about behavior |
| Huge test setup | Design is too coupled -- simplify the interface |
| sleep() for sync | Flaky -- use events, barriers, wait_for |
| Exact float comparison | Flaky -- use approx/closeTo |

## Iron Law

If tests are wrong, escalate to the user. NEVER weaken, skip, or modify tests to make implementation easier -- tests are the contract; bending them hides bugs. "Tests are wrong" means the requirement changed -- not that passing them is hard.

## Common Mistakes

- Writing tests after implementation (tests-after prove what IS, not what SHOULD be)
- Testing private methods (test the public API)
- 100% coverage with meaningless assertions
- Skipping edge cases (null, empty, boundary, concurrent access)
- Not running ALL tests after changes

## Handlers

| Handler | File | Activation |
|---------|------|-----------|
| E2E Testing | `handlers/e2e.md` | Activated when `*.spec.ts`, `playwright.config.ts`, or `e2e/` directory detected |
| TDD Mode | `handlers/tdd.md` | Activated when `mode=tdd` |

## Examples

### Example 1 -- TDD a new feature

User: "I'm building a JWT validator. Walk me through TDD."

```
/ai-test tdd jwt-validator
```

RED: writes failing tests for valid token, expired token, malformed signature. Confirms FAIL for the expected reason. GREEN: hands off to `ai-build` for minimal implementation. REFACTOR: stays green.

### Example 2 -- coverage gap analysis

User: "where am I light on tests?"

```
/ai-test gap
```

Runs the stack-specific coverage tool, ranks files by coverage delta, suggests the highest-leverage test to add next.

## Integration

- Called by: `/ai-build` (build tasks), `/ai-build` (TDD mode), user directly.
- Calls: stack-specific test runners.
- See also: `/ai-debug`, `/ai-verify`, `/ai-reliability-eval`.

$ARGUMENTS
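The Testing Rules in the skill above (fakes over mocks, the AAA pattern, and the `test_<unit>_<scenario>_<expected_outcome>` naming) can be illustrated with a minimal Python sketch. `EmailSender`, `FakeEmailSender`, `parse_email`, and `send_welcome` are hypothetical names invented for this example; they are assumptions, not part of the skill or its repository.

```python
from dataclasses import dataclass, field
from typing import Protocol


class EmailSender(Protocol):
    """Hypothetical interface that the code under test depends on."""

    def send(self, to: str, subject: str) -> None: ...


@dataclass
class FakeEmailSender:
    """In-memory fake: same interface as the real sender, records each call."""

    sent: list[tuple[str, str]] = field(default_factory=list)

    def send(self, to: str, subject: str) -> None:
        self.sent.append((to, subject))


def parse_email(raw: str) -> str:
    """Hypothetical unit under test: normalise an address or raise ValueError."""
    candidate = raw.strip().lower()
    if "@" not in candidate:
        raise ValueError(f"missing @ symbol: {raw!r}")
    return candidate


def send_welcome(raw_address: str, sender: EmailSender) -> None:
    """Hypothetical unit under test: validate the address, then send one message."""
    sender.send(parse_email(raw_address), "Welcome")


def test_parse_email_rejects_missing_at_symbol() -> None:
    # Arrange -- an address with no "@"
    raw = "alice.example.com"
    # Act -- call the function under test and capture whether it raised
    # (under pytest, `with pytest.raises(ValueError):` is the usual form)
    try:
        parse_email(raw)
        raised = False
    except ValueError:
        raised = True
    # Assert -- the invalid address was rejected
    assert raised


def test_send_welcome_valid_address_sends_one_message() -> None:
    # Arrange -- a fake instead of a mock, so only observable behavior is checked
    sender = FakeEmailSender()
    # Act
    send_welcome("  Alice@Example.com ", sender)
    # Assert -- exactly one message, addressed to the normalised address
    assert sender.sent == [("alice@example.com", "Welcome")]
```

Because the fake records outcomes rather than call expectations, `send_welcome` can be refactored internally without touching either test, which is the property the "fakes over mocks" rule is after.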
More from arcasilesgroup/ai-engineering
- ai-advise: Proactive governance advisor that checks standards, decisions, and quality trends during development. Always advisory, NEVER blocks. Three modes: `advise` (post-edit), `gate` (pre-dispatch), `drift` (on-demand decision audit). Trigger for 'governance check', 'advise on this change', 'check for drift', 'is this aligned with active decisions', 'shift-left advisory'. Not for blocking gates; use /ai-verify. Not for narrative code review; use /ai-review.
- ai-analyze-permissions: Use when Claude Code keeps asking to approve commands you have already approved, when settings.local.json has grown large, or when you want to consolidate permission grants into wildcard patterns. Trigger for 'too many permission prompts', 'clean up permissions', 'audit my settings', 'consolidate allow rules'. Claude Code only; not available in GitHub Copilot, Antigravity, or Codex.
- ai-animation: Designs motion, transitions, and micro-interactions for UI components: spring animations, gestures, easing, staggers (taste-driven detail compounding). Trigger for 'animate this', 'add transitions', 'micro-interactions for', 'gesture design', 'swipe to dismiss', 'easing for this', 'stagger the'. Not for design systems; use /ai-design instead. Not for visual art; use /ai-visual instead. Not for testing animation code; use /ai-test instead.
- ai-autopilot: Delivers large multi-concern specs and backlog runs autonomously: decomposes specs into sub-specs (or normalizes work items into a backlog DAG), deep-plans with parallel agents, builds a dependency DAG, implements in waves, runs a single final quality loop with one bounded quality-remediation pass (verify+guard+review on full changeset), delivers via PR. Trigger for 'implement spec-NNN end to end', 'autopilot this', 'autonomous delivery', 'decompose and ship', 'run the backlog', 'execute these GitHub issues', 'process the sprint backlog'. Invocation is the approval gate. Not for small or single-concern tasks; use /ai-build instead. Not for ambiguous requirements; use /ai-brainstorm first.
- ai-board: Operates the project board (GitHub Projects v2 or Azure DevOps): discovers configuration after install (fields, state mappings, process templates) and syncs work-item state at lifecycle transitions. Trigger for 'set up the board', 'configure our ADO board', 'discover board fields', 'move this issue to in-review', 'update the board', 'mark as in progress', 'sync the work item state'. Two subcommands: `discover` (post-install configuration write) and `sync` (lifecycle state transitions). Auto-invoked via `sync` by /ai-brainstorm, /ai-build, and /ai-pr; fail-open. Not for backlog execution; use /ai-autopilot --backlog instead.
- ai-brainstorm: Forces rigorous design interrogation BEFORE any code: explores approaches, surfaces ambiguity, gathers evidence, produces an approved spec that becomes the contract for /ai-plan. Trigger for 'lets add X', 'how should we handle Y', 'whats the best approach', 'I am thinking about', 'what should we build for'. Not for existing approved specs; use /ai-plan instead. Not for execution; use /ai-build instead.
- ai-branch-cleanup: Cleans branches safely: switches to the default branch, prunes merged and squash-merged branches, syncs to remote, sweeps stale specs, rotates `.ai-engineering/runtime/` per retention policy. Trigger for 'tidy up', 'tidy branches', 'sync to main', 'delete old branches', 'start fresh', 'rotate runtime'. Auto-invoked by /ai-pr after merge. Not for committing changes; use /ai-commit instead. Not for code-level dead-code removal; use /ai-simplify instead.
- ai-build: Canonical implementation gateway: reads approved plan.md, resolves stack from manifest, deterministic-routes each task to its adapter, dispatches the build agent in an isolated worktree, runs TDD self-validation per task, then a single final quality loop with one bounded quality-remediation pass on the full changeset before /ai-pr. Trigger for 'go', 'start building', 'execute the plan', 'implement it', 'lets do this', 'build the plan', 'resume', 'continue'. Not without an approved plan; run /ai-plan first. Not for multi-concern specs needing decomposition; use /ai-autopilot instead. Not for a single function or subcomponent; use /ai-code.
- ai-code: Writes production code that satisfies stack-context standards on the first pass: interface-first design, backward-compatibility checks, lightweight self-review. Trigger for 'implement this', 'write the code for', 'add X to Y', 'build this function', 'make this work'. Not for tests; use /ai-test instead. Not for debugging; use /ai-debug instead. Not for refactoring; use /ai-simplify instead. Not for executing an approved plan end-to-end; use /ai-build (the gateway).
- ai-commit: Runs the governed commit pipeline: auto-branches from protected, stages selectively, formats and lints, scans for secrets, gates docs, composes a conventional message, pushes. Trigger for 'commit my changes', 'save my work', 'push this to remote', 'stage these files', 'ship it'. Not for opening a PR; use /ai-pr instead. Not for branch hygiene; use /ai-branch-cleanup instead.