---
name: hypothesis
description: Use Hypothesis for property-based testing to automatically generate comprehensive test cases, find edge cases, and write more robust tests with minimal example shrinking. Includes Polars parametric testing integration.
---
# Hypothesis - Property-Based Testing
Property-based testing framework that generates test cases automatically, finds minimal failing examples through shrinking, and verifies invariants.
**Official Docs:** https://hypothesis.readthedocs.io/en/latest/
**Key Features:**
- Automatic test data generation from strategies
- Minimal failing example shrinking
- Stateful testing with rule-based state machines
- pytest integration
- Deterministic reproducibility
## Quick Start
```python
from hypothesis import given
from hypothesis import strategies as st


@given(st.integers())
def test_property(x):
    """Test properties that should always hold."""
    assert abs(x) >= 0


@given(st.lists(st.integers()))
def test_list_property(lst):
    sorted_lst = sorted(lst)
    assert len(sorted_lst) == len(lst)
    # Check monotonic property
    for i in range(len(sorted_lst) - 1):
        assert sorted_lst[i] <= sorted_lst[i + 1]
```
## Strategies
**Full reference:** https://hypothesis.readthedocs.io/en/latest/data.html
Common strategies:
- Primitives: `st.integers()`, `st.floats()`, `st.text()`, `st.booleans()`
- Collections: `st.lists()`, `st.dictionaries()`, `st.tuples()`, `st.sets()`
- Dates/Times: `st.dates()`, `st.datetimes()`, `st.timedeltas()`
- Combinators: `st.one_of()`, `st.sampled_from()`, `st.recursive()`
- Type-based: `st.from_type(MyClass)`
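As an illustration of how these strategies compose, the sketch below builds JSON-like values from primitives, collections, and `st.recursive()`, then checks a round-trip property. The names `json_values` and `test_json_round_trip` are hypothetical, and floats are left out on purpose so that NaN does not break equality.

```python
import json

from hypothesis import given
from hypothesis import strategies as st

# Hypothetical strategy: JSON-like values built from leaves and containers
json_values = st.recursive(
    # Leaves: primitives that survive a JSON round trip unchanged
    st.none() | st.booleans() | st.integers() | st.text(),
    # Containers: lists and string-keyed dicts of already-generated children
    lambda children: st.lists(children, max_size=5)
    | st.dictionaries(st.text(), children, max_size=5),
    max_leaves=10,
)


@given(json_values)
def test_json_round_trip(value):
    """Invariant: decoding an encoded value returns the original value."""
    assert json.loads(json.dumps(value)) == value
```

Bounding `max_size` and `max_leaves` keeps the generated structures small, which tends to keep both generation and shrinking fast.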
### Composite Strategies
```python
from hypothesis import given
from hypothesis import strategies as st
from hypothesis.strategies import composite


@composite
def user_strategy(draw):
    # draw() pulls one concrete value from a strategy
    age = draw(st.integers(min_value=18, max_value=100))
    name = draw(st.text(min_size=1))
    return {"name": name, "age": age, "is_adult": age >= 18}


@given(user_strategy())
def test_user(user):
    assert user["is_adult"] == (user["age"] >= 18)
```
### Strategy Combinators
```python
st.integers().filter(lambda x: x % 2 == 0) # Filter
st.integers().map(str) # Transform
st.one_of(st.integers(), st.text()) # Choose between strategies
st.sampled_from([1, 2, 3, 4, 5]) # Pick from collection
st.from_type(MyClass) # Infer from type hints
st.builds(MyClass, arg1=st.integers()) # Build instances
```
## Settings
```python
import os

from hypothesis import given, settings
from hypothesis import strategies as st


@given(st.integers())
@settings(
    max_examples=1000,  # Default: 100
    deadline=None,  # Remove time limit
    derandomize=True,  # Deterministic runs
)
def test_example(x):
    pass


# Profiles for different environments
settings.register_profile("dev", max_examples=10)
settings.register_profile("ci", max_examples=1000, deadline=None)
# Load the profile named by HYPOTHESIS_PROFILE (falls back to "default")
settings.load_profile(os.getenv("HYPOTHESIS_PROFILE", "default"))
# Activate: HYPOTHESIS_PROFILE=ci pytest (or: pytest --hypothesis-profile=ci)
```
**Full settings reference:** https://hypothesis.readthedocs.io/en/latest/settings.html
## Helpers
```python
from hypothesis import assume, example, given, note, seed
from hypothesis import strategies as st


@given(st.integers(), st.integers())
def test_division(x, y):
    assume(y != 0)  # Skip invalid cases (prefer .filter() instead)
    note(f"Testing {x} / {y}")  # Add debug info
    # Integer division identity; float `(x / y) * y == x` fails on rounding
    assert (x // y) * y + (x % y) == x


@given(st.integers())
@example(0)  # Always test specific cases
@seed(12345)  # Reproducible run
def test_something(x):
    pass
```
## Stateful Testing
For testing complex stateful systems with rule-based state machines.
```python
from hypothesis.stateful import RuleBasedStateMachine, rule, invariant
from hypothesis import strategies as st


class MyStateMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.data = []

    @rule(value=st.integers())
    def add(self, value):
        self.data.append(value)

    @invariant()
    def check_invariant(self):
        assert isinstance(self.data, list)


TestMachine = MyStateMachine.TestCase
```
**Full stateful testing guide:** https://hypothesis.readthedocs.io/en/latest/stateful.html
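A slightly fuller sketch of the same idea compares a system under test against a simple model and uses `precondition` to restrict when a rule may run. The `StackMachine` class is hypothetical, and a plain Python list stands in for the real system here.

```python
from hypothesis import strategies as st
from hypothesis.stateful import (
    RuleBasedStateMachine,
    invariant,
    precondition,
    rule,
)


class StackMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.stack = []  # stand-in for the system under test
        self.expected_size = 0  # simple model of the expected state

    @rule(value=st.integers())
    def push(self, value):
        self.stack.append(value)
        self.expected_size += 1

    @precondition(lambda self: self.expected_size > 0)
    @rule()
    def pop(self):
        # Only runs when the model says the stack is non-empty
        self.stack.pop()
        self.expected_size -= 1

    @invariant()
    def size_matches_model(self):
        # Checked after every rule
        assert len(self.stack) == self.expected_size


# pytest collects this like any other unittest-style test case
TestStack = StackMachine.TestCase
```

Hypothesis generates sequences of `push` and `pop` calls and, on failure, shrinks them to a short sequence of steps that still breaks the invariant.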
## Polars Integration
Polars provides built-in parametric testing strategies for generating DataFrames.
**Official docs:** https://docs.pola.rs/api/python/stable/reference/api/polars.testing.parametric.dataframes.html
```python
from hypothesis import given
import polars as pl
from polars.testing.parametric import dataframes, column


# Generate DataFrames with specific column schemas
@given(
    dataframes(
        cols=[
            column("id", dtype=pl.Int64),
            column("name", dtype=pl.String),
            column("value", dtype=pl.Float64),
        ],
        min_size=1,
        max_size=100,
    )
)
def test_dataframe_property(df: pl.DataFrame):
    """Test properties of DataFrame operations"""
    assert df.shape[0] >= 1
    assert set(df.columns) == {"id", "name", "value"}
    assert df["id"].dtype == pl.Int64


# With Narwhals wrapper
import narwhals as nw


@given(dataframes(cols=[column("a", dtype=pl.Int64)]))
def test_narwhals_operation(df: pl.DataFrame):
    nw_df = nw.from_native(df)
    result = nw_df.select(nw.col("a") * 2)
    assert result.shape[0] == nw_df.shape[0]
```
**Key functions:**
- `dataframes()`: Generate DataFrames with specified columns
- `column(name, dtype, ...)`: Define column schemas with constraints
- `series()`: Generate standalone Series
**Common parameters:**
- `min_size`/`max_size`: Control row count (arguments to `dataframes()` and `series()`)
- `allow_null`: Enable/disable nulls
- `null_probability`: Control null value frequency (deprecated in recent Polars releases in favor of `allow_null`)
- `unique`: Generate unique values
- `strategy`: Custom strategy for column values
## Best Practices
- **Use constraints over filters**: `st.integers(min_value=0)` not `st.integers().filter(lambda x: x >= 0)`
- **Test properties, not examples**: Focus on invariants that always hold
- **Combine with `@example()`**: Test specific edge cases explicitly
- **Avoid `assume()` overuse**: Rejected examples slow tests down and can trigger health checks; prefer constrained strategies
- **Document properties**: Clear docstrings explain what invariant is tested
- **Set size limits**: Always bound collection sizes to prevent memory issues
- **Add `.hypothesis/` to `.gitignore`**: The directory stores the local example database
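Several of these practices can be seen together in one short sketch: constrained strategies in place of filters, bounded collection sizes, explicit edge cases via `@example()`, and a docstring stating the invariant. The names `bounded_lists` and `test_sum_bounds` are hypothetical.

```python
from hypothesis import example, given
from hypothesis import strategies as st

# Constraints (min_value/max_value, max_size) instead of .filter() or assume()
bounded_lists = st.lists(
    st.integers(min_value=0, max_value=1_000),
    max_size=50,
)


@given(bounded_lists)
@example([])  # Explicit edge case: the empty list
@example([1_000] * 50)  # Explicit edge case: the largest allowed input
def test_sum_bounds(values):
    """Invariant: max(values) <= sum(values) <= 1000 * len(values)."""
    total = sum(values)
    assert total >= max(values, default=0)
    assert total <= 1_000 * len(values)
```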
## Troubleshooting
Common issues and solutions:
- **HealthCheck failures**: Too many rejected examples → use constrained strategies or `suppress_health_check`
- **Flaky tests**: Non-deterministic code → use `@seed()` or `@settings(derandomize=True)`
- **Slow tests**: Too many examples → reduce `max_examples` or use profiles
- **Deadline exceeded**: Complex operations → increase `deadline` or set to `None`
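The settings mentioned above can be applied to a single test, as in this sketch. The test itself is hypothetical, and its filter is deliberately wasteful to show where `suppress_health_check` would apply.

```python
from hypothesis import HealthCheck, given, settings
from hypothesis import strategies as st


@settings(
    max_examples=25,  # Fewer examples for a slow test
    deadline=1000,  # Per-example limit in milliseconds (default is 200)
    derandomize=True,  # Same examples on every run
    suppress_health_check=[HealthCheck.filter_too_much],
)
@given(st.integers(min_value=0, max_value=10_000).filter(lambda n: n % 7 == 0))
def test_multiples_of_seven(n):
    assert n % 7 == 0
```

Suppressing a health check is usually a last resort. Here the filter could be dropped entirely with `st.integers(min_value=0, max_value=1428).map(lambda k: k * 7)`, which generates only valid values.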
## Resources
- Main docs: https://hypothesis.readthedocs.io/
- Strategies: https://hypothesis.readthedocs.io/en/latest/data.html
- Stateful testing: https://hypothesis.readthedocs.io/en/latest/stateful.html
- Ghost writer (auto-generate tests): `hypothesis write mymodule.myfunction`