legacy-code-safety
$
npx mdskill add TheBushidoCollective/han/legacy-code-safetySafeguard changes to untested code by enforcing characterization testing before modification.
- Mitigates risks associated with altering undocumented or untested source files.
- Utilizes file system interaction tools like Read, Write, and Edit for code manipulation.
- Guides the agent through the RGR workflow to establish behavioral baselines.
- Outputs structured advice emphasizing the necessity of adding tests prior to any code change.
SKILL.md
.github/skills/legacy-code-safetyView on GitHub ↗
---
name: legacy-code-safety
user-invocable: false
description: Use when modifying, removing, or refactoring code that lacks test coverage. Emphasizes the danger of untested changes and the RGR workflow to add characterization tests before modifications.
allowed-tools:
- Read
- Write
- Edit
- Bash
- Grep
- Glob
---
# Legacy Code Safety
> "Changing untested code is like performing surgery blindfolded."
## DANGER: The Risk of Modifying Untested Code
**Modifying code without tests is one of the HIGHEST-RISK activities in software development.**
When you change untested code:
- You CANNOT know what behaviors you're breaking
- You have NO safety net to catch mistakes
- You're relying on hope instead of evidence
- You're creating technical debt on top of technical debt
**In dynamically-typed languages (Python, JavaScript, Ruby, PHP), this risk is EXPONENTIALLY higher:**
- No compiler to catch type errors
- No static analysis to verify correctness
- Tests are your ONLY safety mechanism
- Runtime is when you discover your mistakes
## The Golden Rule
**NEVER modify untested code without adding tests first.**
If code lacks tests:
1. STOP
2. Add characterization tests
3. Verify tests pass
4. THEN make your changes
## The RGR Workflow for Legacy Code
This is NOT standard TDD. This is **characterization testing** - capturing current behavior before changing it.
### Standard TDD (New Features)
1. RED: Write failing test for NEW behavior
2. GREEN: Implement to make test pass
3. REFACTOR: Clean up implementation
### Legacy Code RGR (Existing Code)
1. RED: Write test that SHOULD pass (captures current behavior)
2. GREEN: Verify it passes (confirms you captured reality correctly)
3. REFACTOR: NOW you can safely modify with a safety net
### The Critical Difference
**TDD:** You know what the behavior SHOULD be, write test for ideal
**Legacy RGR:** You DON'T know all behaviors, capture what EXISTS
### Why "RED" for Passing Tests?
The "RED" step means "you're in the red zone" - working with dangerous untested code. Even though you expect the test to pass, you're at risk until you have that passing test as proof.
## Characterization Testing Process
### Step 1: Identify All Behaviors
Before writing tests, understand what the code currently does:
```typescript
// Legacy function - no tests
function calculateShipping(order) {
let cost = 9.99
if (order.total > 100) {
cost = 0
}
if (order.weight > 50) {
cost += 25
}
if (order.destination === 'HI' || order.destination === 'AK') {
cost += 15
}
return cost
}
```
**Questions to answer:**
- What inputs does it accept?
- What outputs does it produce?
- What are the edge cases?
- Are there side effects?
- What happens with invalid input?
### Step 2: Write Tests for Current Behavior
**Document what EXISTS, not what should exist:**
```typescript
describe('calculateShipping (characterization)', () => {
// Capture base case
test('standard shipping', () => {
const order = { total: 50, weight: 10, destination: 'CA' }
expect(calculateShipping(order)).toBe(9.99)
})
// Capture free shipping threshold
test('free shipping over $100', () => {
const order = { total: 150, weight: 10, destination: 'CA' }
expect(calculateShipping(order)).toBe(0)
})
// Capture weight surcharge
test('heavy package surcharge', () => {
const order = { total: 50, weight: 75, destination: 'CA' }
expect(calculateShipping(order)).toBe(34.99) // 9.99 + 25
})
// Capture regional surcharge
test('Hawaii surcharge', () => {
const order = { total: 50, weight: 10, destination: 'HI' }
expect(calculateShipping(order)).toBe(24.99) // 9.99 + 15
})
// Discovered edge case: free shipping + weight
test('free shipping with heavy package', () => {
const order = { total: 150, weight: 75, destination: 'CA' }
expect(calculateShipping(order)).toBe(25) // 0 + 25 (is this right?)
})
// Discovered edge case: all conditions
test('all surcharges with free shipping threshold', () => {
const order = { total: 150, weight: 75, destination: 'HI' }
expect(calculateShipping(order)).toBe(40) // 0 + 25 + 15
})
})
```
### Step 3: Verify Tests Pass
**These tests MUST pass immediately:**
```bash
npm test
```
If they don't pass:
- You misunderstood the current behavior
- Fix the test, not the code
- The test should document reality
### Step 4: NOW You Can Refactor
**With tests in place, you have a safety net:**
```typescript
// NOW we can safely refactor
function calculateShipping(order: Order): number {
const BASE_COST = 9.99
const FREE_SHIPPING_THRESHOLD = 100
const HEAVY_PACKAGE_WEIGHT = 50
const HEAVY_PACKAGE_SURCHARGE = 25
const REMOTE_DESTINATIONS = ['HI', 'AK']
const REMOTE_SURCHARGE = 15
let cost = order.total > FREE_SHIPPING_THRESHOLD ? 0 : BASE_COST
if (order.weight > HEAVY_PACKAGE_WEIGHT) {
cost += HEAVY_PACKAGE_SURCHARGE
}
if (REMOTE_DESTINATIONS.includes(order.destination)) {
cost += REMOTE_SURCHARGE
}
return cost
}
```
**Run tests again:**
```bash
npm test # All tests still pass
```
You've improved the code WITHOUT breaking existing behavior.
## When This Skill Applies
Use this skill when:
- [ ] Deleting code that has no tests
- [ ] Modifying functions without test coverage
- [ ] Refactoring untested modules
- [ ] Changing behavior in legacy systems
- [ ] Test coverage is below 80% for code you're touching
- [ ] You inherit code from previous developers
- [ ] You're working in a codebase with poor test discipline
**Even if the change seems "simple", if there are no tests, use this workflow.**
## Language-Specific Risk Levels
### HIGH RISK: Dynamic/Weakly-Typed Languages
**Python, JavaScript, Ruby, PHP:**
- NO compile-time type checking
- NO static analysis of type correctness
- Tests are your ONLY safety net
- Runtime is when errors appear
- User discovers bugs, not compiler
**Extra caution required:**
- Write MORE tests
- Test edge cases thoroughly
- Check for None/null/undefined
- Verify types explicitly
- Add type hints/annotations when possible
### MEDIUM RISK: Static/Strongly-Typed Languages
**Go, Rust, TypeScript (strict mode), Java:**
- Type system catches some errors
- Compiler validates correctness
- Still need tests for BEHAVIOR
- Types don't test business logic
**Type safety helps, but:**
- Types verify structure, not logic
- Tests still required for correctness
- Don't rely solely on type system
## The Strangler Fig Pattern
For large legacy systems, don't try to test everything at once:
### What Is It?
Named after the strangler fig vine that grows around a tree:
1. Vine grows alongside tree
2. Eventually replaces tree entirely
3. Original tree can be removed
Applied to code:
1. Wrap legacy code with new interface
2. Add tests to the wrapper
3. Gradually move logic to new code
4. Eventually remove legacy code
### Implementation
**Phase 1: Wrap**
```typescript
// Legacy code (untested, scary)
function legacyCalculatePrice(item) {
// ... 500 lines of spaghetti ...
}
// New wrapper (tested)
function calculatePrice(item: Item): number {
// Add validation and tests here
validateItem(item)
// Call legacy for now
return legacyCalculatePrice(item)
}
```
**Phase 2: Test the Boundary**
```typescript
describe('calculatePrice wrapper', () => {
test('validates input', () => {
expect(() => calculatePrice(null)).toThrow()
})
test('handles normal items', () => {
const item = { id: 1, price: 10 }
expect(calculatePrice(item)).toBe(10)
})
// Characterization tests for legacy behavior
test('matches legacy for discounts', () => {
const item = { id: 1, price: 10, discount: 0.2 }
expect(calculatePrice(item)).toBe(8)
})
})
```
**Phase 3: Replace Internals**
```typescript
function calculatePrice(item: Item): number {
validateItem(item)
// New implementation (tested, clean)
const basePrice = item.price
const discount = item.discount ?? 0
return basePrice * (1 - discount)
// Legacy code removed!
}
```
**Phase 4: Expand**
Now that one function is safe, repeat for next function.
### Benefits
- Incremental progress
- Always have working system
- Tests added gradually
- Risk minimized
- Can stop anytime
### When to Use
- Large untested codebase
- Can't stop for months to add tests
- Need to deliver features while improving
- High-risk production system
## Pre-Modification Checklist
Before touching ANY untested code:
- [ ] Identified all entry points (who calls this code?)
- [ ] Identified all exit points (what does this code call?)
- [ ] Understood current behavior (what does it do now?)
- [ ] Written characterization tests for current behavior
- [ ] All characterization tests pass
- [ ] Documented any surprising behaviors found
- [ ] Considered using Strangler Fig for large changes
**ONLY AFTER ALL CHECKS PASS:**
- [ ] Make your intended changes
- [ ] Run tests (verify behavior preserved or intentionally changed)
- [ ] Add tests for NEW behavior if applicable
- [ ] Document breaking changes if any
## Red Flags
Stop immediately if:
- "I'll just make a quick change, no need for tests"
- "It's a small change, what could go wrong?"
- "I don't have time to write tests"
- "The code is too complex to test"
- "I'll add tests later" (you won't)
**These are rationalizations for dangerous behavior.**
## Common Objections (and Responses)
### "But tests will take longer than the change!"
**Response:** The initial change is fast. Finding and fixing the bugs you introduced takes much longer. Tests save time.
### "The code is too complex to test!"
**Response:** If it's too complex to test, it's too complex to change safely. Simplify first, or use Strangler Fig pattern.
### "We need to ship this feature NOW!"
**Response:** Shipping a broken feature is slower than shipping a tested feature late. Technical debt compounds.
### "This code has worked fine without tests for years!"
**Response:** Until now. The moment you change it, that safety is gone. Past stability doesn't predict future stability.
### "I understand the code, I won't break it!"
**Response:** Confidence is not a substitute for evidence. Even experts make mistakes.
## Success Metrics
How to know you've done this right:
**Green flags:**
- All characterization tests pass before changes
- All tests still pass after changes
- New behavior has new tests
- Test coverage increased
- Code is more readable
- Confidence is high
**Red flags:**
- Tests fail after changes (and you don't know why)
- Skipped edge cases "to save time"
- Changed behavior without updating tests
- Test coverage decreased or stagnant
- More confused than before
## Integration with Other Skills
Use these related skills:
- **test-driven-development**: For new behavior, use TDD after characterization tests exist
- **refactoring**: With tests in place, refactor safely
- **boy-scout-rule**: While adding tests, improve code quality
- **proof-of-work**: Document test results as evidence
- **debugging**: When characterization tests reveal unexpected behavior
## Tools and Techniques
### Code Coverage Tools
Use coverage tools to identify untested code:
```bash
# JavaScript/TypeScript
npm test -- --coverage
# Python
pytest --cov=mypackage
# Ruby
bundle exec rspec --require spec_helper
# Go
go test -cover ./...
```
**Target coverage:**
- Critical paths: 100%
- Business logic: 90%+
- Overall codebase: 80%+
### Mutation Testing
Verify tests actually catch bugs:
```bash
# JavaScript
npm install -D stryker
# Python
pip install mutmut
```
Mutation testing changes your code slightly and verifies tests fail. If tests still pass after mutation, they're not effective.
### Approval Testing
For complex outputs (HTML, JSON, reports):
```typescript
test('generates invoice', () => {
const invoice = generateInvoice(order)
expect(invoice).toMatchSnapshot()
})
```
First run captures current output. Future runs verify it hasn't changed.
## Real-World Example
### Scenario
Legacy payment processor, 2000 lines, no tests, needs to add new payment method.
### Step 1: Identify Scope
Don't test everything - test what you'll touch:
```typescript
// Will modify this
function processPayment(payment, method) {
// ... many lines ...
}
// Will NOT modify these
function logTransaction(transaction) { }
function sendEmail(user, content) { }
```
### Step 2: Add Characterization Tests
```typescript
describe('processPayment (characterization)', () => {
test('processes credit card payment', () => {
const payment = { amount: 100, currency: 'USD' }
const method = { type: 'credit_card', number: '4111...' }
const result = processPayment(payment, method)
expect(result.success).toBe(true)
expect(result.transactionId).toBeDefined()
expect(result.amount).toBe(100)
})
// More tests for existing payment methods...
})
```
### Step 3: Tests Pass
```bash
npm test
# ✅ All tests passing
```
### Step 4: Add New Feature
```typescript
function processPayment(payment, method) {
// Existing code preserved
if (method.type === 'credit_card') {
return processCreditCard(payment, method)
}
// NEW: Add cryptocurrency support
if (method.type === 'crypto') {
return processCrypto(payment, method)
}
throw new Error(`Unknown payment method: ${method.type}`)
}
```
### Step 5: Add Tests for New Behavior
```typescript
test('processes cryptocurrency payment', () => {
const payment = { amount: 100, currency: 'USD' }
const method = { type: 'crypto', address: '0x123...' }
const result = processPayment(payment, method)
expect(result.success).toBe(true)
expect(result.transactionId).toBeDefined()
})
```
### Step 6: All Tests Pass
```bash
npm test
# ✅ All tests passing (old + new)
```
### Result
- New feature added safely
- Existing behavior preserved
- Tests ensure no regression
- Future changes safer
## Remember
1. **Untested code is dangerous code** - especially in dynamic languages
2. **Characterize before changing** - Capture current behavior first
3. **RGR workflow for legacy** - Different from TDD, specific to untested code
4. **Strangler Fig for large systems** - Gradual improvement beats big bang
5. **Tests are proof** - Confidence without evidence is just hope
6. **No shortcuts** - "Quick changes" create slow bugs
**Legacy code is not an excuse for reckless changes. It's a reason for EXTRA caution.**