TDD
Purpose: How to do TDD - Red-Green-Refactor cycle, AAA structure, best practices, anti-patterns.
Test-Driven Development (TDD)¶
TDD is a development methodology where tests are written before implementation code, driving design and ensuring testability from the start.
The Red-Green-Refactor Cycle¶
┌─────────────┐
│ 1. RED │ Write a failing test
│ │ (test what should happen)
└─────┬───────┘
│
▼
┌─────────────┐
│ 2. GREEN │ Write minimal code to pass
│ │ (make it work)
└─────┬───────┘
│
▼
┌─────────────┐
│ 3. REFACTOR│ Improve code quality
│ │ (make it clean)
└─────┬───────┘
│
└──────> Repeat
Core Practices¶
1. Write Tests First¶
Why: Enforces modular, decoupled code with clear interfaces
# RED: Write failing test first
def test_user_service_creates_new_user():
service = UserService()
user = service.create(name="Alice", email="alice@example.com")
assert user.id is not None
assert user.name == "Alice"
Then implement minimal code to pass.
2. Use Arrange-Act-Assert (AAA)¶
Structure every test in three phases:
def test_order_processor_calculates_total():
# ARRANGE - Set up test data
items = [Item(price=10.00, qty=2), Item(price=5.00, qty=1)]
processor = OrderProcessor()
# ACT - Execute the behavior
total = processor.calculate_total(items)
# ASSERT - Verify the outcome
assert total == 25.00
3. Keep Tests Atomic and Isolated¶
One behavior per test: test_calculator_adds_numbers() tests
addition only, not subtraction, multiplication, etc.
4. Test Edge Cases Before Happy Paths¶
Cover failure modes first (empty input, malformed data), then success cases.
5. Descriptive Test Names¶
Name describes behavior: test_user_service_returns_404_for_unknown_user()
not test_service_response().
Benefits of TDD¶
Design quality: Forces modular, testable code with clear interfaces
Fast feedback: Catches bugs immediately while context is fresh
Refactoring confidence: Tests enable safe code improvements
Living documentation: Tests describe how the system behaves
Defect reduction: Studies show 40-90% reduction in defect density
TDD Anti-Patterns to Avoid¶
Testing implementation details:
# BAD - Tests internal structure
def test_service_uses_specific_library():
assert isinstance(service._internal_client, SomeLibrary)
# GOOD - Tests behavior
def test_service_fetches_data():
assert service.fetch("key") == expected_value
Unspec’d mocks on third-party types: MagicMock() silently accepts any
attribute, masking API drift bugs until runtime.
# BAD - mock.data always succeeds even if AgentRunResult has no .data
mock_result = MagicMock()
result = await run_manager(...) # passes, but crashes at runtime
# GOOD - spec= constrains to real interface
mock_result = MagicMock(spec=AgentRunResult)
mock_result.output = MagicMock() # must explicitly set dataclass fields
Overly complex tests: If test setup is harder to understand than the code, simplify. Avoid excessive mocking.
Chasing 100% coverage: Aim for meaningful behavior coverage, not line coverage percentage.
Stale fixture patches: When source code changes (renamed imports, restructured modules), update or delete tests that patch the old interface. Broken fixtures don’t clean up properly, leaving shared state dirty and causing unrelated tests to fail later in the suite.
Low-value patterns: See testing-strategy.md → “Patterns to Remove”
for full list.
Running Tests¶
See CONTRIBUTING.md for all make recipes and test commands.
For TDD iterations, run specific tests with
uv run pytest tests/test_module.py::test_function.
When to Use TDD¶
Use TDD for:
- Business logic (calculations, algorithms, rules)
- Data transformations (model conversions, parsing)
- Edge case handling (empty inputs, nulls, boundaries)
- API endpoints (request/response validation)
Consider alternatives for:
- Simple CRUD operations
- UI layouts (use visual testing)
- Exploratory prototypes (add tests after)