Testing Migrations¶
Testing your migrations is crucial for ensuring data integrity and preventing production issues. This guide covers testing strategies, best practices, and integration with testing frameworks.
Why Test Migrations?¶
Migrations transform your data. Bugs in migrations can:
- Cause data loss
- Create invalid data that fails validation
- Produce incorrect results silently
- Break production systems during deployment
Testing migrations before production is not optional.
Basic Testing¶
pyrmute includes built-in testing utilities:
from pyrmute import ModelManager, ModelData
manager = ModelManager()
# Define your models and migrations
@manager.model("User", "1.0.0")
class UserV1(BaseModel):
name: str
@manager.model("User", "2.0.0")
class UserV2(BaseModel):
first_name: str
last_name: str
@manager.migration("User", "1.0.0", "2.0.0")
def split_name(data: ModelData) -> ModelData:
parts = data["name"].split(" ", 1)
return {
"first_name": parts[0],
"last_name": parts[1] if len(parts) > 1 else ""
}
# Test the migration
results = manager.test_migration(
"User",
"1.0.0",
"2.0.0",
test_cases=[
(
{"name": "Alice Smith"},
{"first_name": "Alice", "last_name": "Smith"}
),
(
{"name": "Bob"},
{"first_name": "Bob", "last_name": ""}
),
]
)
# Check results
if results.all_passed:
print("✓ All tests passed!")
else:
print(f"✗ {len(results.failures)} test(s) failed")
for failure in results.failures:
print(failure)
Test Case Formats¶
Tuple Format (Simple)¶
Use tuples for straightforward tests:
results = manager.test_migration(
"User",
"1.0.0",
"2.0.0",
test_cases=[
# (source_data, expected_output)
({"name": "Alice"}, {"first_name": "Alice", "last_name": ""}),
({"name": "Bob Jones"}, {"first_name": "Bob", "last_name": "Jones"}),
]
)
MigrationTestCase (Descriptive)¶
Use MigrationTestCase for better documentation:
from pyrmute import MigrationTestCase
results = manager.test_migration(
"User",
"1.0.0",
"2.0.0",
test_cases=[
MigrationTestCase(
source={"name": "Alice"},
target={"first_name": "Alice", "last_name": ""},
description="Single name should leave last_name empty"
),
MigrationTestCase(
source={"name": "Bob Jones"},
target={"first_name": "Bob", "last_name": "Jones"},
description="Full name should split correctly"
),
MigrationTestCase(
source={"name": "Maria de la Cruz"},
target={"first_name": "Maria", "last_name": "de la Cruz"},
description="Multiple spaces should split on first space only"
),
]
)
# Better error messages with descriptions
if not results.all_passed:
for failure in results.failures:
print(failure)
# Prints: ✗ Test failed - Multiple spaces should split on first space only
Testing Without Expected Output¶
Sometimes you just want to verify migrations don't crash:
results = manager.test_migration(
"User",
"1.0.0",
"2.0.0",
test_cases=[
MigrationTestCase(
source={"name": "Alice"},
target=None, # Don't check output, just verify no errors
description="Should handle basic case without errors"
),
]
)
This is useful for:
- Smoke testing complex migrations
- Testing that edge cases don't crash
- Initial migration development
Integration with Test Frameworks¶
pytest¶
import pytest
from pyrmute import ModelManager
manager = ModelManager()
# ... define models and migrations ...
def test_user_migration_basic() -> None:
"""Test basic user migration scenarios."""
results = manager.test_migration(
"User",
"1.0.0",
"2.0.0",
test_cases=[
({"name": "Alice"}, {"first_name": "Alice", "last_name": ""}),
({"name": "Bob Smith"}, {"first_name": "Bob", "last_name": "Smith"}),
]
)
results.assert_all_passed()
def test_user_migration_edge_cases() -> None:
"""Test edge cases in user migration."""
results = manager.test_migration(
"User",
"1.0.0",
"2.0.0",
test_cases=[
({"name": ""}, {"first_name": "", "last_name": ""}),
({"name": " "}, {"first_name": "", "last_name": ""}),
({"name": "X"}, {"first_name": "X", "last_name": ""}),
]
)
results.assert_all_passed()
def test_user_migration_preserves_fields() -> None:
"""Test that migration preserves other fields."""
results = manager.test_migration(
"User",
"1.0.0",
"2.0.0",
test_cases=[
(
{"name": "Alice", "id": 123, "email": "alice@example.com"},
{
"first_name": "Alice",
"last_name": "",
"id": 123,
"email": "alice@example.com"
}
),
]
)
results.assert_all_passed()
unittest¶
import unittest
from pyrmute import ModelManager
manager = ModelManager()
# ... define models and migrations ...
class TestUserMigration(unittest.TestCase):
def test_basic_migration(self) -> None:
"""Test basic user migration."""
results = manager.test_migration(
"User",
"1.0.0",
"2.0.0",
test_cases=[
({"name": "Alice"}, {"first_name": "Alice", "last_name": ""}),
]
)
results.assert_all_passed()
def test_edge_cases(self) -> None:
"""Test edge cases."""
results = manager.test_migration(
"User",
"1.0.0",
"2.0.0",
test_cases=[
({"name": ""}, {"first_name": "", "last_name": ""}),
]
)
results.assert_all_passed()
if __name__ == "__main__":
unittest.main()
Test Coverage Guidelines¶
Essential Test Cases¶
Every migration should test:
- Happy path - Normal, expected data
- Edge cases - Empty strings, None values, extreme values
- Missing fields - Optional fields not present
- Invalid data - Data that might exist in production
- Boundary conditions - Min/max values, empty collections
Example: Comprehensive Test Suite¶
def test_user_migration_comprehensive() -> None:
"""Comprehensive test suite for user migration."""
results = manager.test_migration(
"User",
"1.0.0",
"2.0.0",
test_cases=[
# Happy path
MigrationTestCase(
source={"name": "Alice Smith"},
target={"first_name": "Alice", "last_name": "Smith"},
description="Normal full name"
),
# Single name
MigrationTestCase(
source={"name": "Alice"},
target={"first_name": "Alice", "last_name": ""},
description="Single name only"
),
# Multiple spaces
MigrationTestCase(
source={"name": "Alice Mary Smith"},
target={"first_name": "Alice", "last_name": "Mary Smith"},
description="Multiple word last name"
),
# Empty name
MigrationTestCase(
source={"name": ""},
target={"first_name": "", "last_name": ""},
description="Empty name string"
),
# Whitespace only
MigrationTestCase(
source={"name": " "},
target={"first_name": "", "last_name": ""},
description="Whitespace only name"
),
# Missing name field
MigrationTestCase(
source={},
target={"first_name": "", "last_name": ""},
description="Missing name field"
),
# Extra whitespace
MigrationTestCase(
source={"name": " Alice Smith "},
target={"first_name": "Alice", "last_name": "Smith"},
description="Extra whitespace around name"
),
# Special characters
MigrationTestCase(
source={"name": "O'Brien"},
target={"first_name": "O'Brien", "last_name": ""},
description="Name with apostrophe"
),
# Unicode characters
MigrationTestCase(
source={"name": "José García"},
target={"first_name": "José", "last_name": "García"},
description="Unicode characters in name"
),
# Preserves other fields
MigrationTestCase(
source={"name": "Alice", "id": 123, "active": True},
target={
"first_name": "Alice",
"last_name": "",
"id": 123,
"active": True
},
description="Preserves additional fields"
),
]
)
results.assert_all_passed()
Testing Migration Chains¶
Test migrations across multiple versions:
def test_migration_chain() -> None:
"""Test migrating across multiple versions."""
# v1 -> v2 -> v3
results_v1_to_v2 = manager.test_migration(
"User",
"1.0.0",
"2.0.0",
test_cases=[
({"name": "Alice"}, {"first_name": "Alice", "last_name": ""}),
]
)
results_v1_to_v2.assert_all_passed()
results_v2_to_v3 = manager.test_migration(
"User",
"2.0.0",
"3.0.0",
test_cases=[
(
{"first_name": "Alice", "last_name": "Smith"},
{"full_name": "Alice Smith", "email": "unknown@example.com"}
),
]
)
results_v2_to_v3.assert_all_passed()
# Test full chain v1 -> v3
results_v1_to_v3 = manager.test_migration(
"User",
"1.0.0",
"3.0.0",
test_cases=[
(
{"name": "Alice Smith"},
{"full_name": "Alice Smith", "email": "unknown@example.com"}
),
]
)
results_v1_to_v3.assert_all_passed()
Testing with Real Data¶
Use anonymized production data for testing:
import json
from pathlib import Path
def test_with_production_data() -> None:
"""Test migration with real production data samples."""
# Load anonymized production samples
samples_path = Path("tests/fixtures/user_samples_v1.json")
with open(samples_path) as f:
production_samples = json.load(f)
test_cases = []
for sample in production_samples:
# You might not have expected output for all samples
test_cases.append(
MigrationTestCase(
source=sample,
target=None, # Just verify no crashes
description=f"Production sample: user_id={sample.get('id')}"
)
)
results = manager.test_migration(
"User",
"1.0.0",
"2.0.0",
test_cases=test_cases
)
results.assert_all_passed()
Property-Based Testing¶
Use hypothesis for property-based testing:
import pytest
from pyrmute import ModelData
from hypothesis import given, strategies as st
@given(st.text())
def test_name_migration_always_splits_on_first_space(name: str) -> None:
"""Test that name splitting is consistent."""
result = manager.migrate_data(
{"name": name},
"User",
"1.0.0",
"2.0.0"
)
# Property: reconstructing name should give original (minus extra spaces)
if name.strip():
reconstructed = f"{result['first_name']} {result['last_name']}".strip()
assert reconstructed == " ".join(name.split())
@given(st.dictionaries(
keys=st.text(min_size=1),
values=st.one_of(st.text(), st.integers(), st.booleans())
))
def test_migration_preserves_extra_fields(data: ModelData) -> None:
"""Test that migrations preserve fields they don't touch."""
# Add name field for migration
data["name"] = "Test User"
result = manager.migrate_data(data, "User", "1.0.0", "2.0.0")
# All original fields (except name) should be preserved
for key, value in data.items():
if key != "name":
assert result.get(key) == value
Snapshot Testing¶
Compare migration outputs to saved snapshots:
import json
from pathlib import Path
def test_migration_snapshot() -> None:
"""Test that migration output matches saved snapshot."""
# Input data
input_data = {"name": "Alice Smith", "id": 123}
# Run migration
result = manager.migrate_data(input_data, "User", "1.0.0", "2.0.0")
# Load or create snapshot
snapshot_path = Path("tests/snapshots/user_v1_to_v2.json")
if snapshot_path.exists():
with open(snapshot_path) as f:
expected = json.load(f)
assert result == expected, "Migration output changed"
else:
# Create snapshot on first run
snapshot_path.parent.mkdir(parents=True, exist_ok=True)
with open(snapshot_path, "w") as f:
json.dump(result, f, indent=2)
pytest.skip("Created new snapshot")
Testing Error Cases¶
Test that migrations fail appropriately:
import pytest
from pyrmute import MigrationError
def test_migration_fails_on_invalid_data() -> None:
"""Test that migration raises appropriate errors."""
with pytest.raises(MigrationError) as exc_info:
manager.migrate(
{"invalid": "data"}, # Missing required 'name' field
"User",
"1.0.0",
"2.0.0"
)
assert "name" in str(exc_info.value).lower()
def test_migration_handles_none_gracefully() -> None:
"""Test that None values are handled correctly."""
results = manager.test_migration(
"User",
"1.0.0",
"2.0.0",
test_cases=[
(
{"name": None},
{"first_name": "", "last_name": ""} # Or whatever your handling is
),
]
)
results.assert_all_passed()
Performance Testing¶
Test migration performance with large datasets:
import time
def test_migration_performance() -> None:
"""Test that migration performs adequately."""
# Generate test data
test_data = [{"name": f"User {i}"} for i in range(10000)]
start = time.time()
results = manager.migrate_batch(
test_data,
"User",
"1.0.0",
"2.0.0"
)
duration = time.time() - start
# Should process 10k records in reasonable time
assert duration < 5.0, f"Migration too slow: {duration:.2f}s"
assert len(results) == 10000
# Calculate throughput
throughput = len(test_data) / duration
print(f"Throughput: {throughput:.0f} records/second")
Testing Best Practices¶
1. Test Before Production¶
Never deploy untested migrations:
# In your CI/CD pipeline
def test_all_migrations() -> None:
"""Test all registered migrations."""
for model_name in manager.list_models():
versions = manager.list_versions(model_name)
for i in range(len(versions) - 1):
from_version = versions[i]
to_version = versions[i + 1]
# Verify migration path exists
assert manager.has_migration_path(
model_name,
from_version,
to_version
), f"No migration from {from_version} to {to_version}"
2. Use Realistic Data¶
Don't just test happy paths:
# ❌ BAD - Only happy path
test_cases = [
({"name": "Alice Smith"}, {"first_name": "Alice", "last_name": "Smith"})
]
# ✅ GOOD - Multiple scenarios
test_cases = [
({"name": "Alice Smith"}, {"first_name": "Alice", "last_name": "Smith"}),
({"name": "Alice"}, {"first_name": "Alice", "last_name": ""}),
({"name": ""}, {"first_name": "", "last_name": ""}),
({}, {"first_name": "", "last_name": ""}),
({"name": None}, {"first_name": "", "last_name": ""}),
]
3. Test Idempotency¶
Verify migrations can be run multiple times safely:
def test_migration_idempotency() -> None:
"""Test that running migration twice is safe."""
data = {"name": "Alice Smith"}
# First migration
result1 = manager.migrate_data(data, "User", "1.0.0", "2.0.0")
# Second migration of same data
result2 = manager.migrate_data(data, "User", "1.0.0", "2.0.0")
# Should produce identical results
assert result1 == result2
4. Version Your Test Data¶
Keep test data in version control:
tests/
├── data/
│ ├── user_v1_samples.json
│ ├── user_v2_samples.json
│ └── config_v1_samples.json
└── test_migrations.py
5. Document Test Scenarios¶
"""
Test scenarios for User v1.0.0 -> v2.0.0 migration:
1. Standard full name (first last)
2. Single name only
3. Multiple word last name
4. Empty/missing name
5. Unicode characters
6. Special characters (apostrophes, hyphens)
7. Excessive whitespace
8. Field preservation (ensure other fields not lost)
"""
Debugging Failed Tests¶
When tests fail, use the detailed output:
results = manager.test_migration("User", "1.0.0", "2.0.0", test_cases)
if not results.all_passed:
for failure in results.failures:
print(f"\n{failure}")
# Prints:
# ✗ Test failed - Single name should leave last_name empty
# Source: {'name': 'Alice'}
# Expected: {'first_name': 'Alice', 'last_name': ''}
# Actual: {'first_name': 'Alice', 'last_name': None}
# Error: Output mismatch
You can also inspect individual results:
for result in results:
print(f"Test: {result.test_case.description}")
print(f"Passed: {result.passed}")
if not result.passed:
print(f"Source: {result.test_case.source}")
print(f"Expected: {result.test_case.target}")
print(f"Actual: {result.actual}")
print(f"Error: {result.error}")
Next Steps¶
Now that you understand migration testing:
Continue learning:
- Batch Processing - Test large-scale migration scenarios
- Nested Models - Test nested model migrations
- Discriminated Unions - Test polymorphic type migrations
Improve your migrations:
- Writing Migrations - Write testable migration functions
API Reference:
- Migration Testing API - Complete details of migration testing objects
ModelManagerAPI - CompleteModelManagerdetails- Exceptions - Exceptions pyrmute raises