How Masking Works
Understand the masking engine’s internals, how transformers replace sensitive data, and how the platform preserves referential integrity during masking.
Overview
Data masking replaces sensitive information with realistic but fake data. The platform’s masking engine reads every row from the source database, applies transformers to specified columns, and writes the masked data to the destination while preserving referential integrity, data types, and business logic.
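The read, transform, write loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the platform's implementation: `mask_rows`, the column names, and the lambda transformer are hypothetical, and passing NULL values through unchanged is an assumption.

```python
from typing import Callable, Dict, Iterable, Iterator

Row = Dict[str, object]
Transformer = Callable[[object], object]


def mask_rows(rows: Iterable[Row], transformers: Dict[str, Transformer]) -> Iterator[Row]:
    """Yield exactly one masked row per source row.

    Columns without a transformer are copied through unchanged.
    """
    for row in rows:
        masked = dict(row)  # copy so the source row is never mutated
        for column, transform in transformers.items():
            # Assumption: NULL (None) values are left as-is.
            if column in masked and masked[column] is not None:
                masked[column] = transform(masked[column])
        yield masked


# Hypothetical source rows and transformer mapping, for illustration only.
source = [
    {"id": 1, "first_name": "John", "created_at": "2023-01-15"},
    {"id": 2, "first_name": None, "created_at": "2023-02-01"},
]
transformers = {"first_name": lambda value: "Alice"}

destination = list(mask_rows(source, transformers))
```

Only the columns named in the transformer mapping change, so keys and non-sensitive columns reach the destination untouched.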
MASKING Mode
Key Characteristics
- Preserves row count: Output has the same number of rows as the input
- Preserves IDs: Primary keys are typically kept unchanged
- 1:1 mapping: Each source row maps to exactly one destination row
- Schema preserved: Same tables, columns, and data types
- Referential integrity maintained: Foreign keys remain valid
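The characteristics above lend themselves to a simple post-run check. The sketch below is one possible way to verify them on rows loaded into memory; `check_masking_invariants` and the sample tables are hypothetical and are not part of the platform.

```python
def check_masking_invariants(source_rows, masked_rows, pk, fk=None, parent_keys=None):
    """Return the list of violated invariants (an empty list means all checks passed)."""
    problems = []

    # Row count must be identical (1:1 mapping).
    if len(source_rows) != len(masked_rows):
        problems.append("row count changed")

    # Primary keys must be the same set of values.
    if sorted(r[pk] for r in source_rows) != sorted(r[pk] for r in masked_rows):
        problems.append("primary keys changed")

    # The set of columns must be unchanged.
    source_columns = {col for r in source_rows for col in r}
    masked_columns = {col for r in masked_rows for col in r}
    if source_columns != masked_columns:
        problems.append("columns changed")

    # Every non-NULL foreign key must still point at an existing parent key.
    if fk is not None and parent_keys is not None:
        orphans = [r[fk] for r in masked_rows
                   if r[fk] is not None and r[fk] not in parent_keys]
        if orphans:
            problems.append("orphaned foreign keys")

    return problems


# Hypothetical data: orders reference customers through customer_id.
source_orders = [
    {"id": 10, "customer_id": 1, "contact": "John"},
    {"id": 11, "customer_id": 2, "contact": "Jane"},
]
masked_orders = [
    {"id": 10, "customer_id": 1, "contact": "Alice"},
    {"id": 11, "customer_id": 2, "contact": "Maria"},
]
customer_ids = {1, 2}

problems = check_masking_invariants(
    source_orders, masked_orders, pk="id", fk="customer_id", parent_keys=customer_ids
)
```

An empty `problems` list means the sample satisfies all four checks. On real databases the same checks would more likely be expressed as SQL counts and joins.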
Masking Process
Example: Before and After Masking
| Column | Original Data | Masked Data | Transformation |
|---|---|---|---|
| | 1 | 1 | Preserved (Primary Key) |
| | John | Alice | Realistic name generated |
| | Doe | Smith | Realistic name generated |
| | | | Format preserved, content masked |
| | +1-555-123-4567 | +1-555-987-6543 | Format preserved, digits randomized |
| | 2023-01-15 | 2023-01-15 | Preserved (non-sensitive) |
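Notice how the masked data keeps the primary key and the non-sensitive date unchanged, replaces the names with realistic values, and preserves the format of the phone number.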
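The phone number row in the table above keeps its country code, area code, and separators while the remaining digits change. A format-preserving transformer of that kind could look roughly like the sketch below. `mask_phone` and its `keep_digits` parameter are hypothetical names, and keeping the first four digits is an assumption read off the example row, not a documented platform default.

```python
import random
from typing import Optional


def mask_phone(value: str, keep_digits: int = 4,
               rng: Optional[random.Random] = None) -> str:
    """Randomize digits after the first `keep_digits`, leaving every
    non-digit character (plus sign, dashes, spaces) in place."""
    rng = rng or random.Random()
    seen = 0
    out = []
    for ch in value:
        if "0" <= ch <= "9":
            seen += 1
            # Keep the leading digits, replace the rest with random digits.
            out.append(ch if seen <= keep_digits else str(rng.randint(0, 9)))
        else:
            out.append(ch)  # separators are copied unchanged
    return "".join(out)


masked = mask_phone("+1-555-123-4567")
```

The result always has the same length and separator positions as the input, so a field validated by a pattern still accepts it. This sketch draws fresh random digits on every call, so it does not by itself give repeatable output.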
When to Use Masking
Masking is ideal when you need to:
- Anonymize production data for non-production environments
- Maintain exact row counts and relationships
- Preserve business logic and data distributions
- Comply with data privacy regulations (GDPR, HIPAA, etc.)
Comparing Modes: To understand how masking differs from generation and subsetting, see Mode Comparison.
Common Questions
Is masking reversible?
No. Masking is intentionally irreversible for security. Once data is masked, the original values cannot be recovered from it. This is a feature, not a limitation: it ensures that masked data cannot be de-anonymized.
Never mask your only copy of production data. Always mask a copy or backup.
Masking is also deterministic: the same input always produces the same output, which is useful for:
- Masking multiple related tables
- Re-running workflows
- Maintaining consistency across environments
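One common way to get "same input, same output" without storing a lookup table is to derive the replacement from a keyed hash of the original value. The sketch below illustrates that general technique only. This page does not describe how the platform achieves consistency, and `deterministic_name`, the name pool, and the secret are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical pool of replacement values; a real pool would be much larger.
FAKE_FIRST_NAMES = ["Alice", "Bruno", "Chen", "Dalia", "Emeka", "Farah"]


def deterministic_name(value: str, secret: bytes) -> str:
    """Map a source value to a fake name using HMAC-SHA256.

    The same (value, secret) pair always selects the same name, and the
    original value cannot be read back from the result.
    """
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(FAKE_FIRST_NAMES)
    return FAKE_FIRST_NAMES[index]


secret = b"example-only-secret"  # hypothetical; never hard-code a real key

first_run = deterministic_name("John", secret)
second_run = deterministic_name("John", secret)
```

Because `first_run` and `second_run` are always equal, "John" receives the same replacement in every table and on every re-run that uses the same secret. With a small pool, different inputs can map to the same name.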
Will masking break my application?
No, if configured correctly. The platform preserves:
- All data types
- All constraints (PK, FK, NOT NULL, CHECK)
- All referential integrity
- Data formats and patterns
However, you should:
- Test with a small dataset first
- Verify application functionality with masked data
- Use format-preserving transformers for validated fields
- Avoid relying on specific data values in your tests
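For validated fields, a quick smoke test is to run the masked values through the same pattern the application enforces. The sketch below assumes a hypothetical phone number rule. `PHONE_PATTERN` and `invalid_values` are illustrative names, and the real rule should be taken from your application.

```python
import re

# Hypothetical validation rule enforced by the application.
PHONE_PATTERN = re.compile(r"\+\d{1,3}-\d{3}-\d{3}-\d{4}")


def invalid_values(values, pattern):
    """Return the masked values that the application would reject."""
    return [v for v in values if v is not None and not pattern.fullmatch(v)]


# Hypothetical masked output; the last value has lost its formatting.
masked_phones = ["+1-555-987-6543", "+1-555-000-1234", "5559876543"]
failures = invalid_values(masked_phones, PHONE_PATTERN)
```

Here `failures` contains only the unformatted value, which points to a column that needs a format-preserving transformer.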
Related Pages
- Masking Data Guide - Comprehensive guide
- Architecture Overview - System components
- All Transformers - Complete reference
- Scripting Transformer - Custom masking logic
- Conditional Masking - Selective masking