How Masking Works

Understand the masking engine’s internals, how transformers replace sensitive data, and how the platform preserves referential integrity during masking.

Overview

Data masking replaces sensitive information with realistic but fake data. The platform’s masking engine reads every row from the source database, applies transformers to specified columns, and writes the masked data to the destination while preserving referential integrity, data types, and business logic.
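The read, transform, write flow described above can be sketched in a few lines. This is a minimal illustration, not the platform's implementation: the `mask_rows` function, the in-memory `source` list standing in for a database read, and the placeholder transformer are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator

Row = Dict[str, object]
Transformer = Callable[[object], object]


def mask_rows(rows: Iterable[Row], transformers: Dict[str, Transformer]) -> Iterator[Row]:
    """Yield exactly one masked row per source row.

    Columns with a transformer are replaced; all other columns pass through.
    """
    for row in rows:
        yield {
            column: transformers[column](value) if column in transformers else value
            for column, value in row.items()
        }


# Hypothetical in-memory "source table" standing in for a database read.
source = [
    {"customer_id": 1, "first_name": "John", "created_at": "2023-01-15"},
    {"customer_id": 2, "first_name": "Jane", "created_at": "2023-02-01"},
]

# Hypothetical transformer: replaces every first name with a fixed placeholder.
transformers = {"first_name": lambda _value: "Alice"}

# Stand-in for writing to the destination database.
destination = list(mask_rows(source, transformers))
```

Only `first_name` has a transformer here, so `customer_id` and `created_at` reach the destination unchanged.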

MASKING Mode

Key Characteristics

Preserves row count: The output has the same number of rows as the input
Preserves IDs: Primary keys are typically kept unchanged
1:1 mapping: Each source row maps to exactly one destination row
Schema preserved: Same tables, columns, data types
Referential integrity maintained: Foreign keys remain valid
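One way to confirm these characteristics after a run is to compare source and masked rows directly. The sketch below is an assumption-laden illustration: `check_masking_invariants` is a hypothetical helper, rows are plain dictionaries, and both lists are assumed to be in the same order.

```python
from typing import Dict, List

Row = Dict[str, object]


def check_masking_invariants(source: List[Row], masked: List[Row], pk: str) -> List[str]:
    """Return human-readable violations; an empty list means every check passed.

    Assumes source and masked rows are listed in the same order.
    """
    problems = []
    if len(source) != len(masked):
        problems.append(f"row count changed: {len(source)} -> {len(masked)}")
    for src, dst in zip(source, masked):
        if src.keys() != dst.keys():
            problems.append(f"columns changed for {pk}={src.get(pk)!r}")
        if src.get(pk) != dst.get(pk):
            problems.append(f"primary key changed: {src.get(pk)!r} -> {dst.get(pk)!r}")
        # Compare Python types as a rough proxy for column data types.
        for column in src.keys() & dst.keys():
            if type(src[column]) is not type(dst[column]):
                problems.append(f"type changed in {column!r} for {pk}={src.get(pk)!r}")
    return problems
```

Foreign key validity is a cross-table property, so it needs a separate check against the parent table.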

Masking Process

Example: Before and After Masking

Table 1. **Customer Table Transformation**
Column	Original Data	Masked Data	Transformation
`customer_id`	1	1	Preserved (Primary Key)
`first_name`	John	Alice	Realistic name generated
`last_name`	Doe	Smith	Realistic name generated
`email`	john.doe@company.com	alice.smith@example.com	Format preserved, content masked
`phone`	+1-555-123-4567	+1-555-987-6543	Format preserved, digits randomized
`created_at`	2023-01-15	2023-01-15	Preserved (non-sensitive)

Notice how the masked data:

Maintains all data types
Preserves primary keys for referential integrity
Generates realistic values that pass validation
Keeps non-sensitive data unchanged
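As an illustration of the "format preserved, digits randomized" behavior shown for the `phone` column, here is a rough sketch of a format-preserving transformer. The `mask_phone` function and its `keep_digits` parameter are hypothetical and are not taken from the platform's transformer reference.

```python
import random
from typing import Optional


def mask_phone(phone: str, keep_digits: int = 4, rng: Optional[random.Random] = None) -> str:
    """Randomize digits while keeping separators and the first `keep_digits` digits."""
    rng = rng or random.Random()
    masked = []
    digits_seen = 0
    for ch in phone:
        if "0" <= ch <= "9":
            digits_seen += 1
            # Keep the leading digits (for example country and area code).
            masked.append(ch if digits_seen <= keep_digits else str(rng.randrange(10)))
        else:
            # Punctuation such as "+" and "-" is copied through untouched.
            masked.append(ch)
    return "".join(masked)


masked_phone = mask_phone("+1-555-123-4567")
```

The result keeps the `+1-555-` prefix and the `NNN-NNNN` layout, so a pattern-based validator that accepted the original value should also accept the masked one.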

When to Use Masking

Masking is ideal when you need to:

Anonymize production data for non-production environments
Maintain exact row counts and relationships
Preserve business logic and data distributions
Comply with data privacy regulations (GDPR, HIPAA, etc.)

Comparing Modes: To understand how masking differs from generation and subsetting, see Mode Comparison.

Common Questions

Is masking reversible?

No. Masking is intentionally irreversible for security. Once data is masked, the original values cannot be recovered. This is a feature, not a limitation: it ensures that masked data cannot be de-anonymized.

Never mask your only copy of production data. Always mask a copy or backup.

Masking is also deterministic: the same input will always produce the same output, which is useful for:

Masking multiple related tables
Re-running workflows
Maintaining consistency across environments
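A common way to get output that is both repeatable and hard to reverse is to derive the replacement from a keyed hash of the input. The sketch below shows that general technique. It is an assumption about how such behavior can be built, not a description of the platform's internals, and `deterministic_name`, `FAKE_FIRST_NAMES`, and the secret are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical pool of replacement values.
FAKE_FIRST_NAMES = ["Alice", "Bruno", "Chen", "Dalia", "Emeka", "Freya"]


def deterministic_name(value: str, secret: bytes) -> str:
    """Map an input to a fake name; the same (value, secret) pair always gives the same name."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    # Use the first 8 bytes of the digest as an index into the pool.
    index = int.from_bytes(digest[:8], "big") % len(FAKE_FIRST_NAMES)
    return FAKE_FIRST_NAMES[index]


secret = b"example-secret-do-not-reuse"  # placeholder value for illustration only
first_run = deterministic_name("John", secret)
second_run = deterministic_name("John", secret)  # same result as first_run
```

Because the mapping depends only on the input and the secret, a value that appears in several tables or in several runs is replaced consistently. With a pool this small, many different inputs share one output, so the fake name alone does not identify the original.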

Will masking break my application?

No, if configured correctly. The platform preserves:

All data types
All constraints (PK, FK, NOT NULL, CHECK)
All referential integrity
Data formats and patterns

However, you should:

Test with a small dataset first
Verify application functionality with masked data
Use format-preserving transformers for validated fields
Avoid relying on specific data values in your tests
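When testing with a small dataset, a quick referential-integrity check can catch misconfiguration early. The following is a minimal sketch over in-memory rows. The `find_orphans` helper and the `customers` and `orders` tables are hypothetical examples.

```python
from typing import Dict, List

Row = Dict[str, object]


def find_orphans(child_rows: List[Row], fk_column: str,
                 parent_rows: List[Row], pk_column: str) -> List[Row]:
    """Return child rows whose non-null foreign key has no matching parent key."""
    parent_keys = {row[pk_column] for row in parent_rows}
    return [
        row for row in child_rows
        if row[fk_column] is not None and row[fk_column] not in parent_keys
    ]


# Hypothetical masked tables.
customers = [{"customer_id": 1, "first_name": "Alice"}, {"customer_id": 2, "first_name": "Bruno"}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 2},
    {"order_id": 12, "customer_id": None},  # nullable foreign key, not an orphan
]

orphans = find_orphans(orders, "customer_id", customers, "customer_id")
```

An empty `orphans` list means every order still points at an existing customer. Against a real database the same check is usually written as an anti-join query.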

Related Pages

Masking Data Guide - Comprehensive guide
Architecture Overview - System components
All Transformers - Complete reference
Scripting Transformer - Custom masking logic
Conditional Masking - Selective masking
