= Frequently Asked Questions (FAQ)

Common questions and answers about the Synthesized platform.

== General

=== What databases does the platform support?

The platform supports all major relational databases:

|===
|Database |Version |Status

|PostgreSQL |9.6+ |Fully Supported
|MySQL |5.7+ |Fully Supported
|Oracle |12c+ |Fully Supported
|SQL Server |2016+ |Fully Supported
|MariaDB |10.2+ |Fully Supported
|===

See Supported Databases for the complete list.

=== Can I use the platform in production?

Yes! The platform is production-ready and used by enterprises worldwide.

Recommended for Production:

=== Is the platform open source?

The Synthesized platform is a commercial product.

Contact us for licensing options and enterprise support packages.

== Installation & Deployment

=== Which deployment type should I choose?

Choose based on your use case:

|===
|Deployment Type |Best For |Get Started

|Docker Compose |Development, small teams, proof of concept |Quick Start
|Kubernetes |Production, enterprise, high availability |Helm Setup
|CLI |CI/CD pipelines, automation, batch jobs |CLI Installation
|===

See Deployment Types for detailed comparison.

=== What are the system requirements?

|===
|Resource |Minimum |Recommended

|RAM |4 GB |8+ GB
|CPU Cores |2 cores |4+ cores
|Disk Space |20 GB |50+ GB
|Databases |1 concurrent |3+ concurrent
|===

For large datasets (> 100 GB), increase RAM and CPU proportionally. See Performance Tuning.

== Data Masking

=== Will masking preserve foreign keys?

Yes! The platform automatically maintains referential integrity across all tables.

How it works:

  1. Analyzes foreign key relationships

  2. Processes parent tables first

  3. Maintains FK mappings during transformation

  4. Ensures child tables reference valid parent keys
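
The four steps above can be illustrated with a small, self-contained Python sketch. It is not the platform's implementation: the toy tables, the `mask_keys` function, and the counter-based key generator are all hypothetical, and only key columns are transformed. It shows one common way to keep child rows pointing at valid parents: order tables so that parents come first, record how each key changes, and rewrite foreign keys through that record.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from itertools import count

# Hypothetical toy data: orders.customer_id references customers.id.
tables = {
    "customers": [
        {"id": 1, "email": "a@example.com"},
        {"id": 2, "email": "b@example.com"},
    ],
    "orders": [
        {"id": 10, "customer_id": 1},
        {"id": 11, "customer_id": 2},
        {"id": 12, "customer_id": 1},
    ],
}
# table -> {foreign key column: parent table}
foreign_keys = {"customers": {}, "orders": {"customer_id": "customers"}}


def mask_keys(tables, foreign_keys, key_column="id"):
    # Step 1: build the dependency graph (table -> set of parent tables).
    graph = {table: set(fks.values()) for table, fks in foreign_keys.items()}
    # Step 2: static_order() yields parents before the tables that reference them.
    order = list(TopologicalSorter(graph).static_order())

    new_keys = count(1000)  # stand-in for a real key generator
    key_maps = {}           # table -> {original key: masked key}
    result = {}
    for table in order:
        key_maps[table] = {}
        result[table] = []
        for row in tables[table]:
            new_row = dict(row)
            # Step 3: remember how each original key was transformed.
            new_key = next(new_keys)
            key_maps[table][row[key_column]] = new_key
            new_row[key_column] = new_key
            # Step 4: rewrite each foreign key through the parent's mapping.
            for fk_column, parent in foreign_keys[table].items():
                new_row[fk_column] = key_maps[parent][row[fk_column]]
            result[table].append(new_row)
    return result


masked_tables = mask_keys(tables, foreign_keys)
```

Because `customers` is processed before `orders`, every `customer_id` in the masked `orders` table resolves to a key that exists in the masked `customers` table, and rows that shared a parent before masking still share one afterwards.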

Learn more: How Masking Works

=== Can I mask only some columns?

Yes, specify which columns to mask in your workflow configuration:

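[source,yaml]
----
transformations:
  # Mask only the email column, using the email template
  - columns: ["email"]
    params:
      type: person_generator
      column_templates: ["${email}"]
  # Mask only the phone column, using the phone_national template
  - columns: ["phone"]
    params:
      type: person_generator
      column_templates: ["${phone_national}"]
----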

=== Is masking reversible?

No. Masking is irreversible by design, as a security measure: the original data cannot be recovered from the masked output.
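
To make "irreversible" concrete, here is a minimal Python sketch of one well-known one-way technique: a keyed hash (HMAC) whose key is generated at run time and never stored. This is an assumption chosen for illustration, not a description of how the platform masks data; `make_one_way_masker` and the `example.com` output format are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_one_way_masker():
    # The key lives only inside this closure; it is never written anywhere,
    # so nothing remains that could map masked values back to originals.
    key = secrets.token_bytes(32)

    def mask(value: str) -> str:
        digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        return f"user_{digest[:12]}@example.com"

    return mask


mask_email = make_one_way_masker()
masked_email = mask_email("alice@corp.example")
```

Within a single run the same input always produces the same masked value, which keeps joins consistent, but no lookup table or key survives the run.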

== Generation

=== How does the platform generate realistic data?

The platform uses statistical models learned from your data to generate realistic synthetic data with similar distributions.
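
As a rough illustration of the idea (not the platform's actual models, which are not described here), the Python sketch below learns very simple statistics from a column and samples new values from them. The `fit` and `sample` helpers and the sample data are hypothetical: numeric columns are summarized by mean and standard deviation, categorical columns by label frequencies.

```python
import random
import statistics
from collections import Counter


def fit(values):
    """Learn a tiny model: mean/stdev for numbers, frequencies for categories."""
    if all(isinstance(v, (int, float)) for v in values):
        return {
            "kind": "numeric",
            "mean": statistics.mean(values),
            "stdev": statistics.stdev(values),
        }
    counts = Counter(values)
    return {
        "kind": "categorical",
        "labels": list(counts),
        "weights": list(counts.values()),
    }


def sample(model, n, rng):
    """Draw n synthetic values that follow the fitted distribution."""
    if model["kind"] == "numeric":
        return [rng.gauss(model["mean"], model["stdev"]) for _ in range(n)]
    return rng.choices(model["labels"], weights=model["weights"], k=n)


rng = random.Random(42)  # fixed seed so the example is repeatable
ages = [23, 35, 41, 29, 52, 38]
plans = ["free", "free", "pro", "free", "enterprise", "pro"]

synthetic_ages = sample(fit(ages), 1000, rng)
synthetic_plans = sample(fit(plans), 1000, rng)
```

The synthetic columns contain no original rows, yet their averages and category proportions stay close to those of the source data. Real generators also model relationships between columns, which this sketch ignores.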

=== Can I control the number of rows generated?

Yes, set the `num_rows` parameter for the table:

[source,yaml]
----
table_schema:
  - table_name_pattern: "customers"
    num_rows: 10000
----

== Subsetting

=== Does subsetting maintain relationships?

Yes! The platform automatically follows foreign keys to ensure referential integrity in the subset.

=== Can I subset by date range?

Yes, add a `where` condition to the table's configuration:

[source,yaml]
----
table_schema:
  - table_name_pattern: "orders"
    where: "created_date >= '2023-01-01'"
----
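
The Python sketch below shows, on hypothetical in-memory tables, how a date filter on a child table combines with foreign key following. It is a conceptual illustration only; `subset_orders_since` and the sample rows are not part of the platform.

```python
from datetime import date

customers = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace"},
    {"id": 3, "name": "Linus"},
]
orders = [
    {"id": 10, "customer_id": 1, "created_date": date(2022, 11, 5)},
    {"id": 11, "customer_id": 2, "created_date": date(2023, 3, 14)},
    {"id": 12, "customer_id": 2, "created_date": date(2023, 7, 1)},
]


def subset_orders_since(orders, customers, cutoff):
    # Equivalent of the WHERE filter on the child table.
    kept_orders = [o for o in orders if o["created_date"] >= cutoff]
    # Follow the foreign key: keep only the parents the kept rows reference.
    needed_ids = {o["customer_id"] for o in kept_orders}
    kept_customers = [c for c in customers if c["id"] in needed_ids]
    return kept_orders, kept_customers


kept_orders, kept_customers = subset_orders_since(orders, customers, date(2023, 1, 1))
```

Here orders 11 and 12 pass the filter, so only customer 2 is carried into the subset; no kept order is left pointing at a missing customer.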

== Performance

=== How fast is the platform?

The platform typically processes 5,000-50,000 rows/second, depending on:

* Transformation complexity
* Database performance
* Network latency
* Hardware resources
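
For rough planning, the quoted throughput range translates into a runtime range with simple division. The helper names below are hypothetical, and the estimate assumes a constant rate over the whole run, which real workloads rarely achieve.

```python
def estimate_runtime_seconds(total_rows, rows_per_second):
    """Runtime in seconds for a given row count and sustained throughput."""
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    return total_rows / rows_per_second


def runtime_range(total_rows, slow=5_000, fast=50_000):
    """Best-case and worst-case runtime for the quoted throughput range."""
    best = estimate_runtime_seconds(total_rows, fast)
    worst = estimate_runtime_seconds(total_rows, slow)
    return best, worst


best, worst = runtime_range(10_000_000)
print(f"10M rows: {best / 60:.1f} to {worst / 60:.1f} minutes")
```

For 10 million rows this gives between 200 and 2,000 seconds, roughly 3 to 33 minutes.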

=== How can I improve performance?

See Performance Tuning for optimization tips.

== Troubleshooting

=== Where can I find logs?

[source,shell]
----
# Docker Compose
docker compose logs backend

# Kubernetes
kubectl logs -n tdk deployment/tdk-backend
----

=== How do I report a bug?

Contact support and include:

* Platform version
* Error logs
* Workflow configuration
* Steps to reproduce
