Schema Creation

Synthesized can only insert data into the target database if the target schemas are compatible with the input schemas. Schema creation is the stage in the data processing pipeline responsible for modifying the target database schemas to match the source database.

While most database objects can be processed, some features must be manually configured for certain databases.

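Synthesized offers four schema creation modes (DROP_AND_CREATE, CREATE_IF_NOT_EXISTS, DO_NOT_CREATE, and CREATE) so that you can choose how the target schemas are prepared for your use case.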

Prerequisites

Before running a Synthesized workflow for the first time on a new data source:

Check that the database user Synthesized connects as has the appropriate access rights.
Check that the source and target databases are the same version.

If the Synthesized user does not have the appropriate permissions or there is a version mismatch, Synthesized may not be able to copy the schemas. In such cases, the database should be prepared manually and DO_NOT_CREATE mode should be used.
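The two prerequisite checks can be captured in a small pre-flight helper. This is only an illustrative sketch: the function names and inputs are hypothetical, not part of Synthesized, and how you determine access rights or database versions depends on your environment.

```python
def preflight_problems(has_access_rights: bool,
                       source_version: str,
                       target_version: str) -> list[str]:
    """Collect the prerequisite violations described above (hypothetical helper)."""
    problems = []
    if not has_access_rights:
        problems.append("Synthesized user lacks the appropriate access rights")
    if source_version != target_version:
        problems.append(
            f"version mismatch: source {source_version}, target {target_version}"
        )
    return problems


def needs_manual_preparation(problems: list[str]) -> bool:
    """Any violation means: prepare the target manually and use DO_NOT_CREATE."""
    return bool(problems)


if __name__ == "__main__":
    found = preflight_problems(True, "15", "14")
    print(found, needs_manual_preparation(found))
```

If any problem is reported, the guidance above applies: prepare the target database manually and run with DO_NOT_CREATE.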

Best practices

Before running a Synthesized workflow for the first time:

If there is already data in the target database, create a backup.
If other workflows are generating data in the same target database, use either CREATE_IF_NOT_EXISTS mode or DO_NOT_CREATE mode.

Supported schema objects

Some databases have functionality that cannot be copied automatically. The table below shows which features are supported for each database type.

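Databases covered (table columns): MSSQL, Postgres, Oracle, MySQL, SQLite, DB2

Features covered (table rows):

Schemas (N/A for one database)
Tables
Indexes
Constraints
Sequences (N/A for one database)
Primary Keys
Foreign Keys
Triggers
Domains (N/A for one database)
Views
Custom Data Types (N/A for one database)
Computed Columns
Enums (N/A for one database)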

Schema creation modes

To control how Synthesized prepares the target schema, set the schema_creation_mode flag to one of the modes described below.

By default, DROP_AND_CREATE is recommended, particularly when using Synthesized for MASKING. This ensures Synthesized is always operating on fresh schemas. However, this is only appropriate if all schema objects are supported.

In systems where you are generating partial data, CREATE_IF_NOT_EXISTS is recommended. This lets Synthesized create new schemas as they are required.

If you have manually prepared the database, DO_NOT_CREATE can be used to skip schema creation.
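The recommendations above can be read as a simple decision procedure. The sketch below is a hypothetical helper, not part of Synthesized. The order in which the conditions are checked, and the fallback to DO_NOT_CREATE when some schema objects are unsupported (which presumes you prepare the target manually first), are assumptions made for illustration.

```python
def recommend_mode(manually_prepared: bool,
                   shared_target: bool,
                   partial_data: bool,
                   all_objects_supported: bool) -> str:
    """Suggest a schema_creation_mode value from the guidance above."""
    if manually_prepared:
        # The target is already set up, so schema creation can be skipped.
        return "DO_NOT_CREATE"
    if shared_target or partial_data:
        # Other workflows write to the target, or only part of the data
        # is generated: create schemas only as they are required.
        return "CREATE_IF_NOT_EXISTS"
    if all_objects_supported:
        # Always operate on fresh schemas, e.g. for masking.
        return "DROP_AND_CREATE"
    # Assumption: unsupported objects require manual preparation first.
    return "DO_NOT_CREATE"


if __name__ == "__main__":
    print(recommend_mode(False, False, False, True))
```

Note that for a shared target database the best practices allow either CREATE_IF_NOT_EXISTS or DO_NOT_CREATE; the sketch picks the former.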

DROP_AND_CREATE

Drops the existing schemas in the target database
Creates the new schemas
Fully recreates all objects

CREATE_IF_NOT_EXISTS

Calculates the difference between input and output schemas
Creates each schema if it doesn’t exist
Copies any missing schemas from input to output

DO_NOT_CREATE

Does not create schemas in the target database
Verifies that required schemas already exist
Throws an exception if any required schema is missing

CREATE

Verifies that the target database is empty
Creates a new schema in the target database
Throws an exception if objects already exist
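To make the differences between the four modes concrete, the following sketch simulates them on an in-memory model in which each database is a dictionary mapping schema names to a definition string. It is not Synthesized's implementation: the function, the exception class, and the model are hypothetical, and the assumption that DROP_AND_CREATE only touches schemas present in the input is made purely for illustration.

```python
class SchemaError(Exception):
    """Raised when the target does not meet a mode's preconditions."""


def prepare_target(mode: str,
                   source: dict[str, str],
                   target: dict[str, str]) -> dict[str, str]:
    """Return the target's schemas after schema creation in the given mode."""
    result = dict(target)  # work on a copy; the caller's dict is untouched
    if mode == "DROP_AND_CREATE":
        for name, definition in source.items():
            result.pop(name, None)      # drop the existing schema, if any
            result[name] = definition   # recreate it from the input
    elif mode == "CREATE_IF_NOT_EXISTS":
        for name in source.keys() - result.keys():
            result[name] = source[name]  # copy only the missing schemas
    elif mode == "DO_NOT_CREATE":
        missing = sorted(source.keys() - result.keys())
        if missing:
            raise SchemaError(f"missing schemas: {missing}")
    elif mode == "CREATE":
        if result:
            raise SchemaError("target database is not empty")
        result.update(source)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return result


if __name__ == "__main__":
    source = {"sales": "new definition", "hr": "new definition"}
    target = {"sales": "old definition"}
    print(prepare_target("DROP_AND_CREATE", source, target))
    print(prepare_target("CREATE_IF_NOT_EXISTS", source, target))
```

In this model, DROP_AND_CREATE replaces the existing sales schema with the input's definition, whereas CREATE_IF_NOT_EXISTS leaves sales as it is and only adds hr. DO_NOT_CREATE and CREATE both raise for this target: the first because hr is missing, the second because the target is not empty.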