Schema Creation

Synthesized can only insert data into the target database if the target schemas are compatible with the input schemas. Schema creation is the stage in the data processing pipeline responsible for modifying the target database schemas to match the source database.

While most database objects can be processed, some features must be manually configured for certain databases.

Synthesized offers various schema creation modes that allow you to handle your use cases.

Prerequisites

Before running a Synthesized workflow for the first time on a new data source:

  1. Check that the username Synthesized is using has the appropriate access rights.

  2. Check that the version of the two databases is the same.

If the Synthesized user does not have the appropriate permissions or there is a version mismatch, Synthesized may not be able to copy the schemas. In such cases, the database should be prepared manually and DO_NOT_CREATE mode should be used.

Best practices

Before running a Synthesized workflow for the first time, please check the following things:

  1. If there is already data in the target database, create a backup.

  2. If other workflows are generating data in the same target database, use either CREATE_IF_NOT_EXISTS mode or DO_NOT_CREATE mode.

Supported schema objects

Certain databases have functionality that is not able to be copied automatically. This table shows which features work with each database type.

Feature MSSQL Postgres Oracle MySQL SQLite DB2

Schemas

Tables

Indexes

Constraints

Sequences

Primary Keys

Foreign Keys

Triggers

Domains

Views

Custom Data Types

Computed Columns

Enums

Schema creation modes

To control how Synthesized prepares the target schema, use the schema_creation_mode flag. E.g.

By default, DROP_AND_CREATE is recommended, particularly when using Synthesized for MASKING. This ensures Synthesized is always operating on fresh schemas. However, this is only appropriate if all schema objects are supported.

In systems where you are generating partial data, CREATE_IF_NOT_EXISTS is recommended. This lets Synthesized create new schemas as they are required.

If you have manually prepared the database, DO_NOT_CREATE can be used to skip schema creation.

DROP_AND_CREATE

  1. Drops the existing schemas in the target database

  2. Creates the new schemas

  3. Fully recreates all objects

CREATE_IF_NOT_EXISTS

  1. Calculates the difference between input and output schemas

  2. Creates each schema if it doesn’t exist

  3. Copies any missing schemas from input to output

DO_NOT_CREATE

  1. Does not create schemas in the target database

  2. Verifies that required schemas already exist

  3. Throws an exception if any required schema is missing

CREATE

  1. Verifies that the target database is empty

  2. Creates a new schema in the target database

  3. Throws an exception if objects already exist