Schema Creation
Synthesized can only insert data into the target database if the target schemas are compatible with the input schemas. Schema creation is the stage in the data processing pipeline responsible for modifying the target database schemas to match the source database.
While most database objects can be processed, some features must be manually configured for certain databases.
Synthesized offers various schema creation modes that allow you to handle your use cases.
Prerequisites
Before running a Synthesized workflow for the first time on a new data source:
-
Check that the username Synthesized is using has the appropriate access rights.
-
Check that the version of the two databases is the same.
If the Synthesized user does not have the appropriate permissions or there is a version mismatch, Synthesized may not
be able to copy the schemas. In such cases, the database should be prepared manually and
DO_NOT_CREATE
mode should be used.
Best practices
Before running a Synthesized workflow for the first time, please check the following things:
-
If there is already data in the target database, create a backup.
-
If other workflows are generating data in the same target database, use either
CREATE_IF_NOT_EXISTS
mode orDO_NOT_CREATE
mode.
Supported schema objects
Certain databases have functionality that is not able to be copied automatically. This table shows which features work with each database type.
Feature | MSSQL | Postgres | Oracle | MySQL | SQLite | DB2 |
---|---|---|---|---|---|---|
Schemas |
||||||
Tables |
||||||
Indexes |
||||||
Constraints |
||||||
Sequences |
||||||
Primary Keys |
||||||
Foreign Keys |
||||||
Triggers |
||||||
Domains |
||||||
Views |
||||||
Custom Data Types |
||||||
Computed Columns |
||||||
Enums |
Schema creation modes
To control how Synthesized prepares the target schema, use the schema_creation_mode
flag. E.g.
By default, DROP_AND_CREATE
is recommended, particularly when using Synthesized for MASKING
. This ensures
Synthesized is always operating on fresh schemas. However, this is only appropriate if all schema objects are supported.
In systems where you are generating partial data, CREATE_IF_NOT_EXISTS
is recommended. This lets Synthesized create
new schemas as they are required.
If you have manually prepared the database, DO_NOT_CREATE
can be used to skip schema creation.
DROP_AND_CREATE
-
Drops the existing schemas in the target database
-
Creates the new schemas
-
Fully recreates all objects
CREATE_IF_NOT_EXISTS
-
Calculates the difference between input and output schemas
-
Creates each schema if it doesn’t exist
-
Copies any missing schemas from input to output