Changelog
Version 1.36.1
28 Nov 2023
enhancement Partial database processing enhancement.
Implemented support for an ignore filter in table processing and introduced a caching strategy for improved performance in table size calculation.
enhancement Enhance handling of MSSQL timestamp data type.
The timestamp (Transact-SQL) data type is just an incrementing number and does not preserve a date or a time.
Version 1.36.0
27 Nov 2023
feature Financial Data Generator.
Introducing the new Finance Generator.
Choose from a variety of templates, including:
- ${credit_card}
- ${bic}
- ${iban}
- ${nasdaq_ticker}
- ${nyse_ticker}
- ${stock_market}
- ${us_routing_number}
Configure specific card types with templates such as:
- ${credit_card.visa}
- ${credit_card.mastercard}
- ${credit_card.discover}
- ${credit_card.american_express}
- ${credit_card.diners_club}
- ${credit_card.jcb}
- ${credit_card.switch}
- ${credit_card.solo}
- ${credit_card.dankort}
- ${credit_card.forbrugsforeningen}
- ${credit_card.laser}
Example:
transformations:
  - columns: [ "credit_card" ]
    params:
      type: finance_generator
      column_templates: [ "${credit_card.visa}" ]
feature Realistic Text Column Generation.
TDK now intelligently applies heuristics based on column and table names, allowing for more suitable default transformation selections.
As a result, Person Generator, Address Generator and Finance Generator can be chosen by default in GENERATION mode.
This behavior is enabled by default but can be disabled using the use_text_column_heuristics property.
feature Foreign Key generation with Poisson distribution.
The foreign key generator can now produce foreign key relationships whose links follow a Poisson distribution, which represents real-world data structures more accurately.
This is now the default behavior, replacing the previous default of ROUND_ROBIN.
ROUND_ROBIN remains available for use if preferred.
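The difference between the two distributions can be illustrated with a minimal Python sketch. It is not TDK code: the function names and the way child rows are fitted to an exact count are assumptions made for illustration, and the Poisson draw uses Knuth's textbook method.

```python
import math
import random
from collections import Counter


def poisson_sample(rng: random.Random, lam: float) -> int:
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def round_robin_links(parent_ids, n_children):
    """Cycle through parents so each gets an (almost) equal number of children."""
    return [parent_ids[i % len(parent_ids)] for i in range(n_children)]


def poisson_links(parent_ids, n_children, rng):
    """Give each parent a Poisson-distributed number of children (hypothetical scheme)."""
    lam = n_children / len(parent_ids)
    links = []
    for pid in parent_ids:
        links.extend([pid] * poisson_sample(rng, lam))
    rng.shuffle(links)
    # Pad with random parents, then trim, so exactly n_children keys are returned.
    while len(links) < n_children:
        links.append(rng.choice(parent_ids))
    return links[:n_children]


parents = list(range(1, 11))
rr = Counter(round_robin_links(parents, 50))
po = Counter(poisson_links(parents, 50, random.Random(42)))
print("round robin children per parent:", sorted(rr.values()))
print("poisson children per parent:    ", sorted(po.get(p, 0) for p in parents))
```

With round robin every parent ends up with the same number of children, while the Poisson variant produces a spread of counts around the mean, including parents with few or no children.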
enhancement New Templates for Person and Address Generators.
New templates are now supported for Person Generator:
- ${full_name} - First name and Last name
- ${company} - Company name
- ${phone_national} - The phone number in domestic format
- ${phone_international} - The phone number in international format
- ${ssn} - U.S. Social Security Number (SSN)
New templates for Address Generator:
- ${full_address}
- ${street_address}
- ${region}
- ${latitude}
- ${longitude}
- ${coordinates}
- ${time_zone}
enhancement Default Value Truncation for Fake Generators.
length_exceeded_mode: TRUNCATE is now the default for person_generator, address_generator and finance_generator.
This ensures generated values are truncated to the column’s max length instead of causing overflow errors.
enhancement Global Locale Setting.
The Default Locale property can be set globally for person_generator, address_generator and finance_generator.
This setting can be overridden in the table-level configuration.
enhancement Extended Locales Support.
The list of supported locales for person_generator, address_generator and finance_generator has been expanded.
enhancement Enhance handling of MSSQL Server user-defined data type aliases.
Enhanced support for user-defined data type aliases, allowing these types to be processed by the TDK.
Version 1.35.2
10 Nov 2023
enhancement Support PostgreSQL JSONB type for json_pointer_transformer
The JSON Pointer transformer can now process the PostgreSQL JSONB type.
enhancement Java 17 to be the minimum required version
TDK now requires Java 17 or later to run.
Update your JDK to be able to use the latest versions of TDK.
No changes required for Docker and Kubernetes users.
Version 1.35.1
6 Nov 2023
enhancement Support SQL Server text and ntext data types
Although text and ntext are deprecated, they may be present in production schemas.
These types can now be processed by the TDK.
Version 1.35.0
6 Nov 2023
feature Scripting transformer
The new Scripting Transformer lets you write custom scripts in the JavaScript programming language.
This makes it possible to extend the TDK and add any specific logic.
The Scripting Transformer can be applied to any column in the GENERATION and MASKING modes.
feature Preserve Foreign Key distribution
The foreign key generator now allows you to preserve the original relationship of links, which can significantly improve data quality.
This feature can be enabled by specifying the distribution ORIGINAL for the foreign key columns:
tables:
  - table_name_with_schema: "public.orders"
    transformations:
      - columns: [ "customer_id" ]
        params:
          type: "foreign_key_generator"
          distribution: ORIGINAL
Version 1.34.0
Version 1.32.0
22 Aug 2023
feature TDK is now available on GCP Marketplace
The TDK is now available on the GCP Marketplace. Getting started page https://console.cloud.google.com/marketplace/product/synthesized-marketplace-public/synthesized-tdk
Version 1.31.0
Version 1.30.0
28 Jul 2023
feature Hashicorp Vault as a secret manager
Added support for Hashicorp Vault secret manager.
enhancement Usability improvements for the date generator
Before this release, the std option of the date generator in GENERATION mode had to be set in milliseconds.
Now the following formats are supported:
- ISO-8601 Duration format, e.g., P1DT2H3M4.058S.
- The concise format described here, e.g., 10s, 1h 30m or -(1h 30m).
- Milliseconds without the specific unit, e.g., 12534.
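To see how the three notations relate, the following Python sketch converts each of the example values above to milliseconds. It is only an illustration, not the TDK parser: the function name is hypothetical, and the unit set accepted for the concise format (ms, s, m, h, d) is an assumption, since that format is specified elsewhere.

```python
import re
from decimal import Decimal

# ISO-8601 durations limited to days, hours, minutes and seconds.
ISO_8601 = re.compile(
    r"P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?"
)
# Assumed units for the concise format.
UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}
CONCISE = re.compile(r"(?:\d+\s*(?:ms|s|m|h|d)\s*)+")
CONCISE_PART = re.compile(r"(\d+)\s*(ms|s|m|h|d)")


def to_millis(text: str) -> int:
    """Convert a duration written in one of the three notations to milliseconds."""
    text = text.strip()
    if re.fullmatch(r"[0-9]+", text):
        return int(text)  # plain number: already milliseconds
    iso = ISO_8601.fullmatch(text)
    if iso and any(iso.groupdict().values()):
        parts = {k: Decimal(v or "0") for k, v in iso.groupdict().items()}
        total = (parts["days"] * 86_400_000 + parts["hours"] * 3_600_000
                 + parts["minutes"] * 60_000 + parts["seconds"] * 1_000)
        return int(total)
    sign = 1
    if text.startswith("-"):
        sign, text = -1, text[1:].strip()
    if text.startswith("(") and text.endswith(")"):
        text = text[1:-1].strip()
    if not CONCISE.fullmatch(text):
        raise ValueError(f"unrecognised duration: {text!r}")
    return sign * sum(int(n) * UNIT_MS[u] for n, u in CONCISE_PART.findall(text))


for example in ["P1DT2H3M4.058S", "10s", "1h 30m", "-(1h 30m)", "12534"]:
    print(example, "->", to_millis(example), "ms")
```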
bugfix Fix UUID transformer behaviour in masking mode
Starting from release v1.17, the UUID transformer worked as a pure generator and never took input values into account. It was therefore of no use when processing FK-connected columns. This release restores the correct behaviour for the masking mode.
Version 1.29.0
04 Jul 2023
feature DEFER_FOREIGN_KEY Cycle Resolution Strategy
New DEFER_FOREIGN_KEY Cycle Resolution Strategy: when selected, all FK references are preserved, but the ones that lead to cycles are disabled during masking and re-enabled after the data is inserted.
This strategy is suitable for databases with a cyclic schema and works only in MASKING mode without subsetting.
enhancement Using the Kubernetes TTL mechanism to delete completed pods
The TDK CLI Helm chart now supports the ttlSecondsAfterFinished property. It allows Kubernetes pods to be removed a specified number of seconds after they have completed.
enhancement Bring back views for MySQL DDL copying
Copy DDL for views was previously disabled for MySQL.
enhancement Show all found errors happening during the effective configuration creation
The process of creating the effective configuration is no longer aborted after the first detected error.
Version 1.28.0
15 Jun 2023
feature New Tutorials
New tutorials for Masking, Generation, Subsetting and Data Filtering.
bugfix Handling zero date
For more information, refer to java.sql.SQLException: Zero date value prohibited page.
Version 1.27
31 May 2023
bugfix Fix https://github.com/synthesized-io/pagila-tdk-demo failure during startup on ARM-based machines
Version 1.26
10 May 2023
enhancement Getting started page now uses Pagila Docker-compose demo
Version 1.25
21 Apr 2023
Version 1.25 of the Synthesized TDK.
feature Subsetting mode is now available in the free version of TDK
Starting from this version, the Subsetting mode is available in the free version of TDK. For more details about the Subsetting mode, please see here.
Version 1.24
04 Apr 2023
Version 1.24 of the Synthesized TDK.
feature TDK is now available on AWS Marketplace
The TDK is now available on the AWS Marketplace with Docker, ECS Fargate, and Helm charts delivery options. See more details here: AWS Marketplace.
feature TDK Docker container is now available
TDK can now be launched not only via the command line interface but also as a Docker image, see details here: Docker.
Version 1.23
7 Mar 2023
Version 1.23 of the Synthesized TDK.
feature AWS S3 support for configuration loading
To read the configuration file from AWS S3, enable the TDK_AWS_ENABLED property, see Application properties.
feature AWS Secrets Manager support
Database credentials can be requested from AWS Secrets Manager:
{
  "type": "aws",
  "secret": "${SECRET_ID}",
  "version": "${VERSION_ID}"
}
Where:
- type: password provider type
- secret: the ARN or name of the secret to retrieve
- version (optional): the unique identifier of the version of the secret to retrieve. If you don’t specify the version, then the AWSCURRENT version is used.
Version 1.22
14 Feb 2023
Version 1.22 of the Synthesized TDK.
feature Ability to provide Foreign Keys in the yaml configuration file
By default, TDK preserves referential integrity based on the foreign keys in the source database schema. This release adds the ability to provide additional foreign keys in the yaml configuration file.
For example, if order.user_id is a foreign key referring to user.id, and it’s not defined in the database schema, then the following configuration can be provided:
default_config:
  mode: "MASKING"
  target_ratio: 0.5
metadata:
  tables:
    - table_name_with_schema: "public.order"
      foreign_keys:
        fk_user_order:
          referred_schema: "public"
          referred_table: "user"
          columns:
            - column: "user_id"
              referred_column: "id"
More details about additional foreign keys can be found in the Configuration reference section.
enhancement Performance Improvements
Significant performance improvement for the subsetting mode.
To enable the performance improvement, the following application property should be set:
TDK_WORKINGDIRECTORY_ENABLED=true
TDK_WORKINGDIRECTORY_PATH=/home/tdk/working-directory
enhancement Constant generators in the relaxed mode
In the RELAXED mode, the constant generators constant_numeric, constant_string, constant_date and constant_boolean will be chosen by default where the source column contains the same value in all rows.
enhancement FAQ page
FAQ page in the documentation.
Version 1.21
31 Jan 2023
Version 1.21 of the Synthesized TDK.
feature Safety Mode
By default, STRICT mode is enabled. If no suitable transformation is found, then passthrough, null_generator and categorical_generator will not be chosen by default.
This change breaks compatibility with the previous version.
To keep the behavior of previous versions, you can use the RELAXED mode.
feature Data Filtering
Data Filtering feature.
feature Multiple Database
Multiple Database support.
Version 1.20
13 Jan 2023
Version 1.20 of the Synthesized TDK.
feature YAML anchors and aliases support
YAML anchors and aliases support to reduce repeated sections in the configuration.
enhancement Download page
The latest free TDK CLI version is currently available in the documentation for download.
enhancement GitHub Actions Integration
GitHub Actions Integration page for the free TDK CLI version in the documentation.
enhancement Mutually exclusive options have been turned into separate commands
The --dry-run, --default-config, --json-schema and --license-expiration options have been turned into separate commands:
.....
Commands:
help Display help information about the specified command.
default-config, dc Print built-in default configuration (can be
overridden in the user config)
dry-run, dr Print the effective configuration instead of running
the transformation to the console (by default) or
to the file (specified as `-ec` parameter value)
json-schema, js Print json schema for configuration YAML file
license-expiration, le Print the expiration date of the license key
enhancement Ability to save effective config into custom file
The dry-run command prints the effective configuration to the console (by default) or to the file (specified as the -ec or --effective-config-file parameter value).
enhancement Added help command to display detailed information about each command
For example, tdk.jar help dry-run:
Usage: engine-lite dry-run [-ec=<effective-config-file>]
Print the effective configuration instead of running the transformation to the
console (by default) or to the file (specified as `-ec` parameter value)
-ec, --effective-config-file=<effective-config-file>
Effective configuration file
Version 1.19
19 Dec 2022
Version 1.19 of the Synthesized TDK.
bugfix Oracle identifier is too long error when working with Oracle database
Fixed the behaviour where the ORA-00972: identifier is too long error appeared when working with Oracle DB instances.
Version 1.18
9 Dec 2022
Version 1.18 of the Synthesized TDK.
feature Mapping expressions on read and on write
Allows transforming columns as they are being read from the input database and written to the output database.
For example, a column might need a cast to a different type on read and then a cast back to the original type on write. The following configuration might be provided to address that:
tables:
  - table_name_with_schema: "test_schema.test_table"
    mode: "MASKING"
    transformations:
      - columns: ["my_binary_column"]
        mapping:
          read: "cast(? as char)"
          write: "cast(? as binary)"
The my_binary_column will be cast to the char type on read, and the result of the transformation will be cast back to the binary type on write.
feature Support for unsigned numeric types
TDK is now aware of unsigned numeric types supported by some DBMS (MySQL, Oracle).
Version 1.17
28 Nov 2022
Version 1.17 of the Synthesized TDK.
feature Filter Schemas
Introduced the ability to define the list of schemas to process:
schemas: array of String. If not set or null, all schemas available to the source database user will be processed.
Example:
default_config:
  mode: GENERATION
  target_ratio: 2.0
schemas: ["accounts", "payments"]
schema_creation_mode: CREATE_IF_NOT_EXISTS
table_truncation_mode: TRUNCATE
feature JSON transformer
json_pointer_transformer transforms the JSON value nodes indicated by JSON pointers; the rest of the values are kept as is:
transformations:
  - columns: ["productspec"]
    params:
      type: "json_pointer_transformer"
      specifications:
        - pointers: [ "/sku" ]
          transformation:
            type: "format_preserving_hashing"
        - pointers: [ "/tags/0" ]
          transformation:
            type: "format_preserving_hashing"
      ignore_errors: true
Refer to JSON Pointer Transformer for more details.
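The idea of addressing individual nodes by pointer can be sketched in Python as follows. This is not the TDK implementation: `mask` is a hypothetical stand-in for a real transformation, `transform_at` is an illustrative helper, and treating ignore_errors as "skip pointers that do not resolve" is an assumption about its meaning. Pointers are resolved following RFC 6901 and must address a child node, not the document root.

```python
import hashlib
import json


def mask(value):
    """Hypothetical stand-in for a real transformer: a short hex digest of the value."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()[:8]


def transform_at(document, pointer, fn, ignore_errors=False):
    """Replace the node addressed by a JSON pointer with fn(node), in place."""
    # Split "/tags/0" into ["tags", "0"] and undo the ~1 and ~0 escapes.
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in pointer.split("/")[1:]]
    try:
        parent = document
        for token in tokens[:-1]:
            parent = parent[int(token)] if isinstance(parent, list) else parent[token]
        last = int(tokens[-1]) if isinstance(parent, list) else tokens[-1]
        parent[last] = fn(parent[last])
    except (KeyError, IndexError, ValueError, TypeError):
        # The pointer does not resolve in this document.
        if not ignore_errors:
            raise


row = json.loads('{"sku": "AB-1234", "name": "Desk", "tags": ["oak", "office"]}')
for pointer in ["/sku", "/tags/0", "/missing"]:
    transform_at(row, pointer, mask, ignore_errors=True)
print(json.dumps(row))
```

Only the values at /sku and /tags/0 change; name and the second tag are kept as is, and the unresolved /missing pointer is skipped.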
enhancement Better tuned default rules for generation from empty database
Introduced more empirical default rules for generation from an empty database. If a source table is empty, the generated field is not null, and no user configuration is provided for this field, then reasonable defaults for generators are chosen for the following data types:
- DATE: a random date from 1970-01-01 to 2030-01-01
- NUMERIC: a random integer from 1 to 100
- ANY (blobs and binary arrays): a single random byte
Version 1.16
17 Nov 2022
Version 1.16 of the Synthesized TDK.
enhancement Improved Yaml Configuration Structure
This change breaks compatibility with the previous version.
The following yaml configuration parameters have been renamed:
- column_params → transformations
- user_table_configs → tables
Example before:
default_config:
  mode: MASKING
  target_ratio: 1.0
user_table_configs:
  - table_name_with_schema: "public.delivery"
    column_params:
      - columns: ["status"]
        params:
          type: categorical_generator
Example after:
default_config:
  mode: MASKING
  target_ratio: 1.0
tables:
  - table_name_with_schema: "public.delivery"
    transformations:
      - columns: ["status"]
        params:
          type: categorical_generator
Version 1.15
11 Nov 2022
Version 1.15 of the Synthesized TDK.
feature Testcontainers Integration
By combining Testcontainers with Synthesized TDK, developers can populate any Testcontainers database with synthetically generated data, enabling rapid development of tests for logic which involves interaction with the database. Refer to documentation for more details.
feature Boost performance using working directory on a local file system
A transient local storage area can now be configured to speed up TDK operations. Refer to documentation for more details.
feature --json-schema parameter for CLI
Introduced the --json-schema parameter for the CLI, which prints the JSON Schema for the YAML configuration. This schema can be used in an IDE to provide auto-completion for your YAML and to validate your configuration before a run.
feature Output alphabets in format_preserving_hashing
Introduced the ability to define the output alphabets in format_preserving_hashing with unicode_block and unicode_range.
Refer to custom alphabets for more details.
enhancement Better tuned default rules for generation from empty database
Introduced more empirical default rules for generation from empty database.
enhancement Oracle user permissions
Added the minimum database permissions required to run Synthesized TDK with Oracle database, see Oracle permissions.
enhancement Preserve null values for all masking transformers
Preserve null values for all masking transformers.
bugfix Make CategoricalGenerator’s masking mode the same as in generation
Fixed the behaviour where CategoricalGenerator didn’t preserve the input probabilities in masking mode.
Version 1.14
30 Sep 2022
Version 1.14 of the Synthesized TDK.
feature Significant performance improvement
Significant performance improvement for MASKING and GENERATION modes for all supported databases.
feature GENERATION mode for empty tables
To use GENERATION mode for empty tables, specify target_row_number at the global or table level:
target_row_number: optional Integer (int64).
The absolute size of the output table in rows.
This parameter is applicable only for GENERATION mode.
If not provided, target_ratio will be used.
Version 1.13
16 Sep 2022
Version 1.13 of the Synthesized TDK.
enhancement Improved performance of all transformations by up to 30%
Improved performance of all transformations by up to 30%.
enhancement Improved data quality and reduced memory consumption of date_generator
Improved data quality and reduced memory consumption of date_generator.
Version 1.12
2 Sep 2022
Version 1.12 of the Synthesized TDK.
enhancement New unique_hashing algorithm
New unique_hashing algorithm:
- provides random bijective (one-to-one) mapping between unique identifiers of the input and the output databases
- prevents key collisions: unique input keys are mapped to unique output keys
- 2x performance improvement in MASKING scenarios due to the absence of key collisions.
enhancement Microsoft SQL Server IDENTITY property support
Microsoft SQL Server IDENTITY property support.
Version 1.11
23 Aug 2022
Version 1.11 of the Synthesized TDK.
enhancement New format_preserving_hashing Algorithm
New format_preserving_hashing algorithm:
- supports Unicode’s Basic Multilingual Plane in input and output data
- maximum length of input text is increased to 2^32 characters
- performance increased by 30-60% (the boost may vary depending on the message size and hashing groups configuration).
enhancement Required Database Permissions
Database permissions page describing the minimum database permissions required to run Synthesized TDK.
enhancement length_exceeded_mode for fake generators
A new parameter, length_exceeded_mode, is added for address_generator and person_generator. It allows generated values to be truncated to the column size.
For example, if the country column is varchar(20), and the generated value is "Saint Vincent And The Grenadines":
- length_exceeded_mode: IGNORE (default) fails to insert "Saint Vincent And The Grenadines" into the varchar(20) column
- length_exceeded_mode: TRUNCATE inserts the truncated value "Saint Vincent And Th" into the varchar(20) column.
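The two modes can be mimicked with a few lines of Python. This is a minimal sketch rather than TDK code: the function name `fit_to_column` is hypothetical, and the IGNORE branch simply returns the value unchanged, leaving it to the database to reject the insert.

```python
def fit_to_column(value: str, max_length: int, length_exceeded_mode: str = "IGNORE") -> str:
    """Apply the length_exceeded_mode rule to one generated value."""
    if length_exceeded_mode == "TRUNCATE":
        # Cut the value down to the declared column size.
        return value[:max_length]
    if length_exceeded_mode == "IGNORE":
        # Pass the value through unchanged; an over-long value may fail on insert.
        return value
    raise ValueError(f"unknown length_exceeded_mode: {length_exceeded_mode}")


country = "Saint Vincent And The Grenadines"
print(repr(fit_to_column(country, 20, "TRUNCATE")))
print(repr(fit_to_column(country, 20, "IGNORE")))
```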
Version 1.10
12 Aug 2022
Version 1.10 of the Synthesized TDK.
feature BigID integration
For more information, see BigID integration.
feature CREATE_IF_NOT_EXISTS support for Microsoft SQL Server
CREATE_IF_NOT_EXISTS schema creation mode support for Microsoft SQL Server.
feature Autogenerated Documentation
Transformations and YAML configuration are now autogenerated and up-to-date.
enhancement Quick Start
Getting Started and Installation sections in the documentation with H2 demo database.
enhancement noising and continuous_generator enhancement
Columns with a single value do not fail with an error, but are kept as-is for MASKING mode and filled with nulls for GENERATION.
enhancement Performance improvement for fake generators
Major performance improvement for address_generator and person_generator.
Version 1.9
29 Jul 2022
Version 1.9 of the Synthesized TDK.
feature Advanced format_preserving_hashing configuration
A hash transformation is applied to each character of a given text that belongs to a configured group, so that the output preserves the format but contains different characters. This transformation is secure and non-reversible.
Parameters:
- groups: List<FormatPreservingHashingGroup>: Pairs of a selector and a list of alphabets. The selector is used to choose characters from the input string; an alphabet is a set of characters used to replace the source ones.
- filter: Filters are used to mask only a specified substring and keep other characters as is (e.g., mask only the last 5 characters).
Available character selectors:
- numeric
- lower_letters
- upper_letters
- regex
Available alphabets:
- numeric
- lower_letters
- upper_letters
- custom
Available filters:
- first - Mask only the first n characters.
- last - Mask only the last n characters.
- characters - Mask only specified characters. Parameters: characters - set of characters to mask, ignore_case (default: false) - indicates if case is taken into account.
- substring - Mask all occurrences of the specified substring. Parameters: substring - substring to mask, ignore_case (default: false) - indicates if case is taken into account.
- regex - Mask only characters matched by the specified regex pattern. Parameters: pattern - regex pattern to find characters to mask, ignore_case (default: false) - indicates if case is taken into account.
For more information, see Transformations.
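The general idea of replacing characters while keeping the format can be sketched in Python. Note that this is a different, much simpler technique than the TDK algorithm: it derives each replacement character from an HMAC-SHA256 digest, and the function name, the key handling and the `last` argument (mimicking the last filter) are all assumptions made for illustration.

```python
import hashlib
import hmac
import string

# Selector and output alphabet are the same set in this sketch.
ALPHABETS = {
    "numeric": string.digits,
    "lower_letters": string.ascii_lowercase,
    "upper_letters": string.ascii_uppercase,
}


def hash_preserving_format(text: str, key: bytes, last=None) -> str:
    """Replace digits and ASCII letters with same-class characters derived from an HMAC."""
    # With last=n only the final n characters are eligible for masking.
    start = 0 if last is None else max(len(text) - last, 0)
    out = []
    for i, ch in enumerate(text):
        alphabet = next((a for a in ALPHABETS.values() if ch in a), None)
        if i < start or alphabet is None:
            out.append(ch)  # outside the filter or not in any group: keep as is
            continue
        # The digest depends on the position and the whole input, not just on ch.
        digest = hmac.new(key, f"{i}:{text}".encode("utf-8"), hashlib.sha256).digest()
        out.append(alphabet[int.from_bytes(digest[:8], "big") % len(alphabet)])
    return "".join(out)


print(hash_preserving_format("AB-1234-xy", b"demo-key"))
print(hash_preserving_format("AB-1234-xy", b"demo-key", last=2))
```

The output keeps the length, the separators and the character class at each position, and the same input and key always produce the same output.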
feature constant_numeric, constant_date, constant_string, constant_boolean generators
Added new constant generators for numeric, date, string, boolean data types.
For more information, see Transformations.
feature Numeric type for categorical_generator
categorical_generator supports numeric columns.
For more information, see Transformations.
Version 1.8
8 Jul 2022
Version 1.8 of the Synthesized TDK.
enhancement New Documentation
In addition to the nice appearance, many pages and yaml examples are now generated automatically from source code and tests, which reduces the number of mistakes and allows the documentation to be up-to-date with the product version.
Enjoy!
enhancement Performance Improvement for MASKING
This release includes a significant performance improvement for MASKING mode with target_ratio: 1.0.
bugfix Handle Empty Tables
Fixed issues with processing empty tables in MASKING and GENERATION modes.
Version 1.7
17 Jun 2022
Version 1.7 of the Synthesized TDK.
feature Custom Database Types Support
To support custom database types:
- Use an output database with an already created schema and its child objects, see DO_NOT_CREATE in YAML configuration for more details.
- Explicitly define a generator for the custom type column in the configuration file.
For example, for the following custom ENUM type:
CREATE TYPE public.transaction_type_t AS ENUM ('SENT', 'RECEIVED');
Use a configuration like this:
transformations:
  - columns:
      - "transaction_type"
    params:
      type: "categorical_generator"
      categories:
        type: string
        values:
          - "SENT"
          - "RECEIVED"
      probabilities:
        - 0.6
        - 0.4
For more information, see Custom database types.
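What such a configuration asks for, drawing each value from a fixed set with given probabilities, can be illustrated with Python's standard library. This is only a sketch of weighted sampling, not the TDK generator; the seed and sample size are arbitrary choices.

```python
import random
from collections import Counter

values = ["SENT", "RECEIVED"]
probabilities = [0.6, 0.4]

# A seeded generator makes the run reproducible.
rng = random.Random(42)
column = rng.choices(values, weights=probabilities, k=10_000)

counts = Counter(column)
print({v: counts[v] / len(column) for v in values})
```

Over many rows the observed shares approach the configured probabilities, while every generated value remains a valid member of the ENUM.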
feature Constant Generator
Generates a single numeric value for the entire column.
Parameters:
- value: Number?: numeric value to generate
Compatible modes: GENERATION, MASKING
Compatible column data types: NUMERIC
Supports multiple columns: No
Example:
transformations:
  - columns: [ "balance" ]
    params:
      type: "constant"
      value: 0.0
For more information, see Transformations.
feature BIGINT and SMALLINT Support
BIGINT and SMALLINT data type support for GENERATION, MASKING, and KEEP modes.
feature Global Seed Parameter
global_seed sets the seed for random number generators.
A 32-bit integer value between -2147483648 and 2147483647, used as a seed for random number generators. The result of generation is the same each time the generation is run with the same seed and workflow configuration. By default, global_seed is 0.
Example:
default_config:
  mode: "MASKING"
  target_ratio: 1.0
global_seed: 42
For more information, see YAML configuration.
Version 1.6
10 Jun 2022
Version 1.6 of the Synthesized TDK.
enhancement Performance Improvements
This release includes significant rework of transformation execution internals, bringing the following benefits to end users:
- Heavy parallelization of transformations and database operations. To the extent the logic of a transformation permits, operations are performed in parallel. That results in better hardware utilization and reduced latencies.
- Memory consumption optimization. The solution can now handle tables whose size noticeably exceeds the main memory size of the process itself.
Version 1.5
Version 1.4
7 Jun 2022
Version 1.4 of the Synthesized TDK.
feature License Expiration API endpoint
The license expiration can be requested via API:
curl -X 'GET' \
'http://${API_SERVICE_URL}:${API_SERVICE_PORT}/api/v1/license-expiration' \
-H 'accept: */*'
Where:
- API_SERVICE_URL is the endpoint of the service. If running locally, this will likely be localhost
- API_SERVICE_PORT is the port exposed for the service. The default port is 8081.
If the service is up and running correctly, you should receive a 200 status with the body containing information like:
{"expiry_date":"2023-06-01"}
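A client could consume this endpoint roughly as in the Python sketch below. The helper names are hypothetical, and the response is assumed to contain exactly the expiry_date field in ISO date format, as in the sample body above. `fetch_expiry` is defined but not called here, since it needs a running service.

```python
import json
from datetime import date
from urllib.request import urlopen


def parse_expiry(body: str) -> date:
    """Extract the expiry date from a response body like the sample above."""
    return date.fromisoformat(json.loads(body)["expiry_date"])


def fetch_expiry(host: str = "localhost", port: int = 8081) -> date:
    """Call the license expiration endpoint and return the parsed date."""
    url = f"http://{host}:{port}/api/v1/license-expiration"
    with urlopen(url, timeout=10) as response:
        return parse_expiry(response.read().decode("utf-8"))


def days_left(expiry: date, today: date) -> int:
    """Number of days between today and the expiry date (negative if expired)."""
    return (expiry - today).days


expiry = parse_expiry('{"expiry_date":"2023-06-01"}')
print(expiry.isoformat(), days_left(expiry, date(2023, 5, 1)))
```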
Version 1.3
20 May 2022
Version 1.3 of the Synthesized TDK.
feature Google Secret Manager Integration
The database credentials can be provided from Google Secret Manager:
"password": { "type": "gcp", "project": "${GCP_PROJECT_ID}", "secret": "${SECRET_ID}", "version": "${VERSION_ID}" }
feature Append Data
A new table_truncation_mode:
- IGNORE: if this mode is selected, the status of the output database is ignored.
It allows existing data in the output database to be kept, with newly generated data appended to it.
For more information, see YAML configuration.
feature Locale For Address and Person Generators
- locale: String = 'en-GB': To generate names and addresses from different geographical areas, the user can change this parameter. Defaults to 'en-GB', which corresponds to British names.
Supported locales:
bg
ca
ca-CAT
da-DK
de
de-AT
de-CH
en
en-AU
en-au-ocker
en-BORK
en-CA
en-GB
en-IND
en-MS
en-NEP
en-NG
en-NZ
en-PAK
en-SG
en-UG
en-US
en-ZA
es
es-MX
fa
fi-FI
fr
he
hu
in-ID
it
ja
ko
nb-NO
nl
pl
pt
pt-BR
ru
sk
sv
sv-SE
tr
uk
vi
zh-CN
zh-TW
For more information, see Transformations.
enhancement Null Generator by Default
For currently unsupported types, such as the XML datatype, null_generator will be used by default.
enhancement Stop Workflow API Endpoint
Added ways to stop the workflow using workflow_id and workflow_run_id. Improved error handling.
enhancement Ability to Process a Subset of Tables
Removed the comparison between the input and output schemas. This makes it possible to process a subset of the input tables.
Version 1.2
29 Apr 2022
Version 1.2 of the Synthesized TDK.
feature Schema Truncation Mode
There are two table truncation modes:
- DO_NOT_TRUNCATE: (default) if this mode is selected, tables in the output database won’t be truncated. An empty output database is required.
- TRUNCATE: if this mode is selected, tables in the output database will be truncated.
Usage example for table_truncation_mode:
default_config:
  mode: "GENERATION"
  target_ratio: 1.0
table_truncation_mode: "TRUNCATE"
feature Support CHAR Primary Keys
MASKING mode for tables with CHAR primary keys can be used without any additional configuration. In the previous versions, the passthrough transformation was used as a workaround.
feature Support Composite Keys
Composite primary and foreign keys can be automatically handled without any additional configuration. In the previous versions, foreign_key_generator was used as a workaround.
enhancement Advanced Subsetting
Advanced subsetting implementation for KEEP and MASKING modes. In the previous versions, some of the tables were empty after subsetting.
enhancement CLI Parameters
Changed CLI parameters from camelCase to kebab-case:
Usage: engine-lite [-hV] [-c=<config-file>] [-ip=<input-password>]
-iu=<input-url> [-iU=<input-username>]
[-op=<output-password>] -ou=<output-url>
[-oU=<output-username>]
TDK engine lite.
-c, --config-file=<config-file>
Configuration file
-h, --help Show this help message and exit.
-ip, --input-password=<input-password>
Input password, default to null
-iu, --input-url=<input-url>
JDBC URL to the INPUT database
-iU, --input-username=<input-username>
Input username, default to null
-op, --output-password=<output-password>
Output password, default to null
-ou, --output-url=<output-url>
JDBC URL to the OUTPUT database
-oU, --output-username=<output-username>
Output username, default to null
-V, --version Print version information and exit.
Version 1.1
15 Apr 2022
Version 1.1 of the Synthesized TDK.
feature Schema creation mode
There are four schema creation modes:
- CREATE_IF_NOT_EXISTS: (default) if this mode is selected, DDL schema will be copied from the source database to the target one if it does not exist, existing schema will be used otherwise.
- DO_NOT_CREATE: if this mode is selected, existing schema will be used.
- CREATE: if this mode is selected, DDL schema will be copied from the source database to the target one. The target database should be empty.
- DROP_AND_CREATE: if this mode is selected, DDL schema will be copied from the source database to the target one. Existing schema in the target database will be dropped. Please use this mode carefully.
Note: If the CREATE_IF_NOT_EXISTS or DO_NOT_CREATE mode is used, the target schema should be equal to the source one.
feature Address generator
Generate address fields (e.g. street, zip code) and keep them consistent across columns.
Parameters:
- column_templates: List<String>: For each column, the template to be used to generate address data
- consistent_with_column: String?: If given, the column to stay consistent with. For example, if consistent_with_column="user_id", all people with the same user_id will have the same street
Available templates are:
- ${zip_code}
- ${country}
- ${city}
- ${street_name}
- ${house_number}
- ${flat_number}
Compatible modes: GENERATION, MASKING, KEEP
Compatible column data types: STRING
Supports multiple columns: Yes
Example for multiple columns:
transformations:
  - columns: ["street_name", "zip_code"]
    params:
      type: "address_generator"
      column_templates: ["${street_name}", "${zip_code}"]
Example for a single column:
transformations:
  - columns: ["address"]
    params:
      type: "address_generator"
      column_templates: ["${country}, ${city}, ${street_name}, ${house_number}, ${flat_number}, ${zip_code}"]
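How one generated address fills several column templates consistently can be sketched in Python. The `string.Template` class happens to use the same ${name} placeholder syntax, which makes it convenient here; this is not how the TDK renders templates, and the address values and the `render` helper are made up for illustration.

```python
from string import Template

# One illustrative address; a real generator would produce these fields.
address = {
    "country": "United Kingdom",
    "city": "London",
    "street_name": "Baker Street",
    "house_number": "221",
    "flat_number": "2",
    "zip_code": "NW1 6XE",
}


def render(column_templates, fields):
    """Fill every column template from the same set of address fields."""
    return [Template(t).substitute(fields) for t in column_templates]


# Multiple columns: each column gets one part of the same address.
print(render(["${street_name}", "${zip_code}"], address))
# Single column: all parts combined into one string.
print(render(["${country}, ${city}, ${street_name}, ${house_number}, ${flat_number}, ${zip_code}"], address))
```

Because all templates are filled from the same field set, the street and zip code written to separate columns always belong to the same address.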
feature Cycle resolution strategy
There are two cycle resolution strategies:
- FAIL: (default) if this mode is selected, cycle_breaker_references should be provided in the configuration file. Otherwise, execution will fail if it detects a circular reference.
- DELETE_NOT_REQUIRED: if this mode is selected, cyclic references will be resolved automatically by removing the last nullable reference leading to the cycle.
Example for FAIL mode:
default_config:
  mode: "GENERATION"
  target_ratio: 1.0
tables:
  - table_name_with_schema: "employees"
    cycle_breaker_references: ["employees"]
cycle_resolution_strategy: "FAIL"
Where the employees table contains a cyclic reference.
Example for DELETE_NOT_REQUIRED mode:
default_config:
  mode: "GENERATION"
  target_ratio: 1.0
cycle_resolution_strategy: "DELETE_NOT_REQUIRED"