Changelog#

Version 1.7 #

17 Jun 2022

Version 1.7 of the Synthesized Testing Suite.


🧿 feature Custom Database Types Support#

To support custom database types:

  • Use an output database in which the schema and its child objects have already been created. See the DO_NOT_CREATE schema creation mode for more details.

  • Explicitly define a generator for the custom-type column in the configuration file.

For example, for the following custom ENUM type:

CREATE TYPE public.transaction_type_t AS ENUM ('SENT', 'RECEIVED');

Use a configuration like this:

column_params:
- columns:
  - "transaction_type"
  params:
    type: "categorical_generator"
    categories:
      type: string
      values:
      - "SENT"
      - "RECEIVED"
    probabilities:
    - 0.6
    - 0.4

For more information, see Custom database types.
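
As an illustration of what the configuration above expresses, the following minimal Python sketch draws values from the two categories using the configured weights. It is not the product's implementation; the function name `sample_transaction_types` and its `seed` argument are hypothetical and exist only for this example.

```python
import random

# Categories and probabilities copied from the configuration above.
CATEGORIES = ["SENT", "RECEIVED"]
PROBABILITIES = [0.6, 0.4]


def sample_transaction_types(n, seed=0):
    """Draw n values for the custom ENUM column according to the weights.

    Hypothetical helper: it only illustrates weighted categorical sampling.
    """
    if len(CATEGORIES) != len(PROBABILITIES):
        raise ValueError("each category needs exactly one probability")
    rng = random.Random(seed)
    # random.choices performs weighted sampling with replacement.
    return rng.choices(CATEGORIES, weights=PROBABILITIES, k=n)
```

With a large sample, roughly 60% of the returned values are expected to be "SENT" and the rest "RECEIVED", and every value is a valid member of the ENUM type.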

🧿 feature Constant Generator#

Generate a single numeric value for the entire column.

Parameters:

  • value: Number?: numeric value to generate

Compatible modes: GENERATION MASKING

Compatible column data types: NUMERIC

Supports multiple columns: No

Example:

column_params:
- columns: [ "balance" ]
  params:
    type: "constant"
    value: 0.0

For more information, see Transformations List.

🧿 feature UUID Support for MASKING#

UUID data type support for MASKING mode.

🧿 feature BIGINT and SMALLINT Support#

BIGINT and SMALLINT data type support for GENERATION, MASKING, and KEEP modes.

🧿 feature Global Seed Parameter#

Use global_seed to set the seed for random number generators.

The value is a 32-bit integer between -2147483648 and 2147483647 that is used as a seed for random number generators. Generation produces the same result each time it is run with the same seed and workflow configuration. By default, global_seed is 0.

Example:

default_config:
  mode: "MASKING"
  target_ratio: 1.0
global_seed: 42

For more information, see Configuration File.
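
The reproducibility guarantee can be illustrated with a small Python sketch. It assumes nothing about the product's internals: `validate_global_seed` and `generate_column` are hypothetical helpers that only show the 32-bit range check and the idea that a fixed seed yields a fixed sequence.

```python
import random

# Range of a signed 32-bit integer, as stated for global_seed.
INT32_MIN, INT32_MAX = -2147483648, 2147483647


def validate_global_seed(seed):
    """Return the seed unchanged if it is a valid 32-bit integer."""
    if isinstance(seed, bool) or not isinstance(seed, int):
        raise TypeError("global_seed must be an integer")
    if not INT32_MIN <= seed <= INT32_MAX:
        raise ValueError("global_seed must fit in a signed 32-bit integer")
    return seed


def generate_column(seed=0, n=5):
    """Hypothetical generator: the same seed always gives the same values."""
    rng = random.Random(validate_global_seed(seed))
    return [rng.randint(0, 999) for _ in range(n)]
```

Calling `generate_column(42)` twice returns identical lists, which mirrors the behaviour described for runs that share a seed and workflow configuration.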

Version 1.6 #

10 Jun 2022

Version 1.6 of the Synthesized Testing Suite.


🧿 enhancement Performance Improvements#

This release includes significant rework of transformation execution internals, bringing the following benefits to end users:

  • Heavy parallelization of transformations and database operations. To the extent that the transformation logic permits, operations are performed in parallel. This results in better hardware utilization and reduced latency.

  • Memory consumption optimization. The solution can now handle tables whose size noticeably exceeds the main memory available to the process itself.

Version 1.5 #

8 Jun 2022

Version 1.5 of the Synthesized Testing Suite.


🧿 feature H2 Support#

An H2 database can be used as the input and output database.

Note

Add the following arguments to H2 JDBC URLs: ;DATABASE_TO_LOWER=TRUE;CASE_INSENSITIVE_IDENTIFIERS=TRUE
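
A minimal Python sketch of appending these arguments to an existing URL is shown below. The helper name `with_required_h2_settings` and the example URL `jdbc:h2:mem:testdb` are assumptions made for illustration only.

```python
# The two settings named in the note above.
H2_REQUIRED_SETTINGS = (
    "DATABASE_TO_LOWER=TRUE",
    "CASE_INSENSITIVE_IDENTIFIERS=TRUE",
)


def with_required_h2_settings(jdbc_url):
    """Append the required settings to an H2 JDBC URL if they are missing."""
    if not jdbc_url.startswith("jdbc:h2:"):
        raise ValueError("not an H2 JDBC URL")
    jdbc_url = jdbc_url.rstrip(";")
    # Settings follow the database path, separated by semicolons.
    present = {part.upper() for part in jdbc_url.split(";")[1:]}
    for setting in H2_REQUIRED_SETTINGS:
        if setting not in present:
            jdbc_url += ";" + setting
    return jdbc_url
```

For example, `with_required_h2_settings("jdbc:h2:mem:testdb")` returns `jdbc:h2:mem:testdb;DATABASE_TO_LOWER=TRUE;CASE_INSENSITIVE_IDENTIFIERS=TRUE`, and applying the helper a second time leaves the URL unchanged.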

🧿 feature SQLite Support#

An SQLite database can be used as the input and output database.

Version 1.4 #

7 Jun 2022

Version 1.4 of the Synthesized Testing Suite.


🧿 feature License Expiration API endpoint#

The license expiration can be requested via API:

curl -X 'GET' \
  "http://${API_SERVICE_URL}:${API_SERVICE_PORT}/api/v1/license-expiration" \
  -H 'accept: */*'

Where:

  • API_SERVICE_URL is the host name of the service. If running locally, this will likely be localhost.

  • API_SERVICE_PORT is the port exposed for the service. The default port is 8081.

If the service is up and running correctly, you should receive a 200 status with the body containing information like:

{"expiry_date":"2023-06-01"}

For more information, see License Expiration.
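
The same request can be sketched in Python using only the standard library. The function names below are hypothetical, and the sketch assumes that the body always has the shape shown above, with `expiry_date` given as an ISO date.

```python
import json
import urllib.request
from datetime import date


def license_expiration_url(host="localhost", port=8081):
    """Build the endpoint URL from the service host and port."""
    return f"http://{host}:{port}/api/v1/license-expiration"


def parse_expiry(body):
    """Extract the expiry date from a response body like the one above."""
    return date.fromisoformat(json.loads(body)["expiry_date"])


def days_remaining(expiry, today):
    """Number of days between today and the expiry date."""
    return (expiry - today).days


def fetch_expiry(host="localhost", port=8081, timeout=10):
    """Call the endpoint and return the expiry date (requires a running service)."""
    request = urllib.request.Request(
        license_expiration_url(host, port), headers={"accept": "*/*"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_expiry(response.read().decode("utf-8"))
```

Keeping the parsing separate from the network call makes it possible to check the date handling without a running service, for example by passing the sample body directly to `parse_expiry`.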

🧿 feature UUID Data Type Support#

UUID data type support for GENERATION and KEEP modes.

🧿 feature Boolean Data Type Support#

BOOLEAN data type support for GENERATION, MASKING, and KEEP modes.

🧿 enhancement Configuration File Upload#

YAML configuration can be uploaded as a file via API.

For more information, see Create a Workflow.

Version 1.3 #

20 May 2022

Version 1.3 of the Synthesized Testing Suite.


🧿 feature Google Secret Manager Integration#

The database credentials can be provided from Google Secret Manager:

"password": {
  "type": "gcp",
  "project": "${GCP_PROJECT_ID}",
  "secret": "${SECRET_ID}",
  "version": "${VERSION_ID}"
}

For more information, see Database Credentials.

🧿 feature Append Data#

A new table_truncation_mode:

  • IGNORE: if this mode is selected, the state of the output database is ignored.

This mode keeps the existing data in the output database and appends the newly generated data to it.

For more information, see Configuration File.
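
The difference between IGNORE and the DO_NOT_TRUNCATE and TRUNCATE modes described under Version 1.2 can be sketched in Python with plain lists standing in for a table. This is a simplified model, not the product's logic: the function `write_rows` is hypothetical, and raising an error for a non-empty table in DO_NOT_TRUNCATE mode is an assumption based on the statement that an empty output database is required.

```python
def write_rows(existing_rows, new_rows, table_truncation_mode="DO_NOT_TRUNCATE"):
    """Return the table contents after a run, given the truncation mode."""
    if table_truncation_mode == "DO_NOT_TRUNCATE":
        # Assumption: a non-empty output table is treated as an error.
        if existing_rows:
            raise ValueError("DO_NOT_TRUNCATE requires an empty output database")
        return list(new_rows)
    if table_truncation_mode == "TRUNCATE":
        # Existing rows are discarded before the new rows are written.
        return list(new_rows)
    if table_truncation_mode == "IGNORE":
        # Existing rows are kept and the new rows are appended.
        return list(existing_rows) + list(new_rows)
    raise ValueError(f"unknown table_truncation_mode: {table_truncation_mode}")
```

In this model, `write_rows([1, 2], [3], "IGNORE")` keeps the two existing rows and adds the new one, while the same call with "TRUNCATE" leaves only the new row.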

🧿 feature Locale For Address and Person Generators#

  • locale: String = 'en-GB': Change this parameter to generate names and addresses from different geographical areas. Defaults to 'en-GB', which corresponds to British names.

Supported locales:

  • bg

  • ca

  • ca-CAT

  • da-DK

  • de

  • de-AT

  • de-CH

  • en

  • en-AU

  • en-au-ocker

  • en-BORK

  • en-CA

  • en-GB

  • en-IND

  • en-MS

  • en-NEP

  • en-NG

  • en-NZ

  • en-PAK

  • en-SG

  • en-UG

  • en-US

  • en-ZA

  • es

  • es-MX

  • fa

  • fi-FI

  • fr

  • he

  • hu

  • in-ID

  • it

  • ja

  • ko

  • nb-NO

  • nl

  • pl

  • pt

  • pt-BR

  • ru

  • sk

  • sv

  • sv-SE

  • tr

  • uk

  • vi

  • zh-CN

  • zh-TW

For more information, see Transformations List.

🧿 enhancement Null Generator by Default#

For currently unsupported types, such as the XML data type, null_generator is used by default.

🧿 enhancement Stop Workflow API Endpoint#

Added ways to stop the workflow using workflow_id and workflow_run_id. Improved error handling.

For more information, see Stop Workflow.

🧿 enhancement Ability to Process a Subset of Tables#

Removed the comparison between the input and output schemas. This makes it possible to process only a subset of the input tables.

🧿 bugfix Consistent Formatted Strings#

formatted_string_generator in MASKING mode generates consistent values across the schema.

🧿 bugfix Positive Output Based on Positive Input#

If the input numeric column contains only positive values, then the generated values will also be positive by default.

Version 1.2 #

29 Apr 2022

Version 1.2 of the Synthesized Testing Suite.


🧿 feature Table Truncation Mode#

There are two table truncation modes:

  • DO_NOT_TRUNCATE: (default) if this mode is selected, tables in the output database won’t be truncated. An empty output database is required.

  • TRUNCATE: if this mode is selected, tables in the output database will be truncated.

Usage example for table_truncation_mode:

default_config:
    mode: "GENERATION"
    target_ratio: 1.0
table_truncation_mode: "TRUNCATE"

🧿 feature Support CHAR Primary Keys#

MASKING mode can be used for tables with CHAR primary keys without any additional configuration. In previous versions, the passthrough transformation was used as a workaround.

🧿 feature Support Composite Keys#

Composite primary and foreign keys are handled automatically without any additional configuration. In previous versions, foreign_key_generator was used as a workaround.

🧿 enhancement Advanced Subsetting#

Advanced subsetting implementation for KEEP and MASKING modes. In previous versions, some tables were left empty after subsetting.

🧿 enhancement CLI Parameters#

Changed CLI parameters from camelCase to kebab-case:

Usage: engine-lite [-hV] [-c=<config-file>] [-ip=<input-password>]
                   -iu=<input-url> [-iU=<input-username>]
                   [-op=<output-password>] -ou=<output-url>
                   [-oU=<output-username>]
Testing suite engine lite.
  -c, --config-file=<config-file>
                  Configuration file
  -h, --help      Show this help message and exit.
      -ip, --input-password=<input-password>
                  Input password, default to null
      -iu, --input-url=<input-url>
                  JDBC URL to the INPUT database
      -iU, --input-username=<input-username>
                  Input username, default to null
      -op, --output-password=<output-password>
                  Output password, default to null
      -ou, --output-url=<output-url>
                  JDBC URL to the OUTPUT database
      -oU, --output-username=<output-username>
                  Output username, default to null
  -V, --version   Print version information and exit.
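
As a rough illustration of the new kebab-case options, the following Python sketch assembles an engine-lite command line using the long option names from the usage text above. The helper name `build_engine_lite_command` and the example values in the usage note are hypothetical. Passwords are deliberately left out of the sketch.

```python
def build_engine_lite_command(input_url, output_url, config_file=None,
                              input_username=None, output_username=None):
    """Return the argument list for an engine-lite run.

    Only long options listed in the usage text are used; the two JDBC URLs
    are required, everything else is optional.
    """
    command = [
        "engine-lite",
        f"--input-url={input_url}",
        f"--output-url={output_url}",
    ]
    optional = {
        "--config-file": config_file,
        "--input-username": input_username,
        "--output-username": output_username,
    }
    for flag, value in optional.items():
        if value is not None:
            command.append(f"{flag}={value}")
    return command
```

The returned list can be passed to `subprocess.run` as is. For instance, `build_engine_lite_command("jdbc:h2:mem:src", "jdbc:h2:mem:dst", config_file="config.yaml")` produces the two URL options followed by `--config-file=config.yaml`.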

🧿 bugfix Consistent Fake Generators#

person_generator and address_generator in MASKING mode will generate consistent values across the schema.

For example, every mention of James Bond with a UK address anywhere in the schema will be masked as Jon Snow with a Seven Kingdoms address.
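
One common way to obtain this kind of consistency is to derive the replacement deterministically from the original value. The Python sketch below illustrates that idea with a hash. It is an assumption made for illustration, not a description of how person_generator or address_generator work internally, and the list `FAKE_NAMES` and the function `mask_name` are hypothetical.

```python
import hashlib

# Hypothetical pool of replacement names.
FAKE_NAMES = ["Jon Snow", "Arya Stark", "Ned Stark", "Sansa Stark"]


def mask_name(original, seed=0):
    """Map an original name to a fake one; equal inputs give equal outputs."""
    digest = hashlib.sha256(f"{seed}:{original}".encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an index into the pool.
    index = int.from_bytes(digest[:8], "big") % len(FAKE_NAMES)
    return FAKE_NAMES[index]


def mask_column(values, seed=0):
    """Mask every value of a column with the same deterministic mapping."""
    return [mask_name(value, seed) for value in values]
```

Because the mapping depends only on the input value and the seed, the same person receives the same masked name in every table. With such a small pool, different people can collide on the same fake name, which a real generator would need to handle.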

Version 1.1 #

15 Apr 2022

Version 1.1 of the Synthesized Testing Suite.


🧿 feature Schema creation mode#

There are four schema creation modes:

  • CREATE_IF_NOT_EXISTS: (default) if this mode is selected, the DDL schema will be copied from the source database to the target one if it does not exist there. Otherwise, the existing schema will be used.

  • DO_NOT_CREATE: if this mode is selected, existing schema will be used.

  • CREATE: if this mode is selected, DDL schema will be copied from the source database to the target one. The target database should be empty.

  • DROP_AND_CREATE: if this mode is selected, DDL schema will be copied from the source database to the target one. Existing schema in the target database will be dropped. Please use this mode carefully.

Note: If the CREATE_IF_NOT_EXISTS or DO_NOT_CREATE mode is used, the target schema should match the source schema.
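
The four modes can be summarised as a small decision function. The Python sketch below is only a model of the descriptions above: the action names `copy_ddl` and `drop_schema` are hypothetical labels, and raising an error when CREATE meets an existing schema is an assumption based on the statement that the target database should be empty.

```python
from enum import Enum


class SchemaCreationMode(Enum):
    CREATE_IF_NOT_EXISTS = "CREATE_IF_NOT_EXISTS"
    DO_NOT_CREATE = "DO_NOT_CREATE"
    CREATE = "CREATE"
    DROP_AND_CREATE = "DROP_AND_CREATE"


def plan_schema_actions(mode, target_schema_exists):
    """Return the ordered list of schema actions for the given mode."""
    if mode is SchemaCreationMode.CREATE_IF_NOT_EXISTS:
        return [] if target_schema_exists else ["copy_ddl"]
    if mode is SchemaCreationMode.DO_NOT_CREATE:
        # The existing schema is used as is.
        return []
    if mode is SchemaCreationMode.CREATE:
        if target_schema_exists:
            # Assumption: a non-empty target is rejected.
            raise ValueError("CREATE expects an empty target database")
        return ["copy_ddl"]
    if mode is SchemaCreationMode.DROP_AND_CREATE:
        drop = ["drop_schema"] if target_schema_exists else []
        return drop + ["copy_ddl"]
    raise ValueError(f"unknown schema creation mode: {mode}")
```

Only DROP_AND_CREATE ever produces a destructive action in this model, which is why that mode should be used carefully.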

🧿 feature Address generator#

Generate address fields (e.g. street, zip code) and keep them consistent across columns.

Parameters:

  • column_templates: List<String>: For each column, the template to be used to generate address data.

  • consistent_with_column: String?: If given, the column that the generated addresses must be consistent with. For example, if consistent_with_column="user_id", all people with the same user_id will have the same street.

Available templates are:

  • ${zip_code}

  • ${country}

  • ${city}

  • ${street_name}

  • ${house_number}

  • ${flat_number}

Compatible modes: GENERATION MASKING KEEP

Compatible column data types: STRING

Supports multiple columns: Yes

Example for multiple columns:

column_params:
  - columns: ["street_name", "zip_code"]
    params:
      type: "address_generator"
      column_templates: ["${street_name}", "${zip_code}"]

Example for a single column:

column_params:
  - columns: ["address"]
    params:
      type: "address_generator"
      column_templates: ["${country}, ${city}, ${street_name}, ${house_number}, ${flat_number}, ${zip_code}"]
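
The way one generated address fills several column templates can be sketched in Python. The `${name}` placeholder syntax happens to match Python's `string.Template`, which is used here purely for illustration and says nothing about how address_generator is implemented. The sample address values and the function `render_columns` are hypothetical.

```python
from string import Template

# Hypothetical generated address; one value per available template field.
FAKE_ADDRESS = {
    "zip_code": "SW1A 1AA",
    "country": "United Kingdom",
    "city": "London",
    "street_name": "Baker Street",
    "house_number": "221",
    "flat_number": "2",
}


def render_columns(column_templates, address):
    """Render one value per column from a single generated address."""
    return [Template(template).substitute(address) for template in column_templates]
```

Rendering `["${street_name}", "${zip_code}"]` yields one value per column taken from the same address, which is what keeps the columns consistent with each other. Rendering the single combined template yields one comma-separated string for the `address` column.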

🧿 feature Cycle resolution strategy#

There are two cycle resolution strategies:

  • FAIL: (default) if this mode is selected, cycle_breaker_references should be provided in the configuration file. Otherwise, execution will fail if it detects a circular reference.

  • DELETE_NOT_REQUIRED: if this mode is selected, cyclic references will be resolved automatically by removing the last nullable reference leading to the cycle.

Example for FAIL mode:

default_config:
    mode: "GENERATION"
    target_ratio: 1.0
user_table_configs:
  - table_name_with_schema: "employees"
    cycle_breaker_references: ["employees"]
cycle_resolution_strategy: "FAIL"

Here, the employees table contains a cyclic reference.

Example for DELETE_NOT_REQUIRED mode:

default_config:
    mode: "GENERATION"
    target_ratio: 1.0
cycle_resolution_strategy: "DELETE_NOT_REQUIRED"
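
The two strategies can be modelled on a graph of foreign keys. The Python sketch below is a simplified interpretation, not the product's algorithm: it finds a cycle with a depth-first search and, for DELETE_NOT_REQUIRED, removes the last nullable reference encountered along that cycle. The `ForeignKey` tuple and both function names are hypothetical, and cycle_breaker_references are not modelled.

```python
from collections import namedtuple

# table -> references; nullable tells whether the reference may be dropped.
ForeignKey = namedtuple("ForeignKey", "table references nullable")


def find_cycle(foreign_keys):
    """Return the foreign keys forming one cycle, in traversal order, or None."""
    outgoing = {}
    for fk in foreign_keys:
        outgoing.setdefault(fk.table, []).append(fk)
    visited = set()

    def visit(table, path, on_path):
        # path holds the keys followed so far; on_path the tables along them.
        on_path.append(table)
        for fk in outgoing.get(table, []):
            if fk.references in on_path:
                start = on_path.index(fk.references)
                return path[start:] + [fk]
            if fk.references not in visited:
                found = visit(fk.references, path + [fk], on_path)
                if found:
                    return found
        on_path.pop()
        visited.add(table)
        return None

    for table in list(outgoing):
        if table not in visited:
            cycle = visit(table, [], [])
            if cycle:
                return cycle
    return None


def resolve_cycles(foreign_keys, strategy="FAIL"):
    """Return the foreign keys that remain after applying the strategy."""
    remaining = list(foreign_keys)
    while True:
        cycle = find_cycle(remaining)
        if cycle is None:
            return remaining
        if strategy == "FAIL":
            raise ValueError("circular reference detected")
        nullable = [fk for fk in cycle if fk.nullable]
        if not nullable:
            raise ValueError("cycle has no nullable reference to remove")
        # Drop the last nullable reference on the cycle, then search again.
        remaining.remove(nullable[-1])
```

For a self-referencing employees table with a nullable reference, `resolve_cycles` with "DELETE_NOT_REQUIRED" removes that reference, while "FAIL" raises an error.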

Version 1.0 #

1 Apr 2022

Version 1.0 of the Synthesized Testing Suite.


First release#

We have been working hard to combine our products into a single product with enhanced architecture that will enable us to add exciting new features and optimizations!