Multiple Databases

To use TDK with multiple databases, provide two files: an inventory file that describes the database connections, and a configuration file that describes the transformations.

Let’s start with the inventory file. Here we describe the connection details for two input and two output databases:

#inventory.yaml
data_sources:
  sourceFirstDb:
    jdbc_url: jdbc:mysql://localhost:3307/source_db_1
    user:
      type: raw
      value: db-user
    password:
      type: raw
      value: db-password
  targetFirstDb:
    jdbc_url: jdbc:mysql://localhost:3308/target_db_1
    user:
      type: raw
      value: db-user
    password:
      type: raw
      value: db-password
  sourceSecondDb:
    jdbc_url: jdbc:mysql://localhost:3307/source_db_2
    user:
      type: raw
      value: db-user
    password:
      type: raw
      value: db-password
  targetSecondDb:
    jdbc_url: jdbc:mysql://localhost:3308/target_db_2
    user:
      type: raw
      value: db-user
    password:
      type: raw
      value: db-password
More information about the structure of inventory.yaml can be found in the inventory file reference.

Next, let’s describe the transformation for each workflow under transformation_configs, and use data_source_mapping to bind each workflow to the input and output databases defined in inventory.yaml.

For example:

#config.yaml
transformation_configs:
  testFirst:
    default_config:
      mode: "KEEP"
      target_ratio: 0.5
    tables:
      - table_name_with_schema: public.address
        target_ratio: 1.0
  testSecond:
    default_config:
      mode: "KEEP"
      target_ratio: 1.0
synchronised_transformations:
  - []
data_source_mapping:
  testFirst:
    source: sourceFirstDb
    targets:
      - targetFirstDb
  testSecond:
    source: sourceSecondDb
    targets:
      - targetSecondDb
More information about the structure of config.yaml can be found in the configuration reference.
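
Because data_source_mapping refers to data sources by name, a typo in either file is easy to miss. The following is a minimal sketch of a pre-flight check, not part of TDK: the function name check_mapping is hypothetical, and the rule that every workflow in transformation_configs has exactly one mapping entry is an assumption drawn from the example above. It works on already-parsed dictionaries, so you can load the two files with any YAML parser first.

```python
from typing import Any


def check_mapping(inventory: dict[str, Any], config: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the two files agree.

    Hypothetical helper, not a TDK API.
    """
    sources = set(inventory.get("data_sources", {}))
    transformations = set(config.get("transformation_configs", {}))
    mapping = config.get("data_source_mapping", {})
    problems = []

    # Assumed rule: each workflow has a mapping entry, and vice versa.
    for name in sorted(transformations - set(mapping)):
        problems.append(f"{name}: no data_source_mapping entry")
    for name in sorted(set(mapping) - transformations):
        problems.append(f"{name}: mapped but not in transformation_configs")

    # Every referenced data source must be defined in the inventory.
    for name, entry in mapping.items():
        for ref in [entry.get("source"), *entry.get("targets", [])]:
            if ref not in sources:
                problems.append(f"{name}: unknown data source {ref!r}")
    return problems


# Already-parsed equivalents of the two files above (connection details omitted).
inventory = {
    "data_sources": {
        "sourceFirstDb": {}, "targetFirstDb": {},
        "sourceSecondDb": {}, "targetSecondDb": {},
    }
}
config = {
    "transformation_configs": {"testFirst": {}, "testSecond": {}},
    "data_source_mapping": {
        "testFirst": {"source": "sourceFirstDb", "targets": ["targetFirstDb"]},
        "testSecond": {"source": "sourceSecondDb", "targets": ["targetSecondDb"]},
    },
}

print(check_mapping(inventory, config))  # prints []
```

An empty list means every source and target named in data_source_mapping is defined under data_sources.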

Now you can start the transformation by passing both files to the command-line interface with the following parameters:

--inventory-file=inventory.yaml --config-file=config.yaml