Deployment Modes

Understanding the platform’s deployment modes: server mode (backend plus workers), CLI mode, and hybrid mode.

Overview

The platform can run in one of three modes. Which one fits depends on whether you need a web UI, workflow history, and horizontal scaling, or a single self-contained process.

Deployment Modes

1. Server Mode (Backend + Workers)

Architecture:

Backend Server ← → Workers
     ↓
Metadata DB

How it works:

  • Backend server receives workflow requests (UI or API)

  • Backend coordinates execution across workers

  • Workers perform actual data transformation

  • Progress reported back to backend

  • Results stored in metadata database

Use cases:

  • Multi-user environments

  • Web UI needed

  • Scheduled workflows

  • Workflow history required

Deployment: Docker Compose or Kubernetes

2. CLI Mode (Standalone)

Architecture:

TDK CLI → Direct DB-to-DB transformation

How it works:

  • Single process executes entire workflow

  • Reads configuration from YAML files

  • No backend server needed

  • No metadata database

  • Outputs logs to stdout

Use cases:

  • CI/CD pipelines

  • Automated scripts

  • Simple transformations

  • No UI needed

Deployment: Standalone binary or Docker container

3. Hybrid Mode

Architecture:

Backend Server ← → Workers + CLI (parallel)

How it works:

  • Backend for interactive workflows

  • CLI for automated workflows

  • Both access same databases

  • Workflow history is recorded only for runs that go through the backend; CLI runs leave logs only

Use cases:

  • Interactive + automated workflows

  • Team collaboration + CI/CD

  • Manual testing + automated deployment

Worker Mode Details

Single Worker

Simplest worker setup:

Docker Compose:

services:
  backend:
    ...
  worker:
    image: synthesizedio/synthesized-worker
    environment:
      BACKEND_URL: http://backend:8080

Behavior:

  • One worker processes all tables sequentially

  • Good for small/medium workloads

  • Simple configuration

Multiple Workers

Scale horizontally:

Docker Compose:

services:
  backend:
    ...
  worker:
    image: synthesizedio/synthesized-worker
    deploy:
      replicas: 3  # 3 worker instances
    environment:
      BACKEND_URL: http://backend:8080

Behavior:

  • Backend distributes work across workers

  • Independent tables processed in parallel

  • Faster overall execution

Example:

Worker 1: customer table
Worker 2: products table
Worker 3: categories table
(all running simultaneously)
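
Which tables can run at the same time depends on which ones are independent of each other. The following is a minimal Python sketch, not the backend's actual scheduler: it assumes a table may start once the tables it depends on have finished, and it groups tables into waves that could be processed in parallel. The dependency map and the `orders` table are hypothetical and exist only to show a second wave.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: table -> tables that must finish first.
# customer, products and categories come from the example above;
# orders is an assumed extra table added for illustration.
DEPENDENCIES = {
    "customer": set(),
    "products": set(),
    "categories": set(),
    "orders": {"customer", "products"},
}


def parallel_waves(dependencies):
    """Group tables into waves; tables in a wave have no unmet dependencies."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(parallel_waves(DEPENDENCIES), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

With three workers, every table in the first wave could be handed to a different worker, which matches the example above.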

Worker Communication

Workers communicate with backend via:

  • Registration: Worker announces availability

  • Heartbeat: Periodic health checks

  • Task assignment: Backend assigns tables to workers

  • Progress updates: Workers report processing status

  • Completion: Final results reported
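
The five interactions listed above can be pictured as a small lifecycle. The sketch below is a toy in-memory model in Python, assuming one method per interaction. The class, method names, and the fixed 50 percent progress value are hypothetical and say nothing about the platform's real wire protocol.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyBackend:
    """In-memory stand-in for the backend; all names here are hypothetical."""
    pending: deque                                  # tables waiting for a worker
    workers: set = field(default_factory=set)       # registered worker ids
    progress: dict = field(default_factory=dict)    # table -> percent complete
    completed: list = field(default_factory=list)   # finished tables, in order

    def register(self, worker_id):
        # Registration: the worker announces that it is available.
        self.workers.add(worker_id)

    def heartbeat(self, worker_id):
        # Heartbeat: report whether the backend still knows this worker.
        return worker_id in self.workers

    def assign(self, worker_id):
        # Task assignment: hand the next pending table to a registered worker.
        if worker_id in self.workers and self.pending:
            return self.pending.popleft()
        return None

    def report_progress(self, table, percent):
        # Progress update: record how far the worker has got.
        self.progress[table] = percent

    def complete(self, table):
        # Completion: mark the table as finished.
        self.progress[table] = 100
        self.completed.append(table)


def run_worker(backend, worker_id):
    """Drive one worker through the lifecycle until no work is left."""
    backend.register(worker_id)
    while backend.heartbeat(worker_id):
        table = backend.assign(worker_id)
        if table is None:
            break
        backend.report_progress(table, 50)
        backend.complete(table)
```

Calling `run_worker(ToyBackend(pending=deque(["customer", "products"])), "worker-1")` walks a single worker through registration, assignment, progress, and completion for both tables.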

CLI Mode Details

Running CLI

Direct execution:

tdk \
  --config-file workflow.yaml \
  --inventory-file inventory.yaml

Docker:

# Docker options such as -v must come before the image name
docker run \
  -v "$(pwd)/config:/config" \
  synthesizedio/synthesized-tdk-cli \
  --config-file /config/workflow.yaml \
  --inventory-file /config/inventory.yaml

Kubernetes Job:

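apiVersion: batch/v1
kind: Job
metadata:
  name: tdk-masking-job
spec:
  template:
    spec:
      restartPolicy: Never  # Jobs require Never or OnFailure
      containers:
      - name: tdk
        image: synthesizedio/synthesized-tdk-cli
        args:
          - --config-file
          - /config/workflow.yaml
          - --inventory-file
          - /config/inventory.yaml
        volumeMounts:
          - name: config
            mountPath: /config
      volumes:
        - name: config
          configMap:
            name: tdk-config  # hypothetical ConfigMap holding workflow.yaml and inventory.yaml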
CLI vs Worker Mode

Aspect         CLI Mode               Worker Mode
Configuration  YAML files             UI or API + YAML
Execution      Single process         Distributed
Scaling        Vertical only          Horizontal + Vertical
History        Logs only              Metadata DB
UI             None                   Web UI
Scheduling     External (cron, etc.)  Built-in

Choosing a Mode

Use Server + Worker Mode if:

  • Multiple users need access

  • Web UI required

  • Workflow history/auditing needed

  • Scheduled workflows

  • Need horizontal scaling

Use CLI Mode if:

  • Single-purpose automation

  • CI/CD integration

  • No UI needed

  • Simple workflows

  • Minimal infrastructure

Use Hybrid Mode if:

  • Team collaboration through the server, plus automated deployment through the CLI

  • Interactive + scripted workflows
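
The three checklists can be condensed into one rule of thumb: anything that needs the backend points to server mode, and adding CI/CD automation on top of that points to hybrid mode. The Python function below is only a mnemonic for that rule. The parameter names are hypothetical and it is not part of the platform.

```python
def choose_mode(needs_ui=False, multi_user=False, needs_history=False,
                needs_builtin_scheduling=False, needs_horizontal_scaling=False,
                needs_ci_automation=False):
    """Map the checklists above to a mode name; a mnemonic, not a platform API."""
    needs_server = any([
        needs_ui,
        multi_user,
        needs_history,
        needs_builtin_scheduling,
        needs_horizontal_scaling,
    ])
    if needs_server and needs_ci_automation:
        return "hybrid"
    if needs_server:
        return "server"
    # Nothing requires the backend, so a single CLI process is enough.
    return "cli"
```

For example, `choose_mode(needs_ci_automation=True)` returns `"cli"`, while `choose_mode(needs_ui=True, needs_ci_automation=True)` returns `"hybrid"`.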

Configuration Differences

Server Mode Config

Workflows stored in backend database, managed via UI/API.

CLI Mode Config

Workflows in YAML files:

workflow.yaml:

default_config:
  mode: MASKING

table_schema:
  - table_name_pattern: "public.*"
  ...

inventory.yaml:

data_sources:
  input:
    url: jdbc:postgresql://...
  output:
    url: jdbc:postgresql://...
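
In automated runs the connection URLs often come from the environment rather than a committed file. The Python sketch below writes an inventory file with only the keys shown above. The environment variable names `TDK_INPUT_URL` and `TDK_OUTPUT_URL` are hypothetical, values are written without quoting or escaping, and a real inventory file may need fields that this page does not show.

```python
import os

# Layout copied from the inventory.yaml example above.
INVENTORY_TEMPLATE = """\
data_sources:
  input:
    url: {input_url}
  output:
    url: {output_url}
"""


def render_inventory(input_url, output_url):
    """Fill the inventory layout with concrete JDBC URLs."""
    return INVENTORY_TEMPLATE.format(input_url=input_url, output_url=output_url)


def write_inventory_from_env(path, environ=os.environ):
    """Write the inventory file from two hypothetical environment variables."""
    content = render_inventory(
        environ["TDK_INPUT_URL"],   # hypothetical variable name
        environ["TDK_OUTPUT_URL"],  # hypothetical variable name
    )
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
    return content
```

The generated file can then be passed to the CLI through `--inventory-file`.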