Performance Troubleshooting

Diagnose and resolve performance issues in platform workflows.

Overview

Common performance issues:

  • Slow workflow execution

  • High memory usage

  • CPU bottlenecks

  • Database contention

Measuring Performance

Workflow Execution Time

Check logs for timing information:

INFO  Workflow completed in 5m 32s
  - Schema discovery: 12s
  - Data reading: 2m 10s
  - Transformation: 2m 45s
  - Data writing: 35s

Throughput

Calculate rows/second:

Total rows: 500,000
Execution time: 120 seconds
Throughput: 4,166 rows/second

Expected performance:

  • Good: > 5,000 rows/sec

  • Acceptable: 1,000-5,000 rows/sec

  • Slow: < 1,000 rows/sec
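The throughput arithmetic can be checked with a few lines of Python, using the figures from the example above:

```python
# Throughput = total rows / execution time (values from the example above).
total_rows = 500_000
execution_seconds = 120

throughput = total_rows // execution_seconds
print(f"Throughput: {throughput:,} rows/second")  # Throughput: 4,166 rows/second
```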

Common Bottlenecks

1. Database Performance

Symptoms:

  • Long read/write times

  • High database CPU

Solutions:

  1. Add indexes:

CREATE INDEX idx_customer_id ON orders(customer_id);
  2. Optimize queries:

    • Use selective WHERE clauses

    • Avoid full table scans

  3. Tune database:

    • Increase shared_buffers (PostgreSQL)

    • Adjust innodb_buffer_pool_size (MySQL)
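The effect of an index on the query plan can be demonstrated with Python's built-in sqlite3 module. This is an illustration only; sqlite3 stands in for the target database, but the same CREATE INDEX statement applies to PostgreSQL and MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")

# Without an index the planner falls back to a full table scan.
before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42"
).fetchall()
print(before[0][-1])  # e.g. "SCAN orders" (wording varies by SQLite version)

conn.execute("CREATE INDEX idx_customer_id ON orders(customer_id)")

# With the index the planner uses an index search instead.
after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42"
).fetchall()
print(after[0][-1])  # e.g. "SEARCH orders USING INDEX idx_customer_id ..."
```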

2. Memory Issues

Symptoms:

  • OutOfMemoryError

  • GC pauses

  • Slow processing

Solutions:

  1. Increase JVM heap:

export JAVA_OPTS="-Xmx8g -Xms4g"
  2. Reduce batch size:

default_config:
  batch_size: 5000  # Lower from 10000
  3. Process fewer tables concurrently:

default_config:
  parallel_tables: 2  # Lower from 4
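A rough estimate shows why these two settings interact. The average row size below is a hypothetical figure; measure your own data to get a real number:

```python
# Rough working-set estimate: rows held in memory at once is roughly
# batch_size x parallel_tables x average in-memory row size.
batch_size = 10_000
parallel_tables = 4
avg_row_bytes = 2_000  # assumed average row size -- measure your own data

working_set_mb = batch_size * parallel_tables * avg_row_bytes / 1024 / 1024
print(f"~{working_set_mb:.0f} MB of rows in flight")  # ~76 MB

# Halving both knobs cuts the in-flight data by 4x:
reduced_mb = 5_000 * 2 * avg_row_bytes / 1024 / 1024
print(f"~{reduced_mb:.0f} MB after tuning")  # ~19 MB
```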

3. Network Latency

Symptoms:

  • High time in data reading/writing

  • Low CPU usage

Solutions:

  1. Run the platform and databases in the same network/region

  2. Enable compression:

url: jdbc:postgresql://host:5432/db?ssl=true&compression=true
  3. Increase connection pool:

spring.datasource.hikari.maximum-pool-size=20
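A back-of-envelope calculation shows why latency matters more than bandwidth here: each batch costs at least one round trip. The latency figures below are illustrative assumptions:

```python
# Per-batch round trips dominate when latency is high.
total_rows = 500_000
batch_size = 10_000
cross_region_latency_s = 0.050  # assumed 50 ms round trip

round_trips = total_rows / batch_size
print(f"{round_trips:.0f} round trips, "
      f"{round_trips * cross_region_latency_s:.1f}s spent on latency alone")
# 50 round trips, 2.5s spent on latency alone

# Same-region latency (~1 ms) makes this negligible:
print(f"{round_trips * 0.001:.2f}s in-region")  # 0.05s in-region
```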

Optimization Techniques

Horizontal Scaling

Add more agents:

# docker-compose.yml
services:
  agent:
    replicas: 4  # Process 4 tables in parallel
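With table-level parallelism, the win from extra agents can be sketched as a reduction in the number of "rounds" of tables each agent processes. This assumes tables of similar size; the table count and per-table time below are hypothetical:

```python
import math

# Sketch: wall time ~ rounds of tables x time per table (similar-sized tables).
tables = 16
minutes_per_table = 5  # assumed per-table processing time

for replicas in (1, 2, 4):
    rounds = math.ceil(tables / replicas)
    print(f"{replicas} agent(s): ~{rounds * minutes_per_table} min")
# 1 agent(s): ~80 min
# 2 agent(s): ~40 min
# 4 agent(s): ~20 min
```

In practice the largest table sets a floor on wall time, so gains flatten once replicas exceed the number of large tables.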

Transformer Optimization

Use efficient transformers:

# ✓ Fast: Simple deterministic masking
- columns: ["email"]
  type: Email

# ✗ Slow: Complex scripting
- columns: ["email"]
  type: Scripting
  config:
    script: "/* complex logic */"
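The reason deterministic masking is fast is that it costs a single hash per value and needs no per-row interpretation. The sketch below is illustrative only, not the platform's Email transformer:

```python
import hashlib

# Illustrative deterministic masking: one hash per value, stable across runs.
# Not the platform's Email transformer -- a stand-in to show the idea.
def mask_email(email: str) -> str:
    digest = hashlib.sha256(email.encode()).hexdigest()[:10]
    return f"user_{digest}@example.com"

print(mask_email("alice@corp.com"))
# Same input always produces the same masked value:
print(mask_email("alice@corp.com") == mask_email("alice@corp.com"))  # True
```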

Batch Size Tuning

Find optimal batch size:

# Too small: More overhead
batch_size: 1000

# Too large: Memory issues
batch_size: 100000

# Optimal: Balance throughput and memory
batch_size: 10000
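A simple sweep over candidate batch sizes can locate the sweet spot empirically. The toy workload below is a stand-in; real tuning should run against a representative table:

```python
import time

# Toy sweep: process the same synthetic rows at several batch sizes
# and time each run. sum() stands in for transform + write.
rows = list(range(200_000))

def process(batch):
    return sum(batch)

for batch_size in (1_000, 10_000, 100_000):
    start = time.perf_counter()
    for i in range(0, len(rows), batch_size):
        process(rows[i:i + batch_size])
    elapsed = time.perf_counter() - start
    print(f"batch_size={batch_size}: {elapsed:.3f}s")
```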

Profiling

Enable detailed timing:

logging.level.io.synthesized.tdk.processor=DEBUG

Output:

DEBUG Table: customers (50000 rows)
  - Read: 12.3s (4065 rows/sec)
  - Transform: 8.7s (5747 rows/sec)
  - Write: 15.2s (3289 rows/sec)
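When profiling many tables, the per-stage lines can be parsed to find the bottleneck automatically. A minimal sketch, using the output above:

```python
import re

# Parse the per-stage timing lines to find the slowest stage.
log = """\
  - Read: 12.3s (4065 rows/sec)
  - Transform: 8.7s (5747 rows/sec)
  - Write: 15.2s (3289 rows/sec)
"""

stages = {m[1]: float(m[2]) for m in re.finditer(r"- (\w+): ([\d.]+)s", log)}
slowest = max(stages, key=stages.get)
print(f"Slowest stage: {slowest} ({stages[slowest]}s)")
# Slowest stage: Write (15.2s)
```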