Performance Troubleshooting

Diagnose and resolve performance issues in platform workflows.

Overview

Common performance issues:

  • Slow workflow execution

  • High memory usage

  • CPU bottlenecks

  • Database contention

Measuring Performance

Workflow Execution Time

Check logs for timing information:

INFO  Workflow completed in 5m 32s
  - Schema discovery: 12s
  - Data reading: 2m 10s
  - Transformation: 2m 45s
  - Data writing: 35s

Throughput

Calculate rows/second:

Total rows: 500,000
Execution time: 120 seconds
Throughput: 4,166 rows/second

Expected performance:

  • Good: > 5,000 rows/sec

  • Acceptable: 1,000-5,000 rows/sec

  • Slow: < 1,000 rows/sec
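The throughput arithmetic can be checked with a few lines of Python, using the figures from the example above:

```python
# Throughput = total rows / execution time (values from the example above).
total_rows = 500_000
execution_seconds = 120

throughput = total_rows // execution_seconds
print(f"Throughput: {throughput:,} rows/second")  # Throughput: 4,166 rows/second
```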

Common Bottlenecks

1. Database Performance

Symptoms:

  • Long read/write times

  • High database CPU

Solutions:

  1. Add indexes:

CREATE INDEX idx_customer_id ON orders(customer_id);
  2. Optimize queries:

    • Use selective WHERE clauses

    • Avoid full table scans

  3. Tune database:

    • Increase shared_buffers (PostgreSQL)

    • Adjust innodb_buffer_pool_size (MySQL)
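The effect of an index on the query plan can be demonstrated with Python's built-in sqlite3 module. This is an illustration only; sqlite3 stands in for the target database, but the same CREATE INDEX statement applies to PostgreSQL and MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")

# Without an index the planner falls back to a full table scan.
before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42"
).fetchall()
print(before[0][-1])  # e.g. "SCAN orders" (wording varies by SQLite version)

conn.execute("CREATE INDEX idx_customer_id ON orders(customer_id)")

# With the index the planner uses an index search instead.
after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42"
).fetchall()
print(after[0][-1])  # e.g. "SEARCH orders USING INDEX idx_customer_id ..."
```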

2. Memory Issues

Symptoms:

  • OutOfMemoryError

  • GC pauses

  • Slow processing

Solutions:

  1. Increase JVM heap:

export JAVA_OPTS="-Xmx8g -Xms4g"
  2. Reduce batch size:

default_config:
  batch_size: 5000  # Lower from 10000
  3. Process fewer tables concurrently:

default_config:
  parallel_tables: 2  # Lower from 4
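A rough estimate shows why these two settings interact. The average row size below is a hypothetical figure; measure your own data to get a real number:

```python
# Rough working-set estimate: rows held in memory at once is roughly
# batch_size x parallel_tables x average in-memory row size.
batch_size = 10_000
parallel_tables = 4
avg_row_bytes = 2_000  # assumed average row size -- measure your own data

working_set_mb = batch_size * parallel_tables * avg_row_bytes / 1024 / 1024
print(f"~{working_set_mb:.0f} MB of rows in flight")  # ~76 MB

# Halving both knobs cuts the in-flight data by 4x:
reduced_mb = 5_000 * 2 * avg_row_bytes / 1024 / 1024
print(f"~{reduced_mb:.0f} MB after tuning")  # ~19 MB
```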

3. Network Latency

Symptoms:

  • High time in data reading/writing

  • Low CPU usage

Solutions:

  1. Run the platform and databases in the same network/region

  2. Enable compression:

url: jdbc:postgresql://host:5432/db?ssl=true&compression=true
  3. Increase connection pool:

spring.datasource.hikari.maximum-pool-size=20
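A back-of-envelope calculation shows why latency matters more than bandwidth here: each batch costs at least one round trip. The latency figures below are illustrative assumptions:

```python
# Per-batch round trips dominate when latency is high.
total_rows = 500_000
batch_size = 10_000
cross_region_latency_s = 0.050  # assumed 50 ms round trip

round_trips = total_rows / batch_size
print(f"{round_trips:.0f} round trips, "
      f"{round_trips * cross_region_latency_s:.1f}s spent on latency alone")
# 50 round trips, 2.5s spent on latency alone

# Same-region latency (~1 ms) makes this negligible:
print(f"{round_trips * 0.001:.2f}s in-region")  # 0.05s in-region
```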

Optimization Techniques

Horizontal Scaling

Add more agents:

# docker-compose.yml
services:
  agent:
    replicas: 4  # Process 4 tables in parallel
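With table-level parallelism, the win from extra agents can be sketched as a reduction in the number of "rounds" of tables each agent processes. This assumes tables of similar size; the table count and per-table time below are hypothetical:

```python
import math

# Sketch: wall time ~ rounds of tables x time per table (similar-sized tables).
tables = 16
minutes_per_table = 5  # assumed per-table processing time

for replicas in (1, 2, 4):
    rounds = math.ceil(tables / replicas)
    print(f"{replicas} agent(s): ~{rounds * minutes_per_table} min")
# 1 agent(s): ~80 min
# 2 agent(s): ~40 min
# 4 agent(s): ~20 min
```

In practice the largest table sets a floor on wall time, so gains flatten once replicas exceed the number of large tables.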

Transformer Optimization

Use efficient transformers:

# ✓ Fast: Simple deterministic masking
- columns: ["email"]
  type: Email

# ✗ Slow: Complex scripting
- columns: ["email"]
  type: Scripting
  config:
    script: "/* complex logic */"
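The reason deterministic masking is fast is that it costs a single hash per value and needs no per-row interpretation. The sketch below is illustrative only, not the platform's Email transformer:

```python
import hashlib

# Illustrative deterministic masking: one hash per value, stable across runs.
# Not the platform's Email transformer -- a stand-in to show the idea.
def mask_email(email: str) -> str:
    digest = hashlib.sha256(email.encode()).hexdigest()[:10]
    return f"user_{digest}@example.com"

print(mask_email("alice@corp.com"))
# Same input always produces the same masked value:
print(mask_email("alice@corp.com") == mask_email("alice@corp.com"))  # True
```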

Batch Size Tuning

Find optimal batch size:

# Too small: More overhead
batch_size: 1000

# Too large: Memory issues
batch_size: 100000

# Optimal: Balance throughput and memory
batch_size: 10000
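A simple sweep over candidate batch sizes can locate the sweet spot empirically. The toy workload below is a stand-in; real tuning should run against a representative table:

```python
import time

# Toy sweep: process the same synthetic rows at several batch sizes
# and time each run. sum() stands in for transform + write.
rows = list(range(200_000))

def process(batch):
    return sum(batch)

for batch_size in (1_000, 10_000, 100_000):
    start = time.perf_counter()
    for i in range(0, len(rows), batch_size):
        process(rows[i:i + batch_size])
    elapsed = time.perf_counter() - start
    print(f"batch_size={batch_size}: {elapsed:.3f}s")
```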

Profiling

Enable detailed timing:

logging.level.io.synthesized.tdk.processor=DEBUG

Output:

DEBUG Table: customers (50000 rows)
  - Read: 12.3s (4065 rows/sec)
  - Transform: 8.7s (5747 rows/sec)
  - Write: 15.2s (3289 rows/sec)
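When profiling many tables, the per-stage lines can be parsed to find the bottleneck automatically. A minimal sketch, using the output above:

```python
import re

# Parse the per-stage timing lines to find the slowest stage.
log = """\
  - Read: 12.3s (4065 rows/sec)
  - Transform: 8.7s (5747 rows/sec)
  - Write: 15.2s (3289 rows/sec)
"""

stages = {m[1]: float(m[2]) for m in re.finditer(r"- (\w+): ([\d.]+)s", log)}
slowest = max(stages, key=stages.get)
print(f"Slowest stage: {slowest} ({stages[slowest]}s)")
# Slowest stage: Write (15.2s)
```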