Locale Reference

The platform supports 51 locales for generating realistic, culturally appropriate data with the person_generator and address_generator transformers.

Overview

Locales enable the platform to generate data that matches the cultural and linguistic conventions of specific regions. This includes:

  • Person data: Names, emails, usernames, phone numbers, SSN/tax IDs, titles, company names

  • Address data: Street addresses, cities, regions, postal codes, countries, coordinates, timezones

Default Locale: The platform uses en-GB (English - Great Britain) as the default locale when none is specified.
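
As a minimal sketch of this fallback, the configuration below sets no locale anywhere, so per the default described above the generated names would follow en-GB conventions. The table name and column names are illustrative assumptions that mirror the layout of the other examples in this reference.

```yaml
tables:
  customer:            # hypothetical table name
    mode: MASKING
    transformations:
      - columns: ["first_name", "last_name"]
        type: person_generator
        # No params.locale and no default_config.locale:
        # the platform default (en-GB) applies
```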

Supported Locales

The platform supports the following locales:

ar

bg

ca

ca-CAT

cs

da-DK

de

de-AT

de-CH

en

en-AU

en-CA

en-GB

en-IND

en-MS

en-NEP

en-NG

en-NZ

en-PAK

en-PH

en-SG

en-UG

en-US

en-ZA

es

es-MX

fa

fi-FI

fr

he

hu

in-ID

it

ja

ko

nb-NO

nl

pl

pt

pt-BR

ru

sk

sv

sv-SE

tr

uk

vi

zh-CN

zh-TW

Transformers with Locale Support

person_generator

Generates realistic person-related data with locale-specific formatting:

  • First names and last names (culturally appropriate)

  • Email addresses

  • Usernames

  • Phone numbers (regional format)

  • SSN/National Insurance numbers

  • Titles (Mr., Mrs., Dr., etc.)

  • Company names

- columns: ["first_name", "last_name", "email"]
  type: person_generator
  params:
    locale: "de"  # German names and formats

address_generator

Generates realistic addresses with locale-specific formatting:

  • Street addresses

  • City names

  • Postal codes (regional format)

  • Regions/states

  • Country names

  • Geographic coordinates

  • Timezones

- columns: ["street_address", "city", "postal_code"]
  type: address_generator
  params:
    locale: "fr"  # French address formats

Special Handling: The Japanese locale (ja) has dedicated address generation logic to handle the unique Japanese address system correctly.
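
A minimal sketch of selecting the Japanese locale, assuming the same column names as the address_generator example above. No extra parameter is shown because the source describes the special handling as tied to the ja locale itself.

```yaml
- columns: ["street_address", "city", "postal_code"]
  type: address_generator
  params:
    locale: "ja"  # Uses the dedicated Japanese address logic
```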

Transformers WITHOUT Locale Support

The finance_generator transformer does not support the locale parameter. It generates financial data (credit cards, IBAN, BIC, etc.) using standard international formats.
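
For contrast, a finance_generator entry might look like the sketch below. It assumes the transformer uses the same columns/type layout as the other transformers, and the column names are hypothetical.

```yaml
- columns: ["credit_card", "iban"]  # hypothetical column names
  type: finance_generator
  # No params.locale here: finance_generator does not support it
```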

Setting Default Locale

Configure a default locale for all transformers in your workflow:

default_config:
  locale: "en-GB"  # Set globally

tables:
  customer:
    mode: MASKING
    transformations:
      - columns: ["first_name", "last_name"]
        type: person_generator
        # Uses en-GB from default_config

Overriding Locale Per Transformer

Individual transformers can override the default locale:

default_config:
  locale: "en-US"  # Default

tables:
  customer:
    mode: MASKING
    transformations:
      - columns: ["first_name"]
        type: person_generator
        params:
          locale: "de"  # Override: German names

      - columns: ["address"]
        type: address_generator
        params:
          locale: "fr"  # Override: French addresses

Locale Format Examples

Name Generation by Locale

Locale | Example First Names             | Example Last Names
en-US  | John, Mary, James, Sarah        | Smith, Johnson, Williams, Brown
en-GB  | Oliver, Emma, George, Charlotte | Smith, Jones, Williams, Taylor
de     | Hans, Anna, Friedrich, Maria    | Müller, Schmidt, Schneider, Fischer
fr     | Jean, Marie, Pierre, Sophie     | Martin, Bernard, Dubois, Thomas
ja     | 太郎 (Tarō), 花子 (Hanako) | 田中 (Tanaka), 鈴木 (Suzuki)
zh-CN  | 伟 (Wěi), 芳 (Fāng), 明 (Míng) | 王 (Wáng), 李 (Lǐ), 张 (Zhāng)

Address Generation by Locale

Locale | Example Address Format

en-US  | 123 Main Street, Apt 4B
       | Springfield, IL 62701
       | United States

en-GB  | 45 High Street
       | London SW1A 1AA
       | United Kingdom

de     | Hauptstraße 123
       | 10115 Berlin
       | Deutschland

fr     | 123 Rue de la République
       | 75001 Paris
       | France

ja     | 東京都千代田区丸の内 1-1-1
       | 〒100-0005
       | 日本

Phone Number Format by Locale

Locale | Example Phone Format
en-US  | +1 (555) 123-4567
en-GB  | +44 20 7123 4567
de     | +49 30 12345678
fr     | +33 1 42 86 82 00
ja     | +81 3-1234-5678

Locale Selection Best Practices

  1. Match Your Target Environment: Use the locale of the region that your test or development environment is meant to represent

  2. Consistent Data: Use the same locale across related fields (name, address, phone) for realistic data

  3. Test Data Diversity: Consider using multiple locales to test internationalization features

  4. Default Wisely: Set a sensible default locale and only override when needed
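
The practices above can be combined in one configuration. The sketch below is illustrative only: the table names are hypothetical, and the column names are borrowed from the earlier examples.

```yaml
default_config:
  locale: "en-GB"            # Practice 4: one sensible default

tables:
  customer:                  # hypothetical table name
    mode: MASKING
    transformations:
      # Practice 2: related fields share the same (default) locale
      - columns: ["first_name", "last_name", "email"]
        type: person_generator
      - columns: ["street_address", "city", "postal_code"]
        type: address_generator

  supplier:                  # hypothetical table name
    mode: MASKING
    transformations:
      # Practice 3: a second locale adds diversity for
      # internationalization testing, applied consistently
      # to both transformers in this table
      - columns: ["first_name", "last_name"]
        type: person_generator
        params:
          locale: "ja"
      - columns: ["street_address", "city", "postal_code"]
        type: address_generator
        params:
          locale: "ja"
```

Keeping overrides at the table level, as here, means names and addresses within a single record stay consistent while the dataset as a whole still covers more than one locale.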

Technical Details

The platform uses the DataFaker library (version 2.4.2) for locale-based data generation. Locales are specified using IETF BCP 47 language tags and are converted to Java Locale objects internally.