XML Generator

Generate realistic, schema-valid XML documents from an XSD definition, with optional customisation of individual field values.

Overview

The XML generator takes an XSD schema and produces XML documents that are structurally valid against it. By default it fills every field with synthetic values that respect the constraints in your schema (patterns, numeric ranges, enumerations, etc.). You can then layer on a YAML customisation file to override specific fields with the exact generator behaviour you need.

Built-in Type Handling

XSD restrictions are respected automatically — no customisation required for basic constraints:

Facet or attribute                 Effect
---------------------------------  ------------------------------------------------------------------------
xs:pattern                         Generated strings match the regular expression
xs:minLength / xs:maxLength        String lengths are bounded
xs:minInclusive / xs:maxInclusive  Numeric and date values stay within the declared range
xs:enumeration                     Only the listed values are generated
minOccurs / maxOccurs              Optional elements are sometimes omitted; repeated elements vary in count

For example, a YesNo element restricted to yes|no is handled without any customisation file — the generator infers that only those two values are valid.
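
To make that inference concrete, the sketch below reads the enumeration values out of a small XSD fragment with Python's standard library. The YesNo type definition is a hypothetical illustration, since the original schema is not reproduced here, and the code is not the generator's implementation; it only shows where the allowed values come from.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment: a YesNo type restricted to two values.
XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="YesNo">
    <xs:restriction base="xs:string">
      <xs:enumeration value="yes"/>
      <xs:enumeration value="no"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""

def enumeration_values(xsd_text, type_name):
    """Return the xs:enumeration values declared for a named simple type."""
    root = ET.fromstring(xsd_text)
    for simple_type in root.iter(XS + "simpleType"):
        if simple_type.get("name") == type_name:
            return [e.get("value") for e in simple_type.iter(XS + "enumeration")]
    return []

allowed = enumeration_values(XSD, "YesNo")
print(allowed)
```

A generator that only ever picks from this list will always produce a value that validates against the type.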

YAML Customisations

When you need control beyond what the schema constraints provide, supply a YAML customisation file. This lets you apply any TDK generator to specific elements or attributes, link related fields together, or suppress optional elements entirely.

Syntax

"<xpath-expression>":
  transformation:
    type: <generator-type>
    <generator-params>

Each key is an XPath expression that selects the value to replace. The expression must select a text node (end the path with /text()) or an attribute (use @name). Selecting an element node has no effect, because the generator replaces only leaf values, not structural nodes.
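
The rule can be checked before running the generator. The helper below is a minimal sketch, not part of the tool: selects_leaf_value is a hypothetical name, and it inspects only the last path step as a string, so it is not a full XPath parser (predicates containing "/" would confuse it).

```python
def selects_leaf_value(xpath):
    """Return True if the last XPath step is text() or an attribute (@name)."""
    last_step = xpath.strip().rsplit("/", 1)[-1]
    return last_step == "text()" or last_step.startswith("@")

for key in ["FirstName/text()", "@resident", "Customer/Address/PostCode"]:
    verdict = "ok" if selects_leaf_value(key) else "element node, would have no effect"
    print(key, "->", verdict)
```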

Targeting text content

"FirstName/text()":
  transformation:
    type: constant_string
    value: "Alice"

Targeting attributes

"@resident":
  transformation:
    type: formatted_string_generator
    pattern: "yes|no"

Nested paths

"Customer/Address/PostCode/text()":
  transformation:
    type: formatted_string_generator
    pattern: "[A-Z]{2}\\d{1,2} \\d[A-Z]{2}"

Generator Types

Constant values

"Status/text()":
  transformation:
    type: constant_string
    value: "ACTIVE"

"Count/text()":
  transformation:
    type: constant_numeric
    value: 42
    numeric_type: INT     # INT | LONG | FLOAT | DOUBLE

Pattern-based strings

"PhoneNumber/text()":
  transformation:
    type: formatted_string_generator
    pattern: "\\+44\\d{10}"
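
The patterns in this guide are written in double-quoted YAML strings, where \\ stands for a single literal backslash, so "\\d" reaches the generator as \d. The sketch below checks made-up sample values against the phone and postcode patterns with Python's re module. It assumes the generator's regular-expression dialect agrees with Python's for these constructs, which this guide does not confirm.

```python
import re

# Patterns as they look after YAML unescaping ("\\d" becomes \d).
PHONE = r"\+44\d{10}"
POSTCODE = r"[A-Z]{2}\d{1,2} \d[A-Z]{2}"

def matches(pattern, value):
    """True when the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None

print(matches(PHONE, "+441234567890"))
print(matches(POSTCODE, "SW1 2AB"))
print(matches(POSTCODE, "sw1 2ab"))
```

Testing a pattern against a few hand-written values like this is a quick way to catch escaping mistakes before generating a full document.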

Numeric distribution

"Age/text()":
  transformation:
    type: continuous_generator
    mean: 35
    std: 12
    min: 18.0
    max: 90.0
    numeric_type: INT
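
The mean, std, min, and max parameters suggest a bell-shaped distribution bounded on both sides. The sketch below is one plausible reading, not the documented behaviour: it assumes a normal distribution, discards draws outside the range, and rounds to an integer for numeric_type INT. The function name sample_age is hypothetical.

```python
import random

def sample_age(mean=35, std=12, minimum=18.0, maximum=90.0, rng=random):
    """Draw from a normal distribution, retrying until the value is in range."""
    while True:
        value = rng.gauss(mean, std)
        if minimum <= value <= maximum:
            return int(round(value))

rng = random.Random(0)  # fixed seed so runs are repeatable
ages = [sample_age(rng=rng) for _ in range(1000)]
print(min(ages), max(ages))
```

Under this reading, most values cluster around the mean while the bounds guarantee that no value falls outside 18 to 90.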

Integer sequence

"@id":
  transformation:
    type: int_sequence_generator
    start_from: 1000

Remove an element (void)

void_generator causes the selected element to be omitted from the output entirely:

"OptionalField/text()":
  transformation:
    type: void_generator

Null value

"@optionalAttr":
  transformation:
    type: null_generator

Foreign key — reference another element’s value

Use foreign_key_generator to copy a value from an ancestor element, keeping related fields consistent across the document:

"Customer/@CustomerID":
  transformation:
    type: formatted_string_generator
    pattern: "\\d{1,8}"

"Customer/Order/@CustomerOrderID":
  transformation:
    type: foreign_key_generator
    referred_table: "ancestor::Customer"   # XPath to the source element
    referred_fields:
      - "@CustomerID"                       # which attribute/text to copy
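
The effect of this pairing is that every Order carries the same identifier as the Customer that contains it. The sketch below checks that property on a hand-written sample; the Customers wrapper element and the identifier values are hypothetical, and the check is a verification aid rather than part of the generator.

```python
import xml.etree.ElementTree as ET

# Hypothetical generated output for the customisation above.
SAMPLE = """
<Customers>
  <Customer CustomerID="4711">
    <Order CustomerOrderID="4711"/>
    <Order CustomerOrderID="4711"/>
  </Customer>
  <Customer CustomerID="82">
    <Order CustomerOrderID="82"/>
  </Customer>
</Customers>
"""

def mismatched_orders(xml_text):
    """Return (CustomerID, CustomerOrderID) pairs where the two values differ."""
    root = ET.fromstring(xml_text)
    mismatches = []
    for customer in root.iter("Customer"):
        customer_id = customer.get("CustomerID")
        for order in customer.findall("Order"):
            if order.get("CustomerOrderID") != customer_id:
                mismatches.append((customer_id, order.get("CustomerOrderID")))
    return mismatches

print(mismatched_orders(SAMPLE))
```

An empty list means every order is consistent with its parent customer.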

Conditional generator

Produce different values depending on what another field contains:

"Customer/Type/@Code":
  transformation:
    type: constant_string
    value: "Fake"

"Customer/Order/@SalesOrderID":
  transformation:
    type: conditional_generator
    conditional_table: "ancestor::Customer/Type"   # XPath to the element to check
    conditional_column: "@Code"                    # attribute or text() to compare
    conditional_value: "Fake"                      # the value to match against
    reference_columns:
      - "@SalesOrderID"
    if_true:
      type: constant_numeric
      value: 1
      numeric_type: INT
    if_false:
      type: int_sequence_generator
      start_from: 1

if_true / if_false accept any generator type, including null_generator and void_generator.
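
The branching in the example above can be pictured as the following sketch. It is an interpretation rather than the generator's code: it assumes the sequence increases by 1 and advances only when the if_false branch is taken, neither of which this guide states. The function names are hypothetical.

```python
from itertools import count

def make_sales_order_id(start_from=1):
    """Build a function that mimics the if_true / if_false choice above."""
    sequence = count(start_from)  # assumed: the sequence increases by 1

    def sales_order_id(type_code):
        if type_code == "Fake":   # conditional_value matched: if_true branch
            return 1              # constant_numeric with value 1
        return next(sequence)     # otherwise: if_false sequence

    return sales_order_id

next_id = make_sales_order_id()
ids = [next_id(code) for code in ["Fake", "Real", "Real", "Fake", "Real"]]
print(ids)
```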

Full Example

Given a simple person.xsd with FirstName, LastName, and Age elements and a resident attribute, a customisation file might look like:

"FirstName/text()":
  transformation:
    type: formatted_string_generator
    pattern: "(Alice|Bob|Charlie)\\d{1,4}"

"LastName/text()":
  transformation:
    type: formatted_string_generator
    pattern: "(Smith|Jones|Taylor)\\d{1,4}"

"Age/text()":
  transformation:
    type: continuous_generator
    mean: 35
    std: 12
    min: 18.0
    max: 90.0
    numeric_type: INT

"@resident":
  transformation:
    type: formatted_string_generator
    pattern: "yes|no"

Troubleshooting

Problem                            Resolution
---------------------------------  -----------------------------------------------------------------------------------------------
Customisation has no effect        Check that your XPath selects a text node (/text()) or attribute (@name), not an element
Failed to parse the configuration  Validate the YAML syntax; every entry must have a transformation block with a type field
Ambiguous foreign keys             Provide explicit referred_table XPath expressions to disambiguate which ancestor element to use
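
The structural requirement behind the parse failure (every entry needs a transformation block with a type field) can be checked ahead of time. The sketch below works on an already-parsed mapping, for example the result of loading the file with a YAML library; validate_customisations is a hypothetical helper, not a function provided by the tool.

```python
def validate_customisations(config):
    """Return a list of problems found in a parsed customisation mapping."""
    if not isinstance(config, dict):
        return ["top level must be a mapping of XPath expressions to entries"]
    problems = []
    for xpath, entry in config.items():
        transformation = entry.get("transformation") if isinstance(entry, dict) else None
        if not isinstance(transformation, dict):
            problems.append(f"{xpath}: missing transformation block")
        elif "type" not in transformation:
            problems.append(f"{xpath}: transformation has no type field")
    return problems

# Parsed form of a customisation file with two deliberate mistakes.
config = {
    "FirstName/text()": {"transformation": {"type": "constant_string", "value": "Alice"}},
    "@resident": {"transformation": {"pattern": "yes|no"}},
    "Age/text()": {"mean": 35},
}
problems = validate_customisations(config)
for problem in problems:
    print(problem)
```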