Synthesized provides a variety of methods for assessing the quality and utility of generated synthetic data, both empirically and visually.

Univariate Metrics


Kolmogorov-Smirnov statistic between two continuous variables.
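For intuition, the two-sample Kolmogorov-Smirnov statistic is the largest absolute gap between the empirical CDFs of the two samples. The following is a minimal standard-library sketch of that definition, not the library's implementation; the function name `ks_statistic` and the sample values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs (hypothetical helper)."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    # The empirical CDFs only change at observed values, so it is enough
    # to evaluate the gap at every value seen in either sample.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


original = [1.0, 2.0, 3.0, 4.0]    # illustrative values only
synthetic = [3.0, 4.0, 5.0, 6.0]   # illustrative values only
print(ks_statistic(original, synthetic))
```

A value of 0 means the two empirical distributions coincide; a value of 1 means the samples do not overlap at all.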


Earth mover's distance (aka 1-Wasserstein distance) between two nominal variables.
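Nominal categories have no natural ordering, so a ground distance between categories has to be chosen. The sketch below assumes a unit distance between any two distinct categories, in which case the earth mover's distance reduces to half the sum of absolute differences in category frequencies. This choice of ground distance is an assumption and may differ from what the library uses; `emd_nominal` is a hypothetical name.

```python
from collections import Counter


def emd_nominal(sample_a, sample_b):
    """Earth mover's distance under a 0/1 ground distance (assumed, see lead-in)."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    freq_a, freq_b = Counter(sample_a), Counter(sample_b)
    categories = set(freq_a) | set(freq_b)
    # With unit cost between distinct categories, the minimal transport cost
    # is the total probability mass that has to change category.
    return 0.5 * sum(
        abs(freq_a[c] / len(sample_a) - freq_b[c] / len(sample_b))
        for c in categories
    )


print(emd_nominal(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```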

Multivariate Metrics


Cramér's V correlation coefficient between nominal variables.
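As an illustration of what this coefficient measures, Cramér's V can be computed from the chi-squared statistic of the contingency table of two nominal columns. The sketch below uses the plain, uncorrected formula; whether the library applies a bias correction is not stated here, and `cramers_v` is a hypothetical name.

```python
from collections import Counter
from math import sqrt


def cramers_v(x, y):
    """Uncorrected Cramér's V between two equally long nominal sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    joint = Counter(zip(x, y))
    rows, cols = Counter(x), Counter(y)
    k = min(len(rows), len(cols))
    if k < 2:
        return 0.0  # a constant column carries no association
    chi2 = 0.0
    for r, n_r in rows.items():
        for c, n_c in cols.items():
            expected = n_r * n_c / n  # count expected under independence
            chi2 += (joint[(r, c)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


print(cramers_v(["a", "a", "b", "b"], ["u", "u", "v", "v"]))
```

The result lies between 0 (no association) and 1 (one variable fully determines the other).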

KendallTauCorrelation([max_p_value, ...])

Kendall's Tau correlation coefficient between ordinal variables.
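To make the coefficient concrete, here is a minimal sketch of Kendall's tau-b, which counts concordant and discordant pairs and adjusts for ties. It assumes the ordinal levels are already encoded as numbers, and it does not compute a p-value, so it says nothing about how the `max_p_value` argument is applied. `kendall_tau_b` is a hypothetical name, and the choice of the tau-b variant is an assumption.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b via a direct O(n^2) pair count (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("inputs must have equal length")
    concordant = discordant = untied_x = untied_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            untied_x += dx != 0  # pair is not tied in x
            untied_y += dy != 0  # pair is not tied in y
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    denominator = sqrt(untied_x * untied_y)
    return (concordant - discordant) / denominator if denominator else 0.0


print(kendall_tau_b([1, 2, 3, 4], [1, 3, 2, 4]))
```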


McFadden's pseudo R-squared coefficient between categorical and continuous variables.
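For reference, McFadden's pseudo R-squared compares the maximized log-likelihood of a fitted model with that of an intercept-only (null) model. The usual setup, assumed here rather than confirmed by this page, is a logistic regression that predicts the categorical variable from the continuous one:

```latex
R^2_{\text{McFadden}} = 1 - \frac{\ln \hat{L}_{\text{model}}}{\ln \hat{L}_{\text{null}}}
```

Here $\hat{L}_{\text{model}}$ is the maximized likelihood of the fitted model and $\hat{L}_{\text{null}}$ that of the intercept-only model. Values near 0 indicate that the continuous variable adds little predictive information about the categorical one.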


Spearman's rank correlation coefficient between ordinal variables.
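Spearman's coefficient is the Pearson correlation of the ranks of the two variables, with tied values sharing their average rank. The following standard-library sketch illustrates that definition; the helper names are hypothetical and the code is not the library's implementation.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def spearman_rho(x, y):
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    return pearson(average_ranks(x), average_ranks(y))


print(spearman_rho([1, 2, 3, 4], [1, 3, 2, 4]))
```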

Modelling Metrics

predictive_modelling_score(data, y_label, ...)

Calculates the R-squared or ROC AUC score of a given model trained on a given dataset.

predictive_modelling_comparison(data, ...[, ...])

Compares the R-squared or ROC AUC score of a model trained on original data with that of the same model trained on synthetic data, both tested on a hold-out sample of the original data.
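The idea behind this comparison can be sketched without the library: fit the same model once on original training data and once on synthetic data, then score both on the same original hold-out sample. The example below uses a one-feature least-squares line and R-squared as a stand-in. All names and data are hypothetical, and it does not reproduce the signature or behaviour of `predictive_modelling_comparison`.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x must not be constant")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def r_squared(model, xs, ys):
    """Coefficient of determination of the fitted line on (xs, ys)."""
    slope, intercept = model
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def compare_on_holdout(train_original, synthetic, holdout):
    """Each argument is an (xs, ys) pair; returns (original, synthetic) scores."""
    score_original = r_squared(fit_line(*train_original), *holdout)
    score_synthetic = r_squared(fit_line(*synthetic), *holdout)
    return score_original, score_synthetic


# Illustrative data only: the synthetic set carries a small constant offset.
train = ([0, 1, 2, 3, 4, 5, 6, 7], [2 * x + 1 for x in range(8)])
synth = ([0, 1, 2, 3, 4, 5, 6, 7], [2 * x + 1.5 for x in range(8)])
hold = ([8, 9, 10, 11], [2 * x + 1 for x in range(8, 12)])
print(compare_on_holdout(train, synth, hold))
```

A synthetic score close to the original score suggests the synthetic data preserves the relationship the model relies on; a large gap suggests it does not.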

Plotting & Analysis


A universal set of utilities that let you assess the quality of synthetic data against the original data.