Synthesized’s Documentation
Synthesized generates high-quality, structured synthetic data.

We can consider three different stages of the synthesis process which connect data to the information it represents.
- Analysis: creating a description or understanding from data.
- Augmentation: modifying a description with another description.
- Curation: creating data from a given description or understanding.
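As a rough illustration of how the three stages relate, the sketch below treats a "description" as nothing more than a table of category frequencies. This is a toy model written with the Python standard library only; the function names `analyse`, `augment` and `curate` are hypothetical and are not part of Synthesized's API.

```python
import random
from collections import Counter


def analyse(values):
    """Analysis: turn data into a description (here, category frequencies)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def augment(description, overrides):
    """Augmentation: modify one description with another, then renormalise."""
    merged = {**description, **overrides}
    total = sum(merged.values())
    return {category: p / total for category, p in merged.items()}


def curate(description, n, seed=0):
    """Curation: create new data by sampling from a description."""
    rng = random.Random(seed)
    categories = list(description)
    weights = [description[c] for c in categories]
    return rng.choices(categories, weights=weights, k=n)


# Hypothetical, heavily imbalanced input data.
observed = ["approved"] * 9 + ["rejected"] * 1

description = analyse(observed)
balanced = augment(description, {"approved": 0.5, "rejected": 0.5})
synthetic = curate(balanced, n=1000)
```

Here the data never flows directly into new data: it is first summarised, the summary is adjusted, and only then is new data drawn from it.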
The functionality offered can be considered at two different levels, which correspond to different scales and use cases of data.
Synthesized’s Data Kits
Scientific Data Kit
The Scientific Data Kit (SDK) generates high-quality, privacy-preserving datasets for machine learning and data science use cases.
- Bootstrap datasets
- Rebalance and impute missing values
- Create privacy-preserving data
Test Data Kit
For very large test databases with complex primary-foreign key relationships, the Test Data Kit (TDK) is the tool to use.
- Maintain referential integrity
- Subset and mask databases
- Generate privacy-preserving replicas
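To make "subsetting while maintaining referential integrity" concrete, here is a minimal sketch in plain Python. It does not use the TDK; the tables, the `customer_id` foreign key and the `subset_with_integrity` helper are all hypothetical and exist only to show the idea that child rows are kept only when the parent row they reference is kept.

```python
def subset_with_integrity(parents, children, keep_ids, fk="customer_id"):
    """Keep the chosen parent rows and only the child rows that reference them."""
    kept_parents = [row for row in parents if row["id"] in keep_ids]
    kept_parent_ids = {row["id"] for row in kept_parents}
    kept_children = [row for row in children if row[fk] in kept_parent_ids]
    return kept_parents, kept_children


# Hypothetical parent table (primary key: id).
customers = [
    {"id": 1, "name": "A"},
    {"id": 2, "name": "B"},
    {"id": 3, "name": "C"},
]

# Hypothetical child table (foreign key: customer_id -> customers.id).
orders = [
    {"id": 10, "customer_id": 1},
    {"id": 11, "customer_id": 2},
    {"id": 12, "customer_id": 2},
    {"id": 13, "customer_id": 3},
]

small_customers, small_orders = subset_with_integrity(
    customers, orders, keep_ids={2, 3}
)
```

In the resulting subset, every `customer_id` in `small_orders` still points at a row in `small_customers`, so no foreign key is left dangling. A real database with many interrelated tables needs the same check applied along every key relationship.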
Community support for free versions of SDK and TDK is available here.