Installation

Synthesized’s SDK is generally available on PyPI, which makes the installation process straightforward.

A licence key is required to use the full version of the package. If you don’t have one, a free 30-day trial licence key will be provided during the installation. See the comparison table for details about the features included in the trial.

Installing the Python package

It is assumed that you already have Python 3.7+ installed on a Windows, Linux or macOS machine.
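The Python requirement above can be confirmed from the interpreter itself. The snippet below is a minimal sketch using only the standard library; the helper name `python_is_supported` is introduced here for illustration and is not part of the SDK.

```python
import platform
import sys

# Minimum interpreter version stated in this guide.
MIN_PYTHON = (3, 7)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter's (major, minor) meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    # Report the interpreter and operating system that pip will install into.
    print("Python", platform.python_version(), "on", platform.system())
    if not python_is_supported():
        raise SystemExit("Python 3.7 or newer is required.")
```

Run it with the same interpreter you intend to use for the SDK, since a machine may have several Python installations.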

Before starting, ensure that pip and wheel are installed and up to date.

pip install -U pip wheel

Synthesized can then be installed with pip.

pip install synthesized
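To confirm that pip installed the package into the environment you expect, you can ask Python whether the module can be located. This is a sketch that assumes the PyPI distribution `synthesized` installs a top-level module of the same name; the helper `is_installed` is hypothetical and only wraps the standard library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the importable module is also called "synthesized".
print("synthesized installed:", is_installed("synthesized"))
```

If this prints False straight after a successful install, pip and Python are probably pointing at different environments.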

Setting the licence key

Once you have installed the package, you’ll need a licence key to run the software. The quickest way to check that the SDK is working is to run the following command:

synth-validate

The first time this is run, you will be asked whether you have a licence key. If you do not have one, select "no" and the prompts will guide you through acquiring one using your email address.

Once you have set your licence key, the SDK will briefly verify the installation was successful.
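If the `synth-validate` command is not recognised, a likely cause is that the environment where the package was installed is not the active one. The sketch below checks whether the command is on your PATH; it assumes `synth-validate` is installed as an executable by pip, and it deliberately does not run the command because the first run is interactive. The helper `find_command` is introduced here for illustration.

```python
import shutil


def find_command(name):
    """Return the full path of an executable on PATH, or None if not found."""
    return shutil.which(name)


path = find_command("synth-validate")
if path is None:
    print("synth-validate not found on PATH; is the right environment active?")
else:
    print("synth-validate found at", path)
```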

With the SDK installed, you are now able to get synthesizing! Check out our quickstarts or feature guides for ways that the SDK can be put to use.

Free vs Paid versions

The table below outlines the differences between the free and paid versions of the SDK. Please contact us for more information about obtaining a full licence key.

Feature               Free             Paid
Licence length        30-day licence   Annual or multi-year
Maximum Dataset Size  25 columns       Unlimited

Runs offline

Data Rebalancing

Data Imputing

Data Types

String

Integer

Float

Boolean

Datetime

Timestamp

Person

Address

FormattedString

JSON

Privacy-Masking

Privacy-Preserving Synthesis

Differential privacy

Strict synthesis

Time-Series Synthesis

Regular

Event-based

Dependencies

Below are the minimum dependencies required to run the SDK.

Package              Version
faker                >=8.0
matplotlib           >=3.4
numpy                >=1.19.2
pandas               <2.0, >=1.3
prompt-toolkit       >=3.0
PyYAML               >=5.2
rsa                  >=4.7
rstr                 >=2.2
scikit_learn         >=0.23
scipy                >=1.5
seaborn              >=0.11
synthesized_insight  >=0.5
tensorflow           >=2.6
yamale               >=4.0.4
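When troubleshooting an existing environment, it can help to compare the installed versions against the minimum versions listed above. The following is a rough sketch, not an official tool: the names `REQUIREMENTS`, `version_tuple`, `satisfies` and `check_environment` are hypothetical, the version comparison only looks at the leading numeric parts, and on Python 3.7 it assumes the `importlib_metadata` backport is available.

```python
import re

try:
    from importlib import metadata  # Python 3.8+
except ImportError:  # Python 3.7: assumes the importlib_metadata backport
    import importlib_metadata as metadata

# package -> (minimum inclusive, upper bound exclusive or None), from the table.
REQUIREMENTS = {
    "faker": ("8.0", None),
    "matplotlib": ("3.4", None),
    "numpy": ("1.19.2", None),
    "pandas": ("1.3", "2.0"),
    "prompt-toolkit": ("3.0", None),
    "PyYAML": ("5.2", None),
    "rsa": ("4.7", None),
    "rstr": ("2.2", None),
    "scikit_learn": ("0.23", None),
    "scipy": ("1.5", None),
    "seaborn": ("0.11", None),
    "synthesized_insight": ("0.5", None),
    "tensorflow": ("2.6", None),
    "yamale": ("4.0.4", None),
}


def version_tuple(version):
    """Turn '1.19.2' into (1, 19, 2); stops at the first non-numeric part."""
    numbers = []
    for part in version.split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def satisfies(installed, minimum, below=None):
    """Return True if minimum <= installed and, when given, installed < below."""
    current = version_tuple(installed)
    if current < version_tuple(minimum):
        return False
    return below is None or current < version_tuple(below)


def check_environment(requirements=REQUIREMENTS):
    """Map each package name to 'ok', 'missing', or a short mismatch message."""
    report = {}
    for name, (minimum, below) in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if satisfies(installed, minimum, below):
            report[name] = "ok"
        else:
            report[name] = "found {}, need >={}".format(installed, minimum)
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(package.ljust(20), status)
```

pip resolves these dependencies automatically during installation, so a check like this is mainly useful when packages have been upgraded or downgraded afterwards.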

There is no explicit limit on the size of a dataset; in practice, it is limited by the RAM available on the machine doing the processing. More information can be found in the Benchmarks section.

The library can use a GPU, but one is not required.
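To see whether a GPU is visible on your machine, you can query TensorFlow, which appears in the dependency list above. This sketch assumes that the SDK's GPU support goes through TensorFlow, which this guide does not state explicitly; the helper `visible_gpus` is hypothetical.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or [] if it is unavailable."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
print("GPUs visible to TensorFlow:", gpus if gpus else "none (CPU only)")
```

An empty result is not an error: the SDK does not require a GPU.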