Installation

The Synthesized SDK is a Python 3 package that is installed from a pre-built wheel (.whl) file. Wheels are currently available for Python 3.7, 3.8 and 3.9 on Windows, Linux, and macOS (x86_64). The package can also be tried on Google Colab.

Note

Links to download the wheel archives can be found under each release in the changelog. You can also download the latest wheel archive here.

You will need a password to unzip the archive; contact us for more information.

It is assumed that you have an existing Python 3 installation that is compatible with the provided wheel file. It is also recommended to work within a clean Python environment (e.g. a new virtual environment created with venv) so that all dependencies resolve to compatible versions.
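As a quick pre-flight check, the following Python sketch tests whether the running interpreter is one of the supported versions and whether it is running inside a virtual environment. The function names are illustrative and are not part of the Synthesized SDK; the check only recognises environments created with venv or virtualenv, not conda environments.

```python
import sys

# Python versions for which wheels are provided, per the text above.
SUPPORTED_VERSIONS = {(3, 7), (3, 8), (3, 9)}


def is_supported_python(version_info=sys.version_info):
    """Return True if the (major, minor) version has a provided wheel."""
    return (version_info[0], version_info[1]) in SUPPORTED_VERSIONS


def in_virtual_environment():
    """Return True inside a venv/virtualenv (prefix differs from base prefix)."""
    return sys.prefix != sys.base_prefix


def describe_environment():
    """Summarise the two checks for the running interpreter."""
    return {
        "python": "{}.{}".format(sys.version_info[0], sys.version_info[1]),
        "supported": is_supported_python(),
        "virtual_environment": in_virtual_environment(),
    }
```

Calling `describe_environment()` before installing gives an early warning if the interpreter does not match any provided wheel.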

Before starting, ensure that pip, setuptools and wheel are installed and up to date:

pip install --upgrade pip setuptools wheel

Next, install the package using pip and the path to the wheel file:

# Linux, Python 3.8
pip install synthesized-1.10-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
# Windows, Python 3.8
pip install synthesized-1.10-cp38-cp38-win_amd64.whl
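Choosing the right wheel comes down to reading the tags in its filename. The sketch below splits a wheel filename into its standard components (distribution, version, Python tag, ABI tag, platform tag) and checks the Python tag against an interpreter version. The helper names are hypothetical, and the check only covers CPython tags such as `cp38`; pip performs a fuller compatibility check at install time.

```python
import sys


def wheel_tags(filename):
    """Split a wheel filename into its name, version, and compatibility tags."""
    stem = filename[: -len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    if len(parts) < 5:
        raise ValueError("not a wheel filename: {!r}".format(filename))
    # The last three components are always python tag, ABI tag, platform tag;
    # an optional build tag may sit between the version and the python tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_interpreter(filename, version_info=sys.version_info):
    """Return True if the wheel's CPython tag matches the given version."""
    expected = "cp{}{}".format(version_info[0], version_info[1])
    return wheel_tags(filename)["python"] == expected
```

For example, `matches_interpreter("synthesized-1.10-cp38-cp38-win_amd64.whl")` is true only when run under Python 3.8.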

Setting the Licence Key

A valid licence key is required to use the Synthesized SDK. The key should either be set as an environment variable or saved in a permanent hidden folder. To set the key as an environment variable:

# Linux and macOS (bash or zsh)
export SYNTHESIZED_KEY="XXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XXXX"
# Windows (PowerShell)
$Env:SYNTHESIZED_KEY="XXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XXXX"

To set the key in a permanent hidden folder:

mkdir -p ~/.synthesized
echo YOUR_LICENCE_KEY > ~/.synthesized/key
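To confirm which of the two locations currently holds a key, a small Python sketch such as the following can help. It looks at the `SYNTHESIZED_KEY` environment variable first and then at `~/.synthesized/key`. That lookup order is an assumption made for illustration; the SDK's own precedence is not stated here, and `find_licence_key` is a hypothetical helper, not an SDK function.

```python
import os
from pathlib import Path


def find_licence_key(env=None, home=None):
    """Return (key, source), or (None, None) if no key is configured.

    Assumption: the environment variable is checked before the key file.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    key = env.get("SYNTHESIZED_KEY", "").strip()
    if key:
        return key, "environment variable"

    key_file = home / ".synthesized" / "key"
    if key_file.is_file():
        key = key_file.read_text().strip()
        if key:
            return key, str(key_file)

    return None, None
```

The `env` and `home` parameters exist only so the function can be exercised without touching the real environment; calling `find_licence_key()` with no arguments inspects the current user's settings.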

Testing the installation

To verify that the package has been installed correctly, run the synth-validate command:

synth-validate

The command prints the licence expiry date and the features enabled by the licence key, then runs a quick test to confirm that the installation was successful.
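In automated setups it can be convenient to run the same check from Python. The sketch below first confirms that `synth-validate` is on the PATH and then runs it, returning the process exit code. The helper name is hypothetical, and how the exit code should be interpreted is an assumption (by convention, zero usually means success); the documentation above only describes the printed output.

```python
import shutil
import subprocess


def run_validation(command="synth-validate"):
    """Run the validation command and return its exit code.

    Returns None if the command cannot be found on the PATH, which usually
    means the package is not installed in the active environment.
    """
    if shutil.which(command) is None:
        return None
    # Output goes straight to the terminal so the licence details stay visible.
    completed = subprocess.run([command], check=False)
    return completed.returncode
```

A result of `None` points to a PATH or environment problem rather than a licence problem.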

Dependencies

Package                   Supported version
faker                     ~=8.0
matplotlib                ~=3.4
numpy                     <1.20.0, >=1.18.4
pandas                    ~=1.1
rsa                       ~=4.7
rstr                      ~=2.2
scikit_learn              ~=0.23
scipy                     ~=1.5
seaborn                   ~=0.11
synthesized_insight       ~=0.5
tensorflow_estimator      ~=2.6.0
tensorflow_privacy        ~=0.6.0
tensorflow_probability    ~=0.14.0
tqdm                      ~=4.61.2
tensorflow                ~=2.6.2
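Most of the versions listed above use the compatible-release operator `~=`: for example, `~=2.6.2` accepts 2.6.2 and later 2.6.x releases but not 2.7. The following sketch shows that rule and applies it to a few of the listed packages in the current environment. It is a simplified illustration with hypothetical helper names: it compares only the leading numeric part of each version and ignores pre-release and local version segments, so pip remains the authority on what is actually compatible.

```python
import re

try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Python 3.7: assumes the importlib_metadata backport is installed.
    import importlib_metadata as metadata

# A few rows copied from the list above; extend as needed.
COMPATIBLE_RELEASES = {
    "faker": "8.0",
    "pandas": "1.1",
    "tensorflow": "2.6.2",
    "tqdm": "4.61.2",
}


def release(version):
    """Return the leading numeric part of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("unrecognised version: {!r}".format(version))
    return tuple(int(part) for part in match.group(0).split("."))


def is_compatible_release(installed, base):
    """Simplified `~=base` check: at least `base`, with the same leading segments."""
    have, want = release(installed), release(base)
    width = max(len(have), len(want))
    have = have + (0,) * (width - len(have))
    padded_want = want + (0,) * (width - len(want))
    return have >= padded_want and have[: len(want) - 1] == want[:-1]


def report():
    """Map each listed package to True, False, or None (not installed)."""
    results = {}
    for name, base in COMPATIBLE_RELEASES.items():
        try:
            results[name] = is_compatible_release(metadata.version(name), base)
        except metadata.PackageNotFoundError:
            results[name] = None
    return results
```

Running `report()` inside the environment where the SDK is installed gives a quick view of which of the sampled dependencies fall inside their supported range.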

Additional Technical Details

There is no explicit limit on the size of a dataset; the practical limit is the amount of RAM available on the machine running the processing.

The library can use a GPU, but one is not required.
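Since TensorFlow is one of the listed dependencies, one way to see whether a GPU is visible is to ask TensorFlow directly. The sketch below uses `tf.config.list_physical_devices`; the wrapper function is hypothetical and only reports what TensorFlow can see, not how the SDK chooses a device.

```python
def visible_gpu_count():
    """Return the number of GPUs TensorFlow can see, or None if it is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return len(tf.config.list_physical_devices("GPU"))
```

A result of `0` means TensorFlow will run on the CPU, which is sufficient for the library to work.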