CenSyn

Data Synthesis via Decision Trees

CenSyn Evaluation Framework v0.8.1

  • Lead Privacy Researcher: Christine Task
  • Lead Developer: Jeffrey Hodges
  • Software Engineer: Damon Streat
  • Data Analyst: Ashley Simpson
  • Technical Writer: David Lee

Building / Installing the Evaluation Framework

    We recommend an Anaconda Python environment. This package should work with any Python 3.7 setup, but an Anaconda environment is by far the easiest to set up.

    Python Environment

    To create a new Anaconda environment, enter the command ("censyn" can be any name you prefer):

    conda create --name censyn python=3.8

    And activate it with:

    conda activate censyn
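
Once the environment is active, a quick sanity check from Python can confirm which interpreter and environment are in use. This is only a sketch and not part of CenSyn: the function name `describe_environment` is hypothetical, and treating Python 3.7 as a minimum version is an assumption based on the recommendation above.

```python
import os
import sys

# Assumption: Python 3.7 is treated as the minimum supported version.
MIN_PYTHON = (3, 7)


def describe_environment(min_python=MIN_PYTHON):
    """Summarize the running interpreter and the active conda environment."""
    return {
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
        "python_ok": sys.version_info[:2] >= min_python,
        # conda sets CONDA_DEFAULT_ENV when an environment is activated;
        # the value is None outside of a conda environment.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


info = describe_environment()
print(info)
```

If `conda_env` does not match the name passed to `conda create`, the environment is probably not activated in the current shell.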

    Building and Installing

    After installing the dependencies listed in requirements.txt, run the following from the CenSyn root directory:

    pip install .

    Version Check and Command Line Help

    Call censyn at the command line with the -h flag to display the argument information. The current version is shown at the end of the help output.

    censyn -h

    Running The Synthesizer

    With the default configuration, the following command runs a synthesis on the data and creates a report named synthesis_report.txt and a synthetic data file named synthetic.parquet in the output directory.

    censynthesize --synthesize_config_file conf/synthesize.cfg
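
For scripted runs, the same call can be wrapped in Python. The sketch below is not part of CenSyn: the helper names `build_synthesize_command` and `run_synthesis` are hypothetical, and the only CenSyn-specific pieces are the command name, flag, and configuration path shown above.

```python
import shutil
import subprocess


def build_synthesize_command(config_file="conf/synthesize.cfg"):
    """Return the argument list for the documented censynthesize call."""
    return ["censynthesize", "--synthesize_config_file", config_file]


def run_synthesis(config_file="conf/synthesize.cfg"):
    """Run the synthesizer if it is installed; return its exit code, else None."""
    command = build_synthesize_command(config_file)
    if shutil.which(command[0]) is None:
        # The entry point only exists after `pip install .` in the active environment.
        print("censynthesize not found on PATH; is the environment active?")
        return None
    return subprocess.run(command).returncode


command = build_synthesize_command()
print(" ".join(command))
```

Calling `run_synthesis()` from the CenSyn root directory would then launch the synthesizer, assuming the package is installed and the configuration file exists at that relative path.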

    Running The Evaluation Framework

    With the default configuration, the following command evaluates the two data sets using the Marginal Metric evaluation method and creates a report named report.txt in the output directory.

    censyn --eval_config_file conf/eval.cfg
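
To give a feel for what a marginal-based comparison measures, here is a minimal sketch. It is not CenSyn's implementation: it assumes the score is the total variation distance between the two data sets' distributions over small groups of columns, averaged over all such groups, and every function name and column in it is hypothetical.

```python
from collections import Counter
from itertools import combinations


def marginal_distribution(records, columns):
    """Return the normalized frequency of each value combination in `columns`."""
    counts = Counter(tuple(row[c] for c in columns) for row in records)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def marginal_distance(real, synthetic, columns):
    """Total variation distance between two marginals (0 = identical, 1 = disjoint)."""
    p = marginal_distribution(real, columns)
    q = marginal_distribution(synthetic, columns)
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def average_marginal_distance(real, synthetic, width=2):
    """Average the distance over every marginal made of `width` columns."""
    columns = sorted(real[0].keys())
    scores = [
        marginal_distance(real, synthetic, combo)
        for combo in combinations(columns, width)
    ]
    return sum(scores) / len(scores)


# Toy records with hypothetical columns, not taken from any CenSyn data set.
real = [
    {"age": "adult", "sex": "F"},
    {"age": "adult", "sex": "M"},
    {"age": "child", "sex": "F"},
    {"age": "child", "sex": "M"},
]
synthetic = [
    {"age": "adult", "sex": "F"},
    {"age": "adult", "sex": "F"},
    {"age": "child", "sex": "M"},
    {"age": "child", "sex": "M"},
]
score = average_marginal_distance(real, synthetic, width=2)
print(score)
```

A score near 0 means the synthetic data reproduces the joint frequencies of the chosen columns closely. Here the synthetic records match each single-column distribution exactly but miss half of the age and sex combinations, which is the kind of discrepancy that only multi-column marginals reveal.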