Deduplication

The file skills1_100.csv is the input file to be deduplicated. The files skills1_100_learned_settings and skills1_100_training are created during deduplication.

The files of the form skills1_100OutputWt[X]Thr[Y].csv are the output files generated by dedupe when the recall weight was set to X and the calculated threshold was Y.
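
The naming scheme above can be sketched as a pair of small helpers: one builds an output file name from a recall weight and threshold, the other recovers those two values from a name. This is only an illustration. The function names output_filename and parse_output_filename are hypothetical and are not taken from dedupeSkills.py, and the sketch assumes X and Y appear in the name as plain decimal numbers (for example 2 or 0.5), which may not match how the script formats them.

```python
import re

# Pattern inferred from the naming scheme skills1_100OutputWt[X]Thr[Y].csv.
# Assumption: X and Y are written as plain decimals such as "2" or "0.5".
OUTPUT_NAME = re.compile(
    r"^skills1_100OutputWt(?P<weight>\d+(?:\.\d+)?)"
    r"Thr(?P<threshold>\d+(?:\.\d+)?)\.csv$"
)


def output_filename(recall_weight, threshold):
    """Build an output file name for a given recall weight and threshold."""
    return f"skills1_100OutputWt{recall_weight}Thr{threshold}.csv"


def parse_output_filename(name):
    """Return (recall_weight, threshold) as floats, or None if the name
    does not follow the output naming scheme."""
    match = OUTPUT_NAME.match(name)
    if match is None:
        return None
    return float(match.group("weight")), float(match.group("threshold"))
```

For example, parse_output_filename can be applied to each name in a directory listing to group the output files by recall weight; names that do not match, such as skills1_100.csv, return None.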

The file dedupeSkills.py is the Python script used for deduplication.