Scripts and data used to create the doclearn project during the summer of 2016.

This involved manually tagging samples collected from randomly chosen GitHub repos. The repos were selected from the Awesome Python README at hash dc7080d3a6236e3b652507208ac385e232efab1e.
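
The exact selection procedure is not documented here. As a rough illustration only, the sketch below shows one way repos could be drawn at random from a README's text: it pulls out GitHub links with a regular expression and samples them with a fixed seed so the draw is repeatable. The names `extract_repos` and `sample_repos`, the regular expression, and the seed are all assumptions made for this example, not the code used in this project.

```python
import random
import re

# Hypothetical pattern: matches links of the form https://github.com/<owner>/<repo>
GITHUB_REPO_RE = re.compile(r"https?://github\.com/([\w.-]+)/([\w.-]+)")


def extract_repos(readme_text):
    """Return a sorted list of unique 'owner/repo' strings found in the text."""
    repos = set()
    for owner, repo in GITHUB_REPO_RE.findall(readme_text):
        # Drop a trailing sentence period and an optional ".git" suffix.
        repo = repo.rstrip(".")
        if repo.endswith(".git"):
            repo = repo[:-4]
        if repo:
            repos.add(f"{owner}/{repo}")
    return sorted(repos)


def sample_repos(readme_text, k, seed=0):
    """Pick up to k repos at random; the seed makes the selection repeatable."""
    repos = extract_repos(readme_text)
    rng = random.Random(seed)
    return rng.sample(repos, min(k, len(repos)))
```

Calling `sample_repos(text, 10)` on the README contents saved at the hash above would return up to ten `owner/repo` names, and the same seed gives the same selection on every run.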

The scripts in this repo are very messy.