generator.py: generates a Unix script to submit and run the scraping job on the cluster
web_crawler.py: the main program; scrapes data from the website, taking the date as an argument from stdin
disambiguity.py: identifies ambiguous usernames and restores each user's identity
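
As an illustration of how a program like web_crawler.py might take its date from stdin, here is a minimal sketch. The YYYY-MM-DD format and the helper names parse_date and read_date are assumptions made for this example; they are not taken from the actual script.

```python
import sys
from datetime import date, datetime


def parse_date(text: str) -> date:
    """Parse a date string. The YYYY-MM-DD format is an assumption."""
    return datetime.strptime(text.strip(), "%Y-%m-%d").date()


def read_date(stream=sys.stdin) -> date:
    """Read one line from the stream (stdin by default) and parse it as a date."""
    line = stream.readline()
    if not line.strip():
        # Fail early instead of starting a scrape without a target date.
        raise ValueError("expected a date on stdin")
    return parse_date(line)
```

Passing the stream as a parameter keeps the function easy to exercise without a terminal: read_date(io.StringIO("2015-03-01\n")) returns date(2015, 3, 1).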