/confcrawler

A crawler for academic conferences. A dataset covering five academic conferences is also provided.

A crawler for academic conferences. All papers from AAAI, NIPS, ACL, AISTATS, and AIIDE from 2010 to 2016 are crawled. For academic use only.

Install prerequisites:

pip install -r requirements.txt

Install the project:

python setup.py develop --user --record files.txt

Uninstall the project:

cat files.txt | xargs rm -rf

Prepare the dataset (this typically takes 1-3 hours, depending on your network connection):

python confcrawler/util.py

To download the datasets directly instead, run the following commands:

cd data; ./get_data.sh

Or visit the website: http://web.stanford.edu/~huizi/confdata/

To customize the year range of the dataset, edit the year range in confcrawler/util.py.
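As an illustration only, a year range of this kind is often expressed as an inclusive start and end year. The names `START_YEAR`, `END_YEAR`, and `years` below are hypothetical and may not match what confcrawler/util.py actually uses; the point of the sketch is that Python's `range` excludes its upper bound, so the end year needs a `+ 1` to be included.

```python
# Hypothetical names: the actual variables in confcrawler/util.py may differ.
START_YEAR = 2010
END_YEAR = 2016  # meant to be inclusive


def years(start=START_YEAR, end=END_YEAR):
    """Return every year from start to end, including both endpoints."""
    # range() stops before its second argument, hence end + 1.
    return list(range(start, end + 1))


print(years())            # full default range
print(years(2014, 2015))  # a narrower custom range
```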

To add another conference, write your own crawler in crawl.py and register it in the ClassDict.
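The following is a minimal sketch of what such an addition might look like, not the project's actual interface. It assumes that ClassDict maps a conference name to a crawler class; the class `MyConfCrawler`, its `parse` method, the helper `TitleParser`, the `paper-title` markup, and the `MYCONF` key are all hypothetical. Check the existing crawlers in crawl.py for the real base class and method names before copying this pattern.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text of each <a class="paper-title"> element (assumed markup)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples.
        if tag == "a" and ("class", "paper-title") in attrs:
            self._in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.titles[-1] += data


class MyConfCrawler:
    """Hypothetical crawler for a conference called MYCONF."""

    def parse(self, html):
        """Extract paper titles from one proceedings page."""
        parser = TitleParser()
        parser.feed(html)
        return [title.strip() for title in parser.titles]


# Assumed registry shape: conference name -> crawler class.
# In crawl.py the ClassDict already exists; it is defined here only so
# the sketch runs on its own.
ClassDict = {}
ClassDict["MYCONF"] = MyConfCrawler

sample = (
    '<ul><li><a class="paper-title" href="/p/1"> A Paper </a></li>'
    '<li><a href="/about">About</a></li></ul>'
)
crawler = ClassDict["MYCONF"]()
print(crawler.parse(sample))
```

Looking crawlers up by name through a dictionary keeps the crawl loop unchanged when a conference is added: only the new class and its one registry entry are needed.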