Primary language: Python. License: MIT.

# scraper-cumincadPDF

Batch-download PDFs from CuminCAD for a given series, e.g. ACADIA.

Prerequisite

  • BeautifulSoup (in your Command Prompt, type 'pip install beautifulsoup4')
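
To confirm the prerequisite is installed before running the scraper, a small check like the one below may help. Beautiful Soup 4 is imported under the module name `bs4`; the helper name `has_beautifulsoup` is made up for this illustration and is not part of the project.

```python
import importlib.util


def has_beautifulsoup():
    """Return True if the bs4 package (Beautiful Soup 4) can be imported."""
    # find_spec looks the module up without actually importing it.
    return importlib.util.find_spec("bs4") is not None
```

If `has_beautifulsoup()` returns False, install the package with pip and try again.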

To execute the code

  • Choose the series you want to scrape, e.g. "ACADIA".
  • Create a folder for that series in the same directory as the Python script,
    for example C:\Users\yourName\Desktop\scraper-cumincadPDF\ACADIA
  • Go to cumincad.org and check the total number of papers in that series,
    then assign this value to totalNumberPapers.
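
The setup steps above could be sketched roughly as follows. Only `totalNumberPapers` is a name taken from this README; `series`, `prepare_series_folder` and the value 100 are placeholders for illustration, and the actual script may organise this differently.

```python
from pathlib import Path

# Assumed names and values: only totalNumberPapers appears in this README.
series = "ACADIA"
totalNumberPapers = 100  # placeholder; use the count shown on cumincad.org


def prepare_series_folder(base_dir, series_name):
    """Create <base_dir>/<series_name> if it is missing and return its path."""
    folder = Path(base_dir) / series_name
    # exist_ok=True makes the call safe to repeat on later runs.
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

Calling `prepare_series_folder(Path(__file__).parent, series)` from the script would create the folder next to the Python file, which saves creating it by hand.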

You are good to go.

If the code stops and you want to continue from the last downloaded paper

Go to the series folder you created earlier and count the files in it.
Assign this value to papersAlreadyDownloaded.
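
Counting the files by hand can be replaced with a short helper. This is only a sketch: `count_downloaded` is a hypothetical name, and it assumes every file in the folder corresponds to one processed paper.

```python
from pathlib import Path


def count_downloaded(series_folder):
    """Return the number of regular files directly inside series_folder."""
    folder = Path(series_folder)
    if not folder.is_dir():
        return 0  # nothing downloaded yet
    # Subfolders are ignored; only files are counted.
    return sum(1 for entry in folder.iterdir() if entry.is_file())
```

For example, `papersAlreadyDownloaded = count_downloaded("ACADIA")`. Hidden system files in the folder would inflate the count, so check the result if the number looks off.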

In case the code doesn't find a pdf for the paper

Sometimes a paper's PDF is missing or was never uploaded. In this case, the code generates a .txt file with the information available.
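
The fallback described here might look like the sketch below. The function name, its parameters and the file naming scheme are assumptions made for illustration; the README does not specify how the script names its files or what exactly it writes into the .txt file.

```python
from pathlib import Path


def save_paper(series_folder, paper_id, pdf_bytes, info):
    """Write <paper_id>.pdf when PDF data exists, otherwise <paper_id>.txt."""
    folder = Path(series_folder)
    if pdf_bytes:
        target = folder / f"{paper_id}.pdf"
        target.write_bytes(pdf_bytes)
    else:
        # No PDF available: keep whatever metadata was found as plain text.
        target = folder / f"{paper_id}.txt"
        target.write_text(info, encoding="utf-8")
    return target
```

Writing a placeholder file for every paper without a PDF keeps the file count in the folder equal to the number of papers processed, which is what makes resuming by file count possible.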

Disclaimer

This tool was developed for study purposes.
By using it, you acknowledge that reducing or removing the waiting time in the code can flood the website with requests and degrade its performance!
I do not assume any responsibility for its use.
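
For reference, the waiting time mentioned in this disclaimer is a pause between consecutive requests. A minimal sketch of such a pause is shown below; `WAIT_SECONDS`, `fetch_politely` and the `fetch` callable are hypothetical names, and the 5-second value is an assumption, not the delay used by the actual script.

```python
import time

WAIT_SECONDS = 5  # assumed value; the real script's delay is not stated here


def fetch_politely(items, fetch, wait=WAIT_SECONDS, sleep=time.sleep):
    """Call fetch(item) for each item, pausing `wait` seconds between calls."""
    results = []
    for index, item in enumerate(items):
        if index > 0:
            sleep(wait)  # pause before every request except the first
        results.append(fetch(item))
    return results
```

Keeping the pause in one place makes it harder to remove by accident.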