This is a two-part Jupyter notebook.
This exercise finds which countries voted for, voted against, or abstained on any given United Nations resolution.
- First part of this project: find all webpages relevant to Session Resolutions. This entails following links 3 pages deep.
- Second part deals with parsing online PDF files with Java PDFBox and sending that data to Python with Py4J.
- From there we will finally parse these files, which will be plain text, with Regex to extract the voting by country.
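The "3 pages deep" crawl in the first part can be sketched as a breadth-first link walk. This is only an illustration under stated assumptions, not the notebook's code: it uses the standard-library `html.parser` instead of BeautifulSoup or Selenium so that it runs without installs, and the `crawl` function, the `LinkCollector` class, and the `FAKE_SITE` pages and URLs are all hypothetical stand-ins for the real UN pages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch, max_depth=3):
    """Breadth-first walk from start_url, reading pages up to max_depth levels deep.

    `fetch` is any callable that takes a URL and returns that page's HTML.
    """
    seen = {start_url}
    frontier = [start_url]
    for _ in range(max_depth):
        next_frontier = []
        for url in frontier:
            parser = LinkCollector()
            parser.feed(fetch(url))
            for href in parser.links:
                absolute = urljoin(url, href)  # resolve relative links
                if absolute not in seen:
                    seen.add(absolute)
                    next_frontier.append(absolute)
        frontier = next_frontier
    return seen


# Hypothetical in-memory site standing in for real HTTP requests.
FAKE_SITE = {
    "https://example.org/sessions": '<a href="/session/1">Session 1</a>',
    "https://example.org/session/1": '<a href="/session/1/resolutions">Resolutions</a>',
    "https://example.org/session/1/resolutions": '<a href="/docs/res1.pdf">PDF</a>',
    "https://example.org/docs/res1.pdf": "",
}

found = crawl("https://example.org/sessions", lambda url: FAKE_SITE.get(url, ""))
pdf_links = sorted(u for u in found if u.endswith(".pdf"))
```

Passing `fetch` in as a parameter keeps the walk separate from how pages are downloaded, so a real HTTP or Selenium-based fetcher could replace the in-memory one.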
------------------------------------------------------------------------------
Also check out my videos: YouTube
Required installs:
- pip install beautifulsoup4
- pip install PyPDF2
- pip install selenium
Skills Learned:
- Web scraping
- Basic Regular Expressions (Regex)
- Parsing online PDF material
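As a rough illustration of the regex skill applied to the voting-by-country step, the sketch below splits extracted plain text into three lists. The "In favour / Against / Abstaining" labels, the `SAMPLE_TEXT` contents, and the `parse_votes` helper are assumptions made for this example. The real PDF text may be laid out differently and need a different pattern.

```python
import re

# Hypothetical sample of the plain text extracted from a resolution PDF.
# The label layout and the country lists are assumptions for illustration.
SAMPLE_TEXT = """
In favour: Algeria, Brazil, China,
France, India
Against: Israel, United States of America
Abstaining: Australia, Canada
"""

# Each label captures everything up to the next label or the end of the text.
VOTE_PATTERN = re.compile(
    r"(In favour|Against|Abstaining):\s*(.*?)(?=In favour:|Against:|Abstaining:|\Z)",
    re.DOTALL,
)


def parse_votes(text):
    """Return a dict mapping each vote label to a list of country names."""
    votes = {"In favour": [], "Against": [], "Abstaining": []}
    for label, body in VOTE_PATTERN.findall(text):
        # Collapse line breaks left over from the PDF layout, then split on commas.
        flat = " ".join(body.split())
        votes[label] = [c.strip(" .") for c in flat.split(",") if c.strip(" .")]
    return votes


votes = parse_votes(SAMPLE_TEXT)
for label, countries in votes.items():
    print(label, len(countries), countries)
```

Collapsing whitespace before splitting matters because PDF extraction often wraps a country list across several lines.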