
Web scraping the Lund University web page

Primary language: Jupyter Notebook

data-wrangling | web scraping and cleaning data

Scraping and cleaning data from the Lund University web page

Programming language: Python

The purpose of this project was to extract data from a website, clean it, and store it in another format, in this case CSV. The website scraped was Lund University's admissions page, http://www.lunduniversity.lu.se/lubas/programs. Running the code extracts all degree programs into a CSV file. The education level of each program is then cleaned and categorized, because the entries on the website are either inconsistent or missing information. BeautifulSoup was used to extract the data from the university's website.
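The extract, clean, and export steps described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the HTML snippet, the table id `programs`, the function names, and the category labels are all assumptions made for the example, and the real page markup is likely different. It parses an inline sample instead of fetching the live page so that it runs offline.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical markup standing in for the admissions page;
# the structure of the real page is an assumption here.
SAMPLE_HTML = """
<table id="programs">
  <tr><th>Program</th><th>Level</th></tr>
  <tr><td>Applied Computational Science</td><td>Master's programme</td></tr>
  <tr><td>Development Studies</td><td>bachelor </td></tr>
  <tr><td>Fine Arts</td><td></td></tr>
</table>
"""


def extract_programs(html):
    """Return one dict per table row that has a program and a level cell."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="programs")
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) == 2:  # skips the header row, which only has <th> cells
            rows.append({"program": cells[0], "level": cells[1]})
    return rows


def clean_level(raw):
    """Map an inconsistent or empty level entry to a fixed category."""
    text = raw.strip().lower()
    if "master" in text:
        return "Master"
    if "bachelor" in text:
        return "Bachelor"
    return "Unknown"  # illustrative fallback for missing information


def to_csv(rows):
    """Write the cleaned rows to CSV text (in memory, for the example)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["program", "level"], lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(
            {"program": row["program"], "level": clean_level(row["level"])}
        )
    return buffer.getvalue()


programs = extract_programs(SAMPLE_HTML)
csv_text = to_csv(programs)
print(csv_text)
```

To write a real file instead of an in-memory string, the same `csv.DictWriter` can be given a file opened with `open(path, "w", newline="")`.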

Files in repository:

  • code (list comprehension version and for-loop version)
  • image of the website
  • CSV output file

*Note: The code was last run on December 11, 2017. If it no longer works, the website's markup has probably changed since then.