CourseraScraper

Code to scrape course reviews from Coursera and perform sentiment analysis and topic modeling

Contents

Code ready to use

  • CourseraClass.py
  • scrape_coursera_reviews.py
  • scrape_coursera_urls.py
  • Review_Sentiment_Analysis.ipynb
  • Review_Topic_Modeling.ipynb

Experiments

  • coursera_reviews_scraper.ipynb
  • coursera_url_scrapper.ipynb

Instructions

  • To scrape all course URLs, use scrape_coursera_urls.py
  • To scrape reviews for the course URLs stored in text files, use scrape_coursera_reviews.py
  • To perform sentiment analysis and topic modeling, use the respective Jupyter notebooks (Review_Sentiment_Analysis.ipynb and Review_Topic_Modeling.ipynb)
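
The review-scraping step starts from course URLs stored in text files. A minimal sketch of that loading step is shown here, assuming one URL per line. The function name `load_course_urls` and the file layout are assumptions for illustration and are not taken from the repository's scripts.

```python
from pathlib import Path


def load_course_urls(path):
    """Return the unique, non-empty URLs in a text file, in first-seen order.

    Assumes one URL per line (a hypothetical layout, not confirmed by the repo).
    """
    seen = set()
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        url = line.strip()
        # Skip blank lines and URLs that were already collected.
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Dropping duplicates before scraping avoids fetching the same course's reviews twice if a URL was collected more than once.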

Requires

  • Selenium
  • Pandas
  • scikit-learn
  • BeautifulSoup
  • A stable internet connection

Notes

  • When scraping the whole Coursera site, the server sometimes redirects to a different page than expected, so it is worth checking the browser driver window from time to time to confirm the run is still on track.
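
Part of that manual check could be automated by comparing the browser's current address against the expected domain. The sketch below is one possible approach, not something the repository's scripts are known to do; the helper name `is_on_expected_site` and the default domain are assumptions.

```python
from urllib.parse import urlparse


def is_on_expected_site(current_url, expected_domain="coursera.org"):
    """Return True if the URL's host is expected_domain or one of its subdomains."""
    host = (urlparse(current_url).hostname or "").lower()
    # Match the domain itself or any subdomain, but not look-alike hosts.
    return host == expected_domain or host.endswith("." + expected_domain)
```

With Selenium, the address of the loaded page is available as `driver.current_url`, so the check can run after each `driver.get(...)` call and the script can pause or log a warning when it returns False.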

Sources are cited in the code.