Revealing the Omitted - An Exploration of Media Bias in the News Coverage of Obamacare
Project Summary:
Uses Selenium and BeautifulSoup to scrape more than 160k articles about Obamacare from more than 8k publishers. Applies TF-IDF and LDA topic modeling to reveal what is theoretically omitted from a given article and what is systematically underrepresented at the publisher level.
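To give a rough feel for the "omitted" idea, here is a minimal sketch and not the project's actual pipeline. It assumes a tiny hand-written corpus, a naive whitespace tokenizer, and a hand-rolled TF-IDF with smoothed IDF. The names tokenize, tf_idf and omitted_terms are hypothetical and exist only for this illustration. The LDA topic modeling step is not shown; this sketch simply flags terms that carry weight in the rest of the corpus but never appear in a given article.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the real project works on scraped articles.
docs = [
    "premiums rise under obamacare",
    "obamacare coverage expands medicaid enrollment",
    "medicaid enrollment grows as coverage expands",
]

def tokenize(text):
    """Lowercase and split on whitespace (real cleaning would do far more)."""
    return text.lower().split()

def tf_idf(documents):
    """Return one {term: weight} dict per document (tf * smoothed idf)."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    # Number of documents each term appears in.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return weights

def omitted_terms(weights, doc_index, top_n=3):
    """Terms weighted highly in the other documents but absent from this one."""
    corpus_score = Counter()
    for i, doc_weights in enumerate(weights):
        if i != doc_index:
            corpus_score.update(doc_weights)  # sums weights per term
    present = weights[doc_index]
    candidates = {t: s for t, s in corpus_score.items() if t not in present}
    # Highest summed weight first; ties broken alphabetically.
    return sorted(candidates, key=lambda t: (-candidates[t], t))[:top_n]

weights = tf_idf(docs)
print(omitted_terms(weights, 0))
```

The printed list contains terms the other toy articles emphasize that the first one never mentions. In the project itself, the comparison is described as happening at the topic level through LDA, so this term-level version is only an approximation of the concept.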
File Structure Summary:
The project's files are organized into the following structure:
Code Folder - In Python. Contains Jupyter (IPython) notebooks that perform all data analysis, plus separate Python script files that hold any custom functions. (Currently in progress)
Resources Folder - All the raw and derived data used in the project. Also contains original research describing the origin of the raw data.
Presentation file - A deck of initial results presented in early 2016.
Project Overview file - Provides a high-level overview of the insights gained throughout the data analysis. (Currently in progress)
README file - You're reading it. Describes the logistics: what each item does and how the files are organized.