Revealing the Omitted - An Exploration of Media Bias in the news coverage of Obamacare

Project Summary:

Uses Selenium and BeautifulSoup to scrape over 160k articles on Obamacare from more than 8k publishers. Applies TF-IDF and LDA topic modeling to reveal which topics are theoretically omitted from a given article and which are systematically underrepresented at the publisher level.
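As a rough illustration of the TF-IDF step, here is a minimal sketch using only the Python standard library. It is not the project's code: the toy articles, the function names (tokenize, tf_idf, absent_common_terms), the min_share threshold, and the smoothed IDF variant are all assumptions made for this example, and the LDA step is not shown. The last function is one simple way to approximate the "omitted" idea at the term level: it lists terms that are common across the corpus but missing from a given article.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(docs):
    """Return per-document TF-IDF weights and corpus document frequencies.

    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, which is one common
    variant (an assumption here, not necessarily what the project used).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        weights.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return weights, df


def absent_common_terms(doc_index, docs, df, min_share=0.5):
    """Hypothetical helper: terms found in at least min_share of all
    documents but absent from the document at doc_index."""
    present = set(tokenize(docs[doc_index]))
    n_docs = len(docs)
    return sorted(
        term for term, n in df.items()
        if n / n_docs >= min_share and term not in present
    )


# Toy corpus, invented for illustration only.
articles = [
    "premiums rise as enrollment grows",
    "enrollment grows while premiums rise and subsidies expand",
    "subsidies expand but enrollment stalls",
]

weights, df = tf_idf(articles)
print(absent_common_terms(0, articles, df))
```

In this sketch, a term that appears in every article (such as "enrollment") receives a lower weight than a term unique to one article, which is the property that lets TF-IDF surface what is distinctive about a document. A real pipeline would also remove stop words and would typically rely on a library implementation.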

File Structure Summary:

The project's files are organized into the following structure:

Code Folder - In Python. Contains the IPython/Jupyter notebooks that perform all data analysis, along with any custom functions, which are kept in separate Python script files. (Currently in progress)

Resources Folder - All the raw and derived data used in the project. Also contains original research describing the origin of the raw data.

Presentation file - A deck of initial results presented in early 2016.

Project Overview file - Provides a high-level overview of the insights gained throughout the data analysis. (Currently in progress)

README file - You're reading it. Describes what each part of the project does and how the files are organized.

