News_Data_Analysis

Analyze news content from different media

How to Install/Setup

Install Python 3.7+
Create virtual environment: python -m venv venv
Activate virtual environment:
1. Windows: venv\Scripts\activate
2. macOS/Linux: source venv/bin/activate
Install requirements: pip install -r requirements.txt
MySQL Setup:
1. Install MySQL Server and MySQL Connector for Python.
2. Create config.py. See config.example.py template for what values to put in config.py.
3. (For MySQL Server) Create database: python create_database.py
4. Connect to MySQL:
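
As a rough illustration of steps 2 and 4, the sketch below shows one way a config.py could feed a connection made with mysql-connector-python. The DB_CONFIG name, its keys, and its values are assumptions for illustration only; the actual settings expected by this project are the ones listed in config.example.py.

```python
# Hypothetical config.py contents; the real keys are defined in config.example.py.
DB_CONFIG = {
    "host": "localhost",
    "user": "news_user",
    "password": "change-me",
    "database": "news_data",
}


def connection_kwargs(config):
    """Keep only the connection settings used in this sketch."""
    allowed = ("host", "port", "user", "password", "database")
    return {key: config[key] for key in allowed if key in config}


def connect(config=DB_CONFIG):
    """Open a MySQL connection; the caller is responsible for closing it."""
    # Imported lazily so the helpers above work without the package installed.
    import mysql.connector  # provided by the mysql-connector-python package

    return mysql.connector.connect(**connection_kwargs(config))
```

Calling connect() needs a running MySQL server and valid credentials; conn.is_connected() is a quick way to confirm the connection before running the other scripts.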

How to Run

This project is not finished. Run instructions will be added when it is complete.

Scrape Website

This project contains a script to scrape headlines from one of the following websites.

How to Install

python -m venv venv
Activate virtual environment:
1. Windows: venv\Scripts\activate
2. macOS/Linux: source venv/bin/activate
pip install -r requirements.txt

How to Run

python general_scrape.py

JS Scrape

This project contains a JavaScript script to scrape headlines from one of the following websites.

https://www.foxnews.com/

Note: This script may be outdated. We recommend using the Python script to scrape websites.

How to Install

Install Node.js and npm
cd repository
npm install cheerio axios

How to Run

cd scrape_web
node scrape_web/scrape1.js

Project Description

1. Scrape and clean data from website

Description of our process:

Fetched the HTML from a given website
Used the BeautifulSoup package to parse the HTML and find headlines
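
The two steps above can be sketched roughly as follows. This is not the code in general_scrape.py: the function names, the use of urllib for fetching, and the assumption that headlines sit in h2 tags are all illustrative, and each site needs its own tag or class.

```python
import urllib.request


def fetch_html(url, timeout=10):
    """Download a page and return its HTML as text."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def clean_headline(text):
    """Collapse runs of whitespace and trim both ends."""
    return " ".join(text.split())


def extract_headlines(html, tag_name="h2"):
    """Return unique, cleaned headline strings in page order."""
    # Imported lazily so clean_headline works without the package installed.
    from bs4 import BeautifulSoup  # provided by the beautifulsoup4 package

    soup = BeautifulSoup(html, "html.parser")
    seen, headlines = set(), []
    for tag in soup.find_all(tag_name):  # tag_name is an assumption per site
        headline = clean_headline(tag.get_text())
        if headline and headline not in seen:
            seen.add(headline)
            headlines.append(headline)
    return headlines
```

Deduplicating while preserving order matters because news front pages often repeat the same story in several page regions.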

Resources that we used:

2. Store data in database

Description of our process:

Set up MySQL Server (see the Install section)
Used the mysql-connector-python package to create the database (see the Install section)
Used the mysql-connector-python package to insert headline data into the database
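
A minimal sketch of the insert step, assuming a hypothetical headlines table with source, headline, and scraped_at columns. The real schema is whatever create_database.py defines, so the table and column names here are placeholders.

```python
from datetime import datetime, timezone

# Hypothetical table and columns; adjust to the schema from create_database.py.
INSERT_SQL = (
    "INSERT INTO headlines (source, headline, scraped_at) "
    "VALUES (%s, %s, %s)"
)


def build_rows(source, headlines, scraped_at=None):
    """Pair each headline with its source and a scrape timestamp (UTC by default)."""
    if scraped_at is None:
        scraped_at = datetime.now(timezone.utc).replace(tzinfo=None)
    return [(source, headline, scraped_at) for headline in headlines]


def insert_headlines(conn, rows):
    """Insert rows using a parameterized query and commit once at the end."""
    cursor = conn.cursor()
    try:
        # The connector substitutes the %s placeholders, escaping each value.
        cursor.executemany(INSERT_SQL, rows)
        conn.commit()
    finally:
        cursor.close()
    return len(rows)
```

Here conn is an open mysql-connector-python connection. Passing values as parameters rather than formatting them into the SQL string avoids quoting problems with headlines that contain apostrophes.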

Resources that we used:

3. Analyze data

Description of our process:

Used the nltk natural language processing package
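
One possible analysis, sketched below, is sentiment scoring of each headline with the VADER analyzer that ships with nltk. Treating VADER as the method is an assumption based on the resource linked in this section; the function names and the 0.05 cutoff (the threshold suggested in the vaderSentiment documentation) are illustrative.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_headlines(headlines):
    """Return (headline, compound score, label) for each headline."""
    # Imported lazily so label_from_compound works without nltk installed.
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
    analyzer = SentimentIntensityAnalyzer()
    results = []
    for headline in headlines:
        compound = analyzer.polarity_scores(headline)["compound"]
        results.append((headline, compound, label_from_compound(compound)))
    return results
```

The compound score is a single normalized value per headline, which makes it convenient to store next to the headline and to average per source.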

Resources that we used:

https://github.com/cjhutto/vaderSentiment#resources-and-dataset-descriptions

4. Automate gathering and storing data into database

Description of our process:

Set up a cron job (we are running on an Ubuntu machine). On Windows, you may need to use Task Scheduler instead.
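
Because cron does not start in the repository directory or activate the virtual environment, a small wrapper that uses absolute paths can help. The sketch below is an assumption about how such a wrapper might look; the run_scrape.py name, the schedule, and the /path/to/... placeholders in the crontab comment are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_command(repo_dir, script="general_scrape.py"):
    """Build an absolute-path command, since cron does not run from the repository."""
    return [sys.executable, str(Path(repo_dir).resolve() / script)]


def run_once(repo_dir):
    """Run the scraper once from inside the repository and return its exit code."""
    repo_path = Path(repo_dir).resolve()
    completed = subprocess.run(build_command(repo_path), cwd=str(repo_path))
    return completed.returncode


# Hypothetical crontab entry (edit with `crontab -e`) that calls a wrapper
# script using run_once() at the top of every hour and appends output to a log:
# 0 * * * * /path/to/venv/bin/python /path/to/News_Data_Analysis/run_scrape.py >> /path/to/scrape.log 2>&1
```

Invoking the virtual environment's python binary directly in the crontab line means the installed requirements are available without an activate step.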

Resources that we used:

cjin2019/News_Data_Analysis
