cs755_process_news_data

Code that processes the data needed for the CS 755 research project into a usable form.

Primary language: Python

Data processing for CS 755 Research Project

Description

Process the data in the "All the News" dataset so it can be used in my CNN.

Procedure

  1. Determine which news sources are available in the dataset.
  2. Determine the bias of those news sources using AllSides.
  3. Extract each article's title and label it (Left/Right) according to its source.
  4. Count the total number of Left vs. Right titles.
  5. Balance the data so the numbers of Left and Right titles are equal.
  6. Vectorize the titles using GloVe word embeddings.
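
The procedure above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the column names `title` and `publication`, the `SOURCE_BIAS` map with its placeholder source names, the helper functions, and the tiny `toy_embeddings` dictionary (standing in for real GloVe vectors) are all assumptions made for the example. Balancing is done here by downsampling the larger class, which is one possible way to make the counts even.

```python
import random
from collections import Counter

# Hypothetical source-to-bias map with placeholder names;
# real ratings would be looked up on AllSides (step 2).
SOURCE_BIAS = {"Source A": "Left", "Source B": "Right", "Source C": "Left"}


def label_titles(rows, source_bias):
    """Steps 1-3: keep (title, label) for rows whose source has a known bias."""
    return [
        (row["title"], source_bias[row["publication"]])
        for row in rows
        if row["publication"] in source_bias
    ]


def balance(labeled, seed=0):
    """Steps 4-5: downsample the larger class so both labels occur equally often."""
    rng = random.Random(seed)
    by_label = {"Left": [], "Right": []}
    for title, label in labeled:
        by_label[label].append((title, label))
    n = min(len(items) for items in by_label.values())
    balanced = [pair for items in by_label.values() for pair in rng.sample(items, n)]
    rng.shuffle(balanced)
    return balanced


def vectorize(title, embeddings, dim, max_len):
    """Step 6: one embedding per token, truncated or zero-padded to max_len rows."""
    zero = [0.0] * dim
    tokens = title.lower().split()[:max_len]
    vectors = [list(embeddings.get(tok, zero)) for tok in tokens]
    padding = [list(zero) for _ in range(max_len - len(vectors))]
    return vectors + padding


# Toy input rows; the real input would be the "All the News" dataset.
rows = [
    {"title": "Budget vote delayed", "publication": "Source A"},
    {"title": "Markets rally", "publication": "Source B"},
    {"title": "Storm hits coast", "publication": "Source C"},
    {"title": "Local team wins", "publication": "Source D"},  # unrated source, dropped
]
# Stand-in for GloVe vectors (2 dimensions instead of the usual 50-300).
toy_embeddings = {"budget": [0.1, 0.2], "markets": [0.3, 0.4]}

labeled = label_titles(rows, SOURCE_BIAS)
balanced = balance(labeled)
counts = Counter(label for _, label in balanced)
matrix = vectorize("Budget vote delayed", toy_embeddings, dim=2, max_len=4)
```

In practice the embedding dictionary would be loaded from a pretrained GloVe text file, in which each line holds a token followed by its vector components. Padding every title to a fixed `max_len` gives the CNN inputs of uniform shape.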