/NYtimes

Repository for NYtimes Project/Research

Primary language: R

NYtimes Corpus Analysis (in progress)

College of Media, University of Illinois at Urbana-Champaign

A study on the use of machine learning and text mining to address media research questions on immigration with respect to economic, political, and job conditions.

Contributors: Jayachandu Bandlamudi, Prof. Mike Yao

Repository for performing named entity recognition, topic modeling, regression, and trend analysis on New York Times news articles.

Background:

Immigration has long been a topic of wide interest. It is often treated as a single topic, but in practice it is associated with many sub-topics, such as jobs, politics, the economy, time period, international interests, law and justice, and travel. This project studies immigration in detail through these sub-topics. For each sub-topic, we identify the influential factors (economic, political, social, and so on) by analyzing the sub-topic with respect to time.

The study has two steps: 1) topic modeling of news articles into sub-topics using text mining techniques, and 2) analysis of each sub-topic. Work is currently focused on the topic modeling step, using LDA and TopMine, along with named entity recognition to extract Person and Location attributes from the news articles.
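
To make the topic modeling step more concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling. It is not the code in this repository (which is written in R); it is a dependency-free Python illustration. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy documents are all assumptions made up for this example.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA on tokenized documents.

    Returns (doc_topic, topic_word, vocab): doc_topic[d][k] is the smoothed
    share of topic k in document d, topic_word[k] counts the words assigned
    to topic k, and vocab is the sorted list of distinct words.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)

    # Count tables that are updated in place while sampling.
    doc_topic_counts = [[0] * num_topics for _ in docs]
    topic_word_counts = [Counter() for _ in range(num_topics)]
    topic_totals = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic_counts[d][k] += 1
            topic_word_counts[k][w] += 1
            topic_totals[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Take the token out of the counts before resampling it.
                k = assignments[d][i]
                doc_topic_counts[d][k] -= 1
                topic_word_counts[k][w] -= 1
                topic_totals[k] -= 1

                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic_counts[d][t] + alpha)
                    * (topic_word_counts[t][w] + beta)
                    / (topic_totals[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Put the token back under its newly sampled topic.
                assignments[d][i] = k
                doc_topic_counts[d][k] += 1
                topic_word_counts[k][w] += 1
                topic_totals[k] += 1

    doc_topic = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic_counts)
    ]
    # Drop words whose count fell back to zero during sampling.
    topic_word = [
        Counter({w: n for w, n in counts.items() if n > 0})
        for counts in topic_word_counts
    ]
    return doc_topic, topic_word, vocab


# Toy, made-up documents (not taken from the NYtimes corpus).
docs = [
    "visa border policy visa law".split(),
    "jobs wages economy jobs labor".split(),
    "border law court policy".split(),
    "economy labor wages market".split(),
]
doc_topic, topic_word, vocab = lda_gibbs(docs, num_topics=2)
for k, counts in enumerate(topic_word):
    print("topic", k, [w for w, _ in counts.most_common(3)])
```

Each row of `doc_topic` can be read as a document's mixture over sub-topics, which is the kind of grouping that step 1 hands to the per-sub-topic analysis in step 2. On a real corpus an established LDA library would be a more practical choice than this sampler.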

The workflow for the project is shown below:

[Figure: project workflow]