Text Classification

Automate text classification in real time: given user input, identify the category the content belongs to.

Project Flow

1. Preparing the dataset

2. Text processing

3. LDA topic building

4. LDA visualization

5. Clustering

6. Prediction
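
The text processing step in the flow above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library. The project itself lists NLTK, so the small `STOPWORDS` set and the `preprocess` function are hypothetical stand-ins, not the project's actual code.

```python
import re

# Hypothetical, deliberately tiny stopword list; a real run would more likely
# use a fuller list such as the one shipped with NLTK.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "on", "the", "to"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [tok for tok in tokens if tok not in STOPWORDS]


if __name__ == "__main__":
    print(preprocess("The minister spoke to the press on Monday."))
```

The resulting token lists are the kind of input that the later bigram, trigram, and LDA steps would consume.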

Roadmap

  • Importing Libraries: the required basic libraries, NLTK, and Beautiful Soup

  • Web Scraping: preparing the required data sheet, for example from Sports and Politics news

  • Data Visualization

--Distribution of document word counts

--Word cloud of the top N words

--Sentence colouring of each of N sentences

  • Plotting

--The number of documents for each topic, assigning each document to the topic that has the most weight in that document.

--The number of documents for each topic, summing the actual weight contribution of each topic to the respective documents.

  • Building of Models

Bigram & Trigram models

LDA Model

  • Prediction of new text

  • Predicting Topic
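
The two document counts described under Plotting, and the final topic prediction, can be illustrated with a small sketch. It assumes each document has already been scored by a trained LDA model and is represented as a list of `(topic_id, weight)` pairs. The function names and the sample weights below are hypothetical and only for illustration.

```python
def dominant_topic(doc_topics):
    """Return the topic id with the largest weight in one document."""
    return max(doc_topics, key=lambda pair: pair[1])[0]


def docs_per_topic_by_dominant(corpus_topics, num_topics):
    """Count documents per topic, assigning each document to its dominant topic."""
    counts = [0] * num_topics
    for doc_topics in corpus_topics:
        counts[dominant_topic(doc_topics)] += 1
    return counts


def docs_per_topic_by_weight(corpus_topics, num_topics):
    """Sum each topic's weight contribution over all documents."""
    totals = [0.0] * num_topics
    for doc_topics in corpus_topics:
        for topic_id, weight in doc_topics:
            totals[topic_id] += weight
    return totals


if __name__ == "__main__":
    # Made-up weights for three documents and two topics (0 and 1).
    corpus_topics = [
        [(0, 0.75), (1, 0.25)],
        [(0, 0.25), (1, 0.75)],
        [(0, 0.5), (1, 0.5)],
    ]
    print(docs_per_topic_by_dominant(corpus_topics, 2))
    print(docs_per_topic_by_weight(corpus_topics, 2))
    print(dominant_topic([(0, 0.25), (1, 0.75)]))
```

Under these assumptions, predicting the topic of new text amounts to processing it the same way as the training documents, scoring it with the model, and taking the dominant topic of the result. Note that `max` keeps the first topic on a tie, which is an arbitrary choice in this sketch.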