MachineHack is an online platform for Machine Learning competitions. We host the toughest business problems, which can now find solutions in Machine Learning and Data Science.

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

Predict-The-News-Category-Hackathon

Since the first printed newspaper, every news item that makes it onto a page has had a specific section allotted to it. Although almost everything about newspapers has changed, from the ink to the type of paper used, this categorization of news has been carried over through generations and into the digital versions of newspapers. Newspaper articles are not limited to a few topics or subjects; they cover a wide range of interests, from politics to sports to movies and so on. For a long time, this sectioning was done manually by people, but technology can now do it with little effort. In this hackathon, Data Science and Machine Learning enthusiasts like you will use Natural Language Processing to predict, from the story, which genre or category a piece of news falls into.

Size of training set: 7,628 records

Size of test set: 2,748 records

Features:

  • STORY: A part of the main content of the article to be published as a piece of news.
  • SECTION: The genre/category the STORY falls in.

Each story falls into one of four distinct sections. The sections are labelled as follows:

  • Politics: 0
  • Technology: 1
  • Entertainment: 2
  • Business: 3
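The label scheme above can be written down as a small lookup table, which is handy for turning numeric predictions back into readable section names. This is a minimal sketch; the names `SECTION_LABELS`, `LABEL_TO_SECTION`, and `decode_sections` are illustrative and do not come from the competition files or the notebooks.

```python
# Section labels exactly as listed above.
SECTION_LABELS = {
    "Politics": 0,
    "Technology": 1,
    "Entertainment": 2,
    "Business": 3,
}

# Reverse lookup: integer label -> section name.
LABEL_TO_SECTION = {label: name for name, label in SECTION_LABELS.items()}


def decode_sections(predictions):
    """Map integer SECTION predictions back to section names (hypothetical helper)."""
    return [LABEL_TO_SECTION[int(label)] for label in predictions]
```

For example, `decode_sections([0, 3, 1])` gives `["Politics", "Business", "Technology"]`.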

Metric

The final score is calculated from the number of correct predictions, using the confusion matrix.
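The exact scoring formula is not spelled out here. One plausible reading, assumed in the sketch below, is plain accuracy: the diagonal of the confusion matrix (correct predictions) divided by the total number of predictions. The function names are illustrative only.

```python
from collections import Counter

NUM_SECTIONS = 4  # Politics, Technology, Entertainment, Business


def confusion_matrix(y_true, y_pred, num_classes=NUM_SECTIONS):
    """Build a matrix where rows are true labels and columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in range(num_classes)] for t in range(num_classes)]


def accuracy_from_matrix(matrix):
    """Assumed metric: correct predictions (the diagonal) over all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Toy labels, not competition data: four of five predictions are correct.
y_true = [0, 1, 2, 3, 3]
y_pred = [0, 1, 2, 3, 0]
matrix = confusion_matrix(y_true, y_pred)
score = accuracy_from_matrix(matrix)  # 0.8
```

Under this reading, a score of 0.99 would mean that roughly 99 out of every 100 test stories were assigned the correct section.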

Leaderboard

Rank: 2

Score: 0.99163027660

Notes

File                                  Score
predict-the-news-category_v5.ipynb    0.99053857
predict-the-news-category_v9.ipynb    0.99017467
predict-the-news-category_v8.ipynb    0.98944687
final-ensemble.ipynb                  0.99163028
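How the ensemble notebook combines the individual models is not described here. As a rough illustration only, one common approach is hard majority voting over the labels predicted by each model. The sketch below assumes each model's predictions are available as a list of integer labels in the same story order; `majority_vote` is a hypothetical helper, not the method used in `final-ensemble.ipynb`.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-model label lists by taking the most common label per story.

    Ties go to the label from the earliest model in the argument list.
    """
    combined = []
    for labels in zip(*model_predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # Pick the first label (in model order) that reaches the top count.
        combined.append(next(label for label in labels if counts[label] == top))
    return combined


# Toy predictions from three hypothetical models, not real notebook outputs.
preds_a = [0, 1, 2, 3]
preds_b = [0, 1, 3, 3]
preds_c = [1, 1, 2, 0]
ensemble = majority_vote(preds_a, preds_b, preds_c)  # [0, 1, 2, 3]
```

Voting of this kind can help when the individual models make different mistakes, which may be one reason an ensemble can edge past its best single member.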