Classification of Text Articles

Project Description
Project Files
Project Usage
Credit

📜 Project Description

This project builds a Natural Language Processing (NLP) model that classifies more than 2000 articles into 5 categories: Sport, Tech, Business, Entertainment and Politics.

Before developing the model, every word was assigned a unique integer. During training, the model learned from this word-to-integer dictionary and related the resulting number sequences to the article category. In this project, the model classifies the articles with 90% accuracy.
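The word-to-integer step described above can be sketched in plain Python. This is only an illustration, not the code in classification_of_articles.py: the function names (build_word_index, texts_to_sequences, pad), the "<OOV>" token, the two sample sentences and the padding length are all assumptions made for this example. The project itself saves a tokenizer.json, so it most likely relies on a library tokenizer instead.

```python
from collections import Counter


def build_word_index(texts, oov_token="<OOV>"):
    """Map every word to a unique integer (hypothetical helper)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Id 0 is kept free for padding; id 1 marks words not seen when building the index.
    word_index = {oov_token: 1}
    for word, _ in counts.most_common():
        word_index[word] = len(word_index) + 1
    return word_index


def texts_to_sequences(texts, word_index, oov_token="<OOV>"):
    """Replace each word with its integer id, falling back to the OOV id."""
    oov_id = word_index[oov_token]
    return [
        [word_index.get(word, oov_id) for word in text.lower().split()]
        for text in texts
    ]


def pad(sequences, maxlen):
    """Truncate or right-pad every sequence with zeros to the same length."""
    return [seq[:maxlen] + [0] * (maxlen - len(seq)) for seq in sequences]


# Made-up sample sentences, not taken from the project's dataset.
articles = [
    "stocks rise as markets rally",
    "team wins the final match",
]
word_index = build_word_index(articles)
sequences = texts_to_sequences(articles + ["markets love the team"], word_index)
padded = pad(sequences, maxlen=6)
print(padded)
```

The padded integer sequences are the kind of fixed-length input a text classifier can be trained on, with each sequence paired with its article category.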

A sneak peek of the developed model and its report is shown below:

🗂️ Project Files

👉 classification_of_articles.py (model development file)

👉 Folder saved_models

model.h5
ohe.pkl
tokenizer.json

👉 Images folder which contains the following images:

confusion matrix
epoch accuracy and epoch loss (from tensorboard)
model accuracy
model architecture
model parameter

👉 Logs folder (used for visualization in TensorBoard)

🚀 Project Usage

This project was developed using Python 3.8 on Google Colab. It uses the following modules:

The dataset can be loaded from here
You may download all the necessary files (dataset and Python files) to run the project on your device.
You can also open the file on Google Colab and run it there.

🧑‍💻 Credit

This dataset is taken from: Link

hafixah5/Article-Types-Classification
