This project is an entry in a Kaggle competition that uses Natural Language Processing (NLP) to classify tweets as disaster-related or not. The dataset consists of 10,000 hand-labeled tweets. The goal is to build a machine learning model that accurately predicts whether a tweet reports a real disaster.
The main challenge is to distinguish tweets that use disaster-related terms metaphorically from those that describe real events. Model performance is measured with the F1 score, which balances precision and recall.
The dataset contains two key columns:
Text: The tweet content.
Target: The classification (1 for disaster-related, 0 for not disaster-related).
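As a rough illustration of working with these two columns, the sketch below reads tweet text and labels from CSV data and counts the labels per class. The lowercase header names `text` and `target`, the helper `load_tweets`, and the three sample rows are assumptions made for this example; the rows are invented and are not taken from the competition data.

```python
import csv
import io
from collections import Counter

# Invented rows for illustration only; not from the competition dataset.
# The header names "text" and "target" are assumed to match the real file.
SAMPLE_CSV = """text,target
"Forest fire spreading near the highway, residents evacuating",1
"This new album is fire",0
"Flooding has closed the main bridge",1
"""


def load_tweets(handle):
    """Read rows from a CSV file object into (text, target) pairs."""
    reader = csv.DictReader(handle)
    return [(row["text"], int(row["target"])) for row in reader]


# io.StringIO stands in for an open file; a real run would pass open(path, newline="").
tweets = load_tweets(io.StringIO(SAMPLE_CSV))

# Count how many tweets fall into each class (1 = disaster, 0 = not).
label_counts = Counter(target for _, target in tweets)
print(label_counts)
```

Checking the class balance early is useful because a skewed label distribution affects how precision and recall trade off.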
Predictions are scored with the F1 score, the harmonic mean of precision and recall, computed by comparing predicted labels against the expected labels.
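For reference, F1 is the harmonic mean of precision and recall, where the disaster class (1) is treated as the positive class. The following is a minimal sketch of that calculation in plain Python; the function name `f1_score` and the example label lists are illustrative, and returning 0.0 when there are no true positives is a convention chosen here rather than something the project specifies.

```python
def f1_score(y_true, y_pred):
    """Compute F1 for binary labels, treating 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # With no true positives, precision and recall are zero or undefined;
    # this sketch returns 0.0 in that case.
    if tp == 0:
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative labels: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)
print(round(score, 3))
```

In this example precision and recall are both 2/3, so the F1 score is also 2/3. In practice a library implementation such as the one in scikit-learn would normally be used instead of a hand-written function.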
Thanks to Figure-Eight for providing the dataset.