Disaster-Tweets-Detection

A bit about me

🚀 Hi there! I'm Bin Feng, a Business Intelligence Engineer with a burning passion for all things Data Science and Machine Learning. I thrive on the thrill of exploring data, extracting insights, and turning them into actionable strategies.

📊 My journey in this field has been incredible, but I'm always hungry for more knowledge and skills. I firmly believe that continuous learning is the key to staying at the forefront of this dynamic industry. That's why I'm constantly seeking opportunities to sharpen my skills and delve into advanced models.

🤝 Collaboration is at the heart of my work ethic. I'm eager to team up with like-minded individuals to create something truly exceptional. Whether it's a groundbreaking project or a fascinating experiment, I'm all ears for fresh ideas and open to any advice or suggestions that can elevate our work.

💡 Let's innovate, explore, and make a positive impact together. Feel free to reach out, and let's embark on this exciting journey of data-driven discovery!

Thanks for connecting! 🌟

Objectives

The main objective of this project is to use NLP techniques to build a deep learning model that identifies tweets about disasters from the given training data. The model uses LSTM (long short-term memory) networks to achieve high accuracy.
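An LSTM consumes fixed-length sequences of integer token ids, so raw tweets need to be tokenized and padded first. The following is a minimal sketch of that step using only the standard library. The function names (`tokenize`, `build_vocab`, `encode`), the cleaning rules, and the `max_len` default are illustrative assumptions, not necessarily what `src/main.py` does.

```python
import re
from collections import Counter

PAD, UNK = 0, 1  # reserved ids: padding and out-of-vocabulary tokens


def tokenize(tweet):
    """Lowercase a tweet, drop URLs, and split it into word tokens."""
    tweet = re.sub(r"https?://\S+", " ", tweet.lower())
    return re.findall(r"[a-z0-9']+", tweet)


def build_vocab(tweets, min_count=1):
    """Map each token seen at least min_count times to an integer id."""
    counts = Counter(tok for t in tweets for tok in tokenize(t))
    vocab = {}
    for tok, n in counts.most_common():
        if n >= min_count:
            vocab[tok] = len(vocab) + 2  # ids 0 and 1 are reserved
    return vocab


def encode(tweet, vocab, max_len=8):
    """Convert a tweet to a fixed-length id list, truncating or right-padding."""
    ids = [vocab.get(tok, UNK) for tok in tokenize(tweet)][:max_len]
    return ids + [PAD] * (max_len - len(ids))


if __name__ == "__main__":
    sample = ["Forest fire near the town", "I love this song http://t.co/x"]
    vocab = build_vocab(sample)
    print(encode("Fire near my town", vocab))
```

Reserving id 0 for padding keeps padded positions distinguishable from real tokens, which is what lets an embedding layer mask them out.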

Data

  • train.csv - the training set
  • test.csv - the test set
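As a rough sketch of how the two files might be read, the helper below pulls the tweet text and, when present, the label out of an open CSV file. The column names `text` and `target` are assumptions about the file layout (this README does not list the columns), and `read_tweets` is a hypothetical helper name.

```python
import csv
import io


def read_tweets(csv_file, text_col="text", label_col="target"):
    """Read tweets and optional labels from an open CSV file.

    Returns (texts, labels); labels is None when label_col is absent,
    as would be expected for a test set.
    """
    reader = csv.DictReader(csv_file)
    has_labels = label_col in (reader.fieldnames or [])
    texts, labels = [], []
    for row in reader:
        texts.append(row[text_col])
        if has_labels:
            labels.append(int(row[label_col]))
    return texts, (labels if has_labels else None)


if __name__ == "__main__":
    # Tiny in-memory stand-in for train.csv; the column names are assumed.
    demo = io.StringIO('id,text,target\n1,"Flood warning, stay safe",1\n2,Nice day,0\n')
    print(read_tweets(demo))
```

For the real files, open them with `open(path, newline="", encoding="utf-8")` and pass the file object to `read_tweets`.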

Acknowledgments

This dataset was created by the company figure-eight and originally shared on their ‘Data For Everyone’ website here.

Tweet source: (https://twitter.com/AnyOtherAnnaK/status/629195955506708480)

This data has been released under the Open Data Commons Public Domain Dedication and License.

Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation. URL: https://nlp.stanford.edu/pubs/glove.pdf

Get Started

Prerequisites

  1. Install all needed libraries using pip install -r requirements.txt
  2. Download GloVe: Global Vectors for Word Representation
  3. If you want to use a pretrained model, you can download it from My Drive Folder
  4. Make sure you structure the project folder as follows:
.
├── Data
│   ├── glove-global-vectors-for-word-representation
│   │   └── glove.6B.200d.txt
│   └── nlp-getting-started
│       ├── test.csv
│       └── train.csv
├── my_model
│   ├── assets
│   ├── fingerprint.pb
│   ├── keras_metadata.pb
│   ├── saved_model.pb
│   └── variables
│       ├── variables.data-00000-of-00001
│       └── variables.index
├── my_model.zip
├── requirements.txt
└── src
    └── main.py
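The GloVe file in this layout is plain text, with one word per line followed by its space-separated vector components. The sketch below shows one way such a file could be parsed and turned into an embedding matrix. `load_glove`, `embedding_matrix`, and the word-to-id `vocab` dictionary are hypothetical names used for illustration; the actual loading code in `src/main.py` may differ.

```python
def load_glove(lines, dim=200, vocab=None):
    """Parse GloVe text lines into {word: [float, ...]}.

    Lines whose vector length differs from dim are skipped. If vocab is
    given, only words in it are kept, which saves memory.
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dim:
            continue
        if vocab is not None and word not in vocab:
            continue
        vectors[word] = [float(v) for v in values]
    return vectors


def embedding_matrix(vocab, vectors, dim=200):
    """Build one row per token id; ids without a GloVe vector stay all-zero."""
    matrix = [[0.0] * dim for _ in range(max(vocab.values(), default=0) + 1)]
    for word, idx in vocab.items():
        if word in vectors:
            matrix[idx] = vectors[word]
    return matrix
```

With the folder structure above, the file could be opened with `open("Data/glove-global-vectors-for-word-representation/glove.6B.200d.txt", encoding="utf-8")` and passed directly to `load_glove`, since a file object iterates over its lines.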

Use Case

  1. In the terminal, run python src/main.py
  2. Follow the instructions in the terminal and provide the requested inputs
  3. Please note that retraining the model might take a while. If you just want to try it out, you can use the pretrained model instead.
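Before running the script, it can help to confirm that the expected files are in place. The following is a small sketch of such a check; the paths come from the folder structure in this README, while `missing_files` and the choice of which model files to require are assumptions made for illustration.

```python
from pathlib import Path

# Paths taken from the folder structure listed in this README.
REQUIRED = [
    "Data/glove-global-vectors-for-word-representation/glove.6B.200d.txt",
    "Data/nlp-getting-started/train.csv",
    "Data/nlp-getting-started/test.csv",
    "src/main.py",
]
# Only needed when using the pretrained model (assumed subset of my_model).
PRETRAINED = ["my_model/saved_model.pb", "my_model/variables/variables.index"]


def missing_files(root=".", need_pretrained=False):
    """Return the expected paths that do not exist under root."""
    wanted = REQUIRED + (PRETRAINED if need_pretrained else [])
    return [p for p in wanted if not (Path(root) / p).exists()]


if __name__ == "__main__":
    for path in missing_files(need_pretrained=True):
        print("missing:", path)
```

Run it from the project root; an empty result means every listed file was found.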

Model Performance

[Two images: model performance]

Additional Info