Hi there! I'm Bin Feng, a Business Intelligence Engineer with a burning passion for all things Data Science and Machine Learning. I thrive on the thrill of exploring data, extracting insights, and turning them into actionable strategies.
My journey in this field has been incredible, but I'm always hungry for more knowledge and skills. I firmly believe that continuous learning is the key to staying at the forefront of this dynamic industry. That's why I'm constantly seeking opportunities to sharpen my skills and delve into advanced models.
Collaboration is at the heart of my work ethic. I'm eager to team up with like-minded individuals to create something truly exceptional. Whether it's a groundbreaking project or a fascinating experiment, I'm all ears for fresh ideas and open to any advice or suggestions that can elevate our work.
Let's innovate, explore, and make a positive impact together. Feel free to reach out, and let's embark on this exciting journey of data-driven discovery!
Thanks for connecting!
The main objective of this project is to use NLP techniques to build a deep learning model that identifies tweets about disasters from the given training data. In this project, I used a deep learning model with LSTM (long short-term memory) networks to achieve high accuracy.
- train.csv - the training set
- test.csv - the test set
This dataset was created by the company figure-eight and originally shared on their "Data For Everyone" website.
Tweet source: (https://twitter.com/AnyOtherAnnaK/status/629195955506708480)
This data has been released under the Open Data Commons Public Domain Dedication and License.
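As a minimal sketch of how the training file could be read, the helper below parses CSV rows into tweet texts and integer labels using only the standard library. The function name `read_labeled_tweets` is hypothetical, and the column names `text` and `target` are assumptions about the layout of train.csv that this README does not state; adjust them if your copy differs.

```python
import csv
from typing import Iterable, List, Tuple


def read_labeled_tweets(
    lines: Iterable[str],
    text_column: str = "text",      # assumed column name, not confirmed by this README
    label_column: str = "target",   # assumed column name, not confirmed by this README
) -> Tuple[List[str], List[int]]:
    """Parse CSV rows into parallel lists of tweet texts and integer labels."""
    reader = csv.DictReader(lines)  # the first row is treated as the header
    texts: List[str] = []
    labels: List[int] = []
    for row in reader:
        texts.append(row[text_column])
        labels.append(int(row[label_column]))
    return texts, labels
```

To use it on the real file, open `Data/nlp-getting-started/train.csv` with `newline=""` and `encoding="utf-8"` and pass the file object as `lines`, so that quoted fields containing line breaks are handled by the `csv` module.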
Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation. URL: https://nlp.stanford.edu/pubs/glove.pdf
- Install all needed libraries using
pip install -r requirements.txt
- Download GloVe: Global Vectors for Word Representation
- If you want to use a pretrained model, you can download it from My Drive Folder
- Make sure you structure the project folder as follows:
.
├── Data
│   ├── glove-global-vectors-for-word-representation
│   │   └── glove.6B.200d.txt
│   └── nlp-getting-started
│       ├── test.csv
│       └── train.csv
├── my_model
│   ├── assets
│   ├── fingerprint.pb
│   ├── keras_metadata.pb
│   ├── saved_model.pb
│   └── variables
│       ├── variables.data-00000-of-00001
│       └── variables.index
├── my_model.zip
├── requirements.txt
└── src
    └── main.py
- In the terminal, run
python src/main.py
- Follow the instructions in the terminal and provide the requested inputs
- Please note that retraining the model might take a while. If you just want to try it out, you can use the pretrained model instead.
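For readers curious about what the GloVe file in the Data folder contains, here is a rough sketch of how it could be parsed into a word-to-vector lookup. It assumes the usual GloVe text layout (one word per line followed by space-separated numbers); the function name `parse_glove` is hypothetical and this is not necessarily how src/main.py loads the embeddings.

```python
from typing import Dict, Iterable, List


def parse_glove(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Map each word to its embedding vector.

    Assumes each line holds a word followed by space-separated floats.
    """
    embeddings: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank lines and lines with no vector values
        word, values = parts[0], parts[1:]
        embeddings[word] = [float(v) for v in values]
    return embeddings
```

With the folder layout above, the file object returned by opening `Data/glove-global-vectors-for-word-representation/glove.6B.200d.txt` with `encoding="utf-8"` can be passed directly as `lines`. Words from the tweets that are missing from the resulting dictionary would need a fallback, such as a zero vector.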