
Disaster Response Pipeline Project

Table of Contents

  1. Project Motivation
  2. Requirements
  3. Installation Instructions
  4. File Descriptions
  5. Results
  6. Licensing, Authors, and Acknowledgements

Project Motivation

This project is part of the Data Science Nanodegree Program by Udacity in collaboration with Figure Eight. The dataset contains pre-labelled tweets and messages from real-life disaster events. The aim is to design a model that categorizes messages into 36 pre-defined categories so they can be routed to the appropriate disaster relief agency.

Requirements

The code should run with no issues using Python 3 with the following libraries:

  • Machine Learning: NumPy, SciPy, Pandas, scikit-learn
  • Natural Language Processing: NLTK
  • SQLite Database: SQLAlchemy
  • Model Loading and Saving: Pickle
  • Web App and Data Visualization: Flask, Plotly
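For reference, the dependencies above can be captured in a requirements.txt (package names are the standard PyPI ones; pinned versions are omitted since the project does not specify any):

```text
numpy
scipy
pandas
scikit-learn
nltk
sqlalchemy
flask
plotly
```

Installing from this file (`pip install -r requirements.txt`) should pull in everything the pipelines and web app need.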

Installation Instructions

  1. Run the following commands in the project's root directory to set up the database and model.

    • To run the ETL pipeline that cleans the data and stores it in a database:
      `python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db`
    • To run the ML pipeline that trains the classifier and saves it:
      `python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl`
  2. Run the following command in the app's directory to start the web app: `python run.py`

  3. Go to http://0.0.0.0:3001/ or http://localhost:3001/

File Descriptions

  • Data
    • disaster_categories.csv + disaster_messages.csv - Datasets with all the necessary information
    • process_data.py - Code that reads and cleans the CSV files and stores the result in a SQL database.
    • db_disaster_messages.db - Database created from the transformed and cleaned data in the disaster CSV files.
  • Models
    • train_classifier.py - Code that loads the data and runs the machine learning model; this creates a pickle file at the end (classifier.pkl)
  • App
    • run.py - Flask app and the user interface used to predict results and display them.
    • templates - Folder containing the HTML template files
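The ETL step handled by process_data.py can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function name, the `messages` table name, and the `id`/`categories` column names are assumptions based on the standard layout of this dataset.

```python
import pandas as pd
from sqlalchemy import create_engine


def clean_and_save(messages_csv, categories_csv, db_path):
    """Merge the raw CSVs, expand categories into binary columns, store in SQLite."""
    # Merge the two raw CSVs on their shared id column
    messages = pd.read_csv(messages_csv)
    categories = pd.read_csv(categories_csv)
    df = messages.merge(categories, on="id")

    # Split the single 'categories' string (e.g. "related-1;water-0;...")
    # into one binary column per category
    cats = df["categories"].str.split(";", expand=True)
    cats.columns = [value.split("-")[0] for value in cats.iloc[0]]
    for col in cats.columns:
        cats[col] = cats[col].str[-1].astype(int)

    df = pd.concat([df.drop(columns=["categories"]), cats], axis=1)
    df = df.drop_duplicates()

    # Persist to SQLite so the ML pipeline can load it later
    engine = create_engine(f"sqlite:///{db_path}")
    df.to_sql("messages", engine, index=False, if_exists="replace")
    return df
```

The one non-obvious step is the category expansion: each raw row carries all 36 labels in a single semicolon-delimited string, so it must be split before training.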

Results

This is the expected front page of the website.

After entering a sentence, the app displays the predicted categories.

Other options for the pipeline are explored in `ML Pipeline Preparation.ipynb`. Feel free to change the `build_model()` function in the `train_classifier.py` file (models folder).
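As a starting point for experimenting, a minimal `build_model()` might look like the sketch below. This is an illustration only, assuming the common TF-IDF plus multi-output classifier setup for this dataset; the actual function in train_classifier.py may use different features, estimators, or a grid search.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline


def build_model():
    """TF-IDF text features, then an independent random forest per category."""
    return Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("clf", MultiOutputClassifier(RandomForestClassifier(n_estimators=50))),
    ])
```

Swapping the `RandomForestClassifier` for another estimator (or wrapping the pipeline in `GridSearchCV`) is the easiest way to try the alternatives explored in the notebook.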

Licensing, Authors, and Acknowledgements

Credit goes to Figure Eight for the data. Thanks also to the Stack Overflow community and Udacity for the training! Otherwise, feel free to use the code here as you would like!