Run the following commands in the project's root directory to set up your database and model.

- To run the ETL pipeline, which cleans the data and stores it in a database (sketched below):
  python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
- To run the ML pipeline, which trains the classifier and saves it (also sketched below):
  python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl
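For orientation, here is a rough sketch of what the ETL step does, assuming the usual pandas/SQLAlchemy approach; the merge key, the category string format, and the DisasterResponse table name are assumptions, so the real process_data.py may differ in detail.

```python
# Hedged sketch of data/process_data.py (column names and table are assumptions).
import sys

import pandas as pd
from sqlalchemy import create_engine


def main(messages_filepath, categories_filepath, database_filepath):
    # Load the two CSVs and merge them on their shared id column
    messages = pd.read_csv(messages_filepath)
    categories = pd.read_csv(categories_filepath)
    df = messages.merge(categories, on="id")

    # Split the semicolon-delimited categories column ("related-1;offer-0;...")
    # into one 0/1 column per category
    cats = df["categories"].str.split(";", expand=True)
    cats.columns = [value.split("-")[0] for value in cats.iloc[0]]
    for col in cats.columns:
        cats[col] = cats[col].str[-1].astype(int)
    df = pd.concat([df.drop(columns="categories"), cats], axis=1)

    # Drop duplicates and write the clean table to SQLite
    df = df.drop_duplicates()
    engine = create_engine(f"sqlite:///{database_filepath}")
    df.to_sql("DisasterResponse", engine, index=False, if_exists="replace")


if __name__ == "__main__":
    main(*sys.argv[1:4])
```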
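Likewise, a minimal sketch of the ML step, assuming a standard scikit-learn TF-IDF plus multi-output classifier pipeline; the table name, the non-label columns, and the choice of estimator are assumptions rather than a transcription of train_classifier.py.

```python
# Hedged sketch of models/train_classifier.py (pipeline details are assumptions).
import pickle
import sys

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline
from sqlalchemy import create_engine


def main(database_filepath, model_filepath):
    # Load the clean table produced by the ETL pipeline
    engine = create_engine(f"sqlite:///{database_filepath}")
    df = pd.read_sql_table("DisasterResponse", engine)
    X = df["message"]
    Y = df.drop(columns=["id", "message", "original", "genre"])

    # TF-IDF features feeding one classifier per category label
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer(stop_words="english")),
        ("clf", MultiOutputClassifier(RandomForestClassifier())),
    ])

    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2)
    pipeline.fit(X_train, Y_train)
    print(f"Exact-match test accuracy: {pipeline.score(X_test, Y_test):.3f}")

    # Persist the trained model for the web app
    with open(model_filepath, "wb") as f:
        pickle.dump(pipeline, f)


if __name__ == "__main__":
    main(*sys.argv[1:3])
```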
Make sure you have the required libraries installed:

pip install -r requirements.txt
Run the following command in the app's directory to run your web app.
python run.py
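For context, here is a minimal sketch of the shape of run.py, assuming the database and model paths produced above; the real file also builds Plotly visualizations for master.html, and the route and template details here are illustrative, not a copy of the actual code.

```python
# Hedged sketch of app/run.py (paths, table name, and routes are assumptions).
import pickle

import pandas as pd
from flask import Flask, render_template, request
from sqlalchemy import create_engine

app = Flask(__name__)

# Load the clean data and the trained model once at startup
engine = create_engine("sqlite:///../data/DisasterResponse.db")
df = pd.read_sql_table("DisasterResponse", engine)
with open("../models/classifier.pkl", "rb") as f:
    model = pickle.load(f)


@app.route("/")
@app.route("/index")
def index():
    # Main page with the dataset overview
    return render_template("master.html")


@app.route("/go")
def go():
    # Classify the message the user typed into the search box
    query = request.args.get("query", "")
    labels = model.predict([query])[0]
    # Assumes the first four columns are id/message/original/genre
    results = dict(zip(df.columns[4:], labels))
    return render_template("go.html", query=query, classification_result=results)


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=3001, debug=True)
```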
Go to http://0.0.0.0:3001/
I am currently making my way through an intensive data science certification course (Udacity). This repository holds one of the main projects in that course: writing an ETL pipeline and an ML pipeline, then using their products to power a data dashboard served with Flask.
The data are real messages from various mediums, collected and tagged by Figure Eight. They are a great corpus for learning to categorize messages by the topics they concern, which can help organizations route aid and services more efficiently in a disaster scenario.
The repository is laid out as follows:
app
- templates
  - master.html # main page of web app
  - go.html # classification result page of web app
- run.py # Flask file that runs the app

data
- disaster_categories.csv # data to process (not included in this repo due to size constraints)
- disaster_messages.csv # data to process (not included in this repo due to size constraints)
- process_data.py # ETL pipeline script
- DisasterResponse.db # database the clean data is saved to

models
- train_classifier.py # ML pipeline script
- classifier.pkl # saved model (not included in this repo due to size constraints)

README.md
I've included the MIT license here to be as permissive as makes sense. It is my understanding that this licensing is in accordance with Udacity's terms (they provided the starter code and instruction).