This is a web application designed to show the project structure for a machine learning model deployed using Flask. The project features a machine learning model trained to detect whether an online comment is a Cyber-Troll or Non Cyber-Troll. The application acts as an interface through which a user can submit new queries. The machine learning model was built using several features of scikit-learn:
- Support Vector Machine (SVM)
- Bag-of-Words text representation (BoW)
- Grid Search + Cross Validation
Each of these components is developed offline within the project, inside /model_dev. The SVM and BoW models are still needed in a production or testing setting to predict user-submitted queries, so they can be serialized with Python's pickle module and stored in the /model_assets folder.
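As a rough illustration of how the three scikit-learn pieces listed above fit together, here is a minimal sketch. The toy comments and labels are invented for this example and are not taken from the project's dataset; the linear kernel, the grid over C, and the 1/0 label encoding are also assumptions, not necessarily what the notebook uses.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy data invented for illustration (assumed encoding: 1 = Cyber-Troll, 0 = Non Cyber-Troll).
comments = [
    "you are an idiot", "nobody likes you loser",
    "shut up you fool", "you are so stupid",
    "thanks for the help", "have a great day",
    "nice work on this", "see you at lunch",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Bag-of-Words: learn a vocabulary and turn each comment into token counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(comments)

# Grid search with 2-fold cross validation over the SVM regularization strength C.
search = GridSearchCV(SVC(kernel="linear"), param_grid={"C": [0.1, 1, 10]}, cv=2)
search.fit(X, labels)
model = search.best_estimator_  # SVM refit on all data with the best C

# New text must go through the same fitted vectorizer before prediction.
prediction = model.predict(vectorizer.transform(["you are a fool"]))
```

Both the fitted vectorizer and the fitted model are needed at prediction time, which is why the project persists the two objects separately.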
To detect whether an online comment is from a cyber troll, you can deploy this application locally and submit queries to the machine learning model to receive predictions through a simple user interface. The model was trained using the Dataset for Detection of Cyber-Trolls (see here). This project emphasizes the development process of creating deploy-friendly machine learning projects rather than the creation of the predictive model itself.
The model development notebook is located here.
You can also find a blog post that accompanies this repo here.
Note that this project is still in progress.
First clone the repo locally.
git clone https://github.com/wgopar/ml-flask-web-app.git
Create a new virtual environment in the project directory.
python3 -m venv ./venv
Activate the virtual environment.
source venv/bin/activate
While in the virtual environment, install the required dependencies from requirements.txt.
pip install -r ./requirements.txt
Now we can deploy the web application via
python app.py
and navigate to http://127.0.0.1:5000/ to see it live. On this page, a user can submit text into the text field and receive a prediction from the trained model indicating whether the text most likely came from a Cyber-Troll or a Non Cyber-Troll.
The application may then be terminated with the following commands.
$ ^C # exit flask application (ctrl-c)
$ deactivate # exit virtual environment
ml-flask-web-app
├── model_assets
│ ├── model.pkl
│ └── vectorizer.pkl
├── model_dev
│ ├── data
│ │ └── data.json
│ └── model_dev.ipynb
├── templates
│ └── index.html
├── app.py
├── utils.py
├── requirements.txt
└── README.md
/model_assets is used to store persisted states of the predictive model and the learned feature extractors from scikit-learn.
/model_dev is the model development playground, where an .ipynb notebook is used to develop the model and save new versions of persisted states.
Storing new persisted states of the model can be done within the Jupyter notebook. As an example, within model_dev.ipynb I can create a new model or retrain an existing one and add it to the ./model_assets folder when I am satisfied. A simple example:
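import utils
from sklearn.linear_model import LogisticRegression

# X_train and y_train are the vectorized comments and labels prepared earlier in the notebook
clf = LogisticRegression()
clf.fit(X_train, y_train)
utils.persist_model(clf, description='clf_v.0.0')  # creates clf_v.0.0.pkl in /model_assets folder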
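The body of utils.persist_model is not shown here, so the following is only a guess at what such a helper might look like, built on Python's pickle module. The assets_dir parameter and the returned path are assumptions added for illustration; the real function in utils.py may differ.

```python
import pickle
from pathlib import Path


def persist_model(obj, description, assets_dir="model_assets"):
    """Pickle obj to <assets_dir>/<description>.pkl and return the file path.

    Hypothetical sketch: assets_dir and the return value are assumptions,
    not the confirmed signature of the project's utils.persist_model.
    """
    target_dir = Path(assets_dir)
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    path = target_dir / f"{description}.pkl"
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    return path
```

Encoding the version in the description (for example clf_v.0.0) keeps older pickles in place, so an earlier model can still be loaded if a retrained one turns out worse.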
The model version used at run time is selected within the POST request function in app.py.
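To make the idea of choosing a model version at request time more concrete, here is a minimal sketch of a Flask POST handler built around a loaded vectorizer and model. It is not the project's actual app.py: the names load_asset and create_app, the "text" form field, the plain-text response (the real app renders index.html), and the 1/0 label encoding are all assumptions made for illustration.

```python
import pickle
from pathlib import Path

from flask import Flask, request

ASSETS_DIR = Path("model_assets")  # folder name taken from the project tree


def load_asset(filename):
    """Unpickle one persisted object, e.g. 'model.pkl' or 'clf_v.0.0.pkl'."""
    with open(ASSETS_DIR / filename, "rb") as f:
        return pickle.load(f)


def create_app(vectorizer, model):
    """Build a Flask app around an already-loaded vectorizer and model."""
    app = Flask(__name__)

    @app.route("/", methods=["POST"])
    def predict():
        # "text" is an assumed form field name.
        text = request.form.get("text", "")
        # Reuse the fitted BoW vectorizer so features match training.
        features = vectorizer.transform([text])
        label = model.predict(features)[0]
        # Assumed label encoding: 1 = Cyber-Troll, anything else = Non Cyber-Troll.
        return "Cyber-Troll" if label == 1 else "Non Cyber-Troll"

    return app
```

Under these assumptions, serving a particular version comes down to which files are loaded, for example create_app(load_asset("vectorizer.pkl"), load_asset("model.pkl")).run(). Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.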
/templates holds the HTML templates for the application.