This is an example project showing how to handle text in a machine learning classification algorithm. The dataset used is freely available at the repository site.
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
This project was developed using the open source Anaconda distribution with Python 3.6.6. It is highly recommended to create a Python virtual environment on your machine and install the necessary packages listed in the requirements.txt file.
(Optional: Anaconda is not required, and a plain Python virtual environment also works.) First, install Anaconda on your machine. Then run the following command to create a new virtual environment with Python 3.6.
conda create -n yourEnvironmentName python=3.6 anaconda
Activate the new environment and run the command below to install the necessary packages to run the system.
pip install -r requirements.txt
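After installing, it can help to confirm that the main libraries are visible from the active environment. The following is a minimal sketch, not part of the project: the module names in `ASSUMED_MODULES` are guessed from the libraries this README lists, and `missing_modules` is a hypothetical helper. The authoritative list remains requirements.txt.

```python
import importlib.util

# Import names assumed from the libraries this README lists (Keras,
# Scikit-learn, TensorFlow); requirements.txt is authoritative and may differ.
ASSUMED_MODULES = ["keras", "sklearn", "tensorflow"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All assumed modules were found.")
```

If anything is reported missing, check that the right environment is activated before rerunning the pip command.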
Now run the command
python training.py
and the system should start training the fully connected neural network (NN). The training phase uses the /dataset/Youtube01-Psy.csv file. Then run
python loading.py
and the system will give you a console interface where you can enter sentences to evaluate, printing the prediction for each one.
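To illustrate the general idea of feeding text to a neural network, here is a minimal bag-of-words sketch using only the standard library. It is not the code in training.py, whose preprocessing may differ. The sample rows are made up, the column names `CONTENT` and `CLASS` are an assumption about the CSV layout, and all function names are hypothetical.

```python
import csv
import io
import re

# Made-up rows standing in for the dataset file; the CONTENT and CLASS
# column names are an assumption about the CSV layout.
SAMPLE_CSV = """CONTENT,CLASS
Check out my channel,1
I love this song,0
Subscribe to my channel please,1
"""


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def load_rows(handle):
    """Read (text, label) pairs from an open CSV file-like object."""
    reader = csv.DictReader(handle)
    return [(row["CONTENT"], int(row["CLASS"])) for row in reader]


def build_vocabulary(texts):
    """Assign each distinct token an index, in order of first appearance."""
    vocab = {}
    for text in texts:
        for token in tokenize(text):
            vocab.setdefault(token, len(vocab))
    return vocab


def vectorize(text, vocab):
    """Turn text into a count vector; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocab)
    for token in tokenize(text):
        index = vocab.get(token)
        if index is not None:
            vector[index] += 1
    return vector


rows = load_rows(io.StringIO(SAMPLE_CSV))
texts = [text for text, _ in rows]
labels = [label for _, label in rows]
vocab = build_vocabulary(texts)
features = [vectorize(text, vocab) for text in texts]
```

Each sentence becomes a fixed-length list of word counts, which is the kind of numeric input a fully connected network can consume. A new sentence typed at the console would be vectorized with the same vocabulary before prediction.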
- Anaconda - The open source distribution used.
- Python - Language used.
- Keras - Used to create the NN.
- Scikit-learn - Used to process the data and to analyse results.
- TensorFlow - Used by Keras.
Video tutorials are available in the playlist.
- Joel Carneiro - Initial work - GitHub
See also the list of contributors who participated in this project.