This repo contains scripts for detecting the flairs of Reddit posts, along with a hosted Flask server for real-time predictions.
The task is divided into five parts:
- Part - I : Collecting and Building Reddit data
- Part - II : Exploratory Data Analysis for Collected data
- Part - III : Processing and Modelling using various Algorithms
- Part - IV : Building WEB-APP
- Part - V : Deploying WEB-APP on Heroku
- I first collected and built the Reddit India data by scraping it using three methods:
- Then I built my intuition about the collected data by analysing the last four months of data (Jan, Feb, Mar, Apr 2020) and drawing out various points from it using charts.
- Then I analysed the comments of various threads to form overall intuitions.
- Some sample charts and graphs are shown below:
- January data:
- February data:
- March data:
- April data:
- Most frequently occurring words:
- Comment sentiments:
- Comment polarity:
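The word-frequency chart listed above could be built from counts like the ones below. This is a minimal sketch using only the standard library; the sample titles, the stop-word list, and the function name `top_words` are illustrative assumptions and are not taken from the notebooks.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real analysis would likely use a fuller one.
STOP_WORDS = {"the", "a", "an", "in", "of", "to", "is", "and", "for", "on"}


def top_words(texts, n=3):
    """Return the n most common non-stop-words across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and keep alphabetic tokens only.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


# Made-up post titles, for illustration only.
sample_titles = [
    "Lockdown extended in the state",
    "Lockdown rules for the city",
    "New rules for online classes",
]
print(top_words(sample_titles, n=2))
```

The resulting `(word, count)` pairs can be passed straight to any plotting library to draw a bar chart.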
- After that, I processed the text data using the following:
- Applied CountVectorizer
- Applied TfidfVectorizer
- Constructed a Pipeline in scikit-learn for training and testing algorithms.
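The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the titles and flairs are made up, `LogisticRegression` is an assumed classifier (the algorithms actually compared live in the modelling notebook), and `TfidfVectorizer` is used on its own because it is equivalent to `CountVectorizer` followed by `TfidfTransformer`.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Made-up titles and flairs, for illustration only.
texts = [
    "Election results announced today", "Parliament passes new bill",
    "Minister addresses the nation", "Opposition questions the budget",
    "Cricket team wins the series", "Football league starts next week",
    "Olympic medal for the wrestler", "Hockey squad announced for the cup",
]
labels = ["Politics"] * 4 + ["Sports"] * 4

# Chain vectorizer and classifier so both are fit on the training folds only.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),  # assumed classifier
])

# 2-fold cross-validation; returns one accuracy score per fold.
scores = cross_val_score(pipeline, texts, labels, cv=2)
print(scores.mean())

# Fit on all data and predict the flair of an unseen title.
pipeline.fit(texts, labels)
print(pipeline.predict(["Budget bill debated in parliament"]))
```

Keeping the vectorizer inside the pipeline means it is refit on each training fold, so vocabulary from held-out samples does not leak into the scores.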
- After that, I analysed the cross-validated scores on the test samples:
- Built the web app and created an API endpoint that can be tested with a POST request.
Example:
import requests
files = {'upload_file': open('file.txt', 'rb')}
r = requests.post('http://flair-reddit-predict.herokuapp.com/automated_testing', files=files)
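For orientation, the server side of such an endpoint might look like the sketch below. It is not the code from this repo: `predict_flair` is a hypothetical stand-in for the trained model, and the one-entry-per-line file format and the JSON response shape are assumptions. Only the route name and the `upload_file` field name come from the client example above.

```python
import io

from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_flair(line):
    """Hypothetical stand-in for the trained model's prediction."""
    return "Unknown"


@app.route("/automated_testing", methods=["POST"])
def automated_testing():
    # 'upload_file' matches the form field name used by the client example.
    uploaded = request.files.get("upload_file")
    if uploaded is None:
        return jsonify({"error": "missing upload_file"}), 400
    # Assumption: the file is UTF-8 text with one entry per line.
    lines = uploaded.read().decode("utf-8").splitlines()
    return jsonify({line: predict_flair(line) for line in lines if line.strip()})


# Exercise the route in-process with Flask's test client (no server needed).
client = app.test_client()
response = client.post(
    "/automated_testing",
    data={"upload_file": (io.BytesIO(b"first entry\nsecond entry\n"), "file.txt")},
    content_type="multipart/form-data",
)
print(response.get_json())
```

Using the test client keeps the example self-contained; against a running server the same request would be sent with `requests.post` as shown earlier.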
- Now, deploying the web app:
- Open the following URL, hosted on pythonanywhere.com using a Flask server:
- Open a command prompt (terminal).
- Download the project files by running the following command in the directory from which you want to run the script:
git clone https://github.com/souravs17031999/reddit-flair-detection
- Move to the main project directory, where the project was downloaded.
- Move to the reddit_app directory.
- Now run the following:
(for Windows)
set FLASK_APP=app.py
python -m flask run
(for other terminals)
$ export FLASK_APP=object.py
$ flask run
Other troubleshooting issues related to the Flask server
- The local server should now start. Open the local URL and port shown in the terminal.
(Most likely this will be http://127.0.0.1:5000/, although the port may differ.)
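To confirm from a script that the local server is answering, a small check like the one below can help. It is a sketch using only the standard library; the function name `server_is_up` and the 2-second timeout are illustrative choices, not part of this project.

```python
import urllib.error
import urllib.request


def server_is_up(url, timeout=2.0):
    """Return True if any HTTP response (whatever its status) comes back from url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status (e.g. 404).
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or unreachable host.
        return False
```

For example, `server_is_up("http://127.0.0.1:5000/")` should return `True` once `flask run` has started, assuming the default port is in use.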
- File containing Data collection and building - Notebook1
- File containing EDA - Notebook2
- File containing Modelling - Notebook3
- Scripts for Flask APP - WebApp
- Training Data - Data
- Other data for analysing - OtherData
- Pre-trained Model - Model
- Documented experiment log, containing errors and their solutions - logs
⭐️ this project if you liked it!