- This use case lets Quora avoid having the same or repetitive questions answered over and over again, thereby improving the user experience.
- The application is a binary classifier: two questions are entered in a form, and the application predicts whether they are duplicates of each other.
- Dataset link: https://www.kaggle.com/c/quora-question-pairs/data
- Train data: 404,290 Quora question pairs.
- Trained with a random model, Logistic Regression, Linear SVM, and XGBoost.
- XGBoost achieved a train log loss of 0.34470 and a validation log loss of 0.35853.
- Evaluation metrics: log loss and confusion matrix.
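
The two evaluation metrics listed above can be illustrated with a minimal, standard-library-only sketch. It is not the project's actual code: the labels and probabilities are made-up toy values, the helper names `log_loss` and `confusion_matrix` are hypothetical, and it assumes the random model serves as a baseline that outputs uniformly random probabilities.

```python
import math
import random


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def confusion_matrix(y_true, y_prob, threshold=0.5):
    """Return [[TN, FP], [FN, TP]] after thresholding the probabilities."""
    matrix = [[0, 0], [0, 0]]
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        matrix[y][pred] += 1  # row = true label, column = predicted label
    return matrix


# Toy labels (assumption): 1 = duplicate pair, 0 = not duplicate.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
# Toy probabilities (assumption) standing in for a trained model's output.
model_prob = [0.9, 0.2, 0.4, 0.7, 0.1, 0.3, 0.6, 0.2]

# Random baseline (assumption): a uniformly random probability per pair.
rng = random.Random(42)
random_prob = [rng.random() for _ in y_true]

model_loss = log_loss(y_true, model_prob)
random_loss = log_loss(y_true, random_prob)
model_cm = confusion_matrix(y_true, model_prob)

print(f"model log loss:  {model_loss:.4f}")
print(f"random log loss: {random_loss:.4f}")
print(f"model confusion matrix [[TN, FP], [FN, TP]]: {model_cm}")
```

Lower log loss is better. A trained model is only useful if its log loss falls clearly below that of the random baseline, and the confusion matrix shows which kind of error (false duplicates or missed duplicates) it makes.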
Machine Learning | Python | AJAX | Bootstrap | VS Code 💻
Note: Don't worry if a few files are missing from the repo; they can be generated by running the Python source files.
- Clone or download the project from GitHub.
- Create a new conda/Python environment, activate it, and install the application's dependencies by running "pip install -r requirments.txt" in a terminal from the application's root directory.
- Run the 0.8_prediction.py file.