/Online-shoppers-intention-prediction

Predict the intention (make a purchase or not) of e-commerce website visitors.

Primary LanguageJupyter NotebookMIT LicenseMIT

Online-shoppers-intention-prediction

Predict the intention (make a purchase or not) of e-commerce website visitors.

Data Source:

http://archive.ics.uci.edu/ml/datasets/Online+Shoppers+Purchasing+Intention+Dataset

Data Description:

The dataset consists of feature vectors belonging to 12,330 sessions. The dataset was formed so that each session would belong to a different user in a 1-year period to avoid any tendency to a specific campaign, special day, user profile, or period. Of the 12,330 sessions in the dataset, 84.5% (10,422) were negative class samples that did not end with shopping, and the rest (1908) were positive class samples ending with shopping.

Numerical features
Feature name Feature description Min. val Max. val SD
Admin. #pages visited by the visitor about account management 0 27 3.32
Ad. duration #seconds spent by the visitor on account management related pages 0 3398 176.70
Info. #informational pages visited by the visitor 0 24 1.26
Info. durat. #seconds spent by the visitor on informational pages 0 2549 140.64
Prod. #pages visited by visitor about product related pages 0 705 44.45
Prod.durat. #seconds spent by the visitor on product related pages 0 63,973 1912.3
Bounce rate Average bounce rate value of the pages visited by the visitor 0 0.2 0.04
Exit rate Average exit rate value of the pages visited by the visitor 0 0.2 0.05
Page value Average page value of the pages visited by the visitor 0 361 18.55
Special day Closeness of the site visiting time to a special day 0 1.0 0.19
Categorical features
Feature name Feature description Number of Values
OperatingSystems Operating system of the visitor 8
Browser Browser of the visitor 13
Region Geographic region from which the session has been started by the visitor 9
TrafficType Traffic source (e.g., banner, SMS, direct) 20
VisitorType Visitor type as “New Visitor,” “Returning Visitor,” and “Other” 3
Weekend Boolean value indicating whether the date of the visit is weekend 2
Month Month value of the visit date 12
Revenue Class label: whether the visit has been finalized with a transaction 2

Project Goal

The main goal of this project is to design a machine learning classification system, that is able to predict an online shopper's intention ( buy or no buy ), based on the values of the given features (from google analytics). A number of different classification algorithms is tested, in order to pick the best one for the project.

Conclusion

In this project, we used Online Shoppers Intention dataset to build models that can classify website visitor, and predict which of them is likely going to make a purchase on the website. 7 different learning classifiers (Naive Bayes, KNN, SVM, Logistic Regression, Random Forest, Gradient Boosting, and Adaboosting) were tested and optimized, and we have achieved the best classification performance using Gradient Boost classifier, followed by random Forest, and then Adaboost.

Classifier Accuracy F1-Score Precision Recall
Naive Bayes 0.775 0.491 0.394 0.652
KNN 0.873 0.506 0.723 0.39
SVM 0.889 0.6 0.751 0.5
Logistic Regression 0.879 0.529 0.758 0.406
Random Forest 0.902 0.662 0.774 0.578
Gradient Boost 0.905 0.689 0.761 0.63
AdaBoost 0.889 0.624 0.713 0.555