- Data Preparation
- Exploring the content of variables
• 2.1 Countries
• 2.2 Customers and products
- 2.2.1 Cancelling orders
- 2.2.2 StockCode
- 2.2.3 Basket price
- Insight on product categories
• 3.1 Product description
• 3.2 Defining product categories
- 3.2.1 Data encoding
- 3.2.2 Clusters of products
- 3.2.3 Characterizing the content of clusters
- Customer categories
• 4.1 Formatting data
- 4.1.1 Grouping products
- 4.1.2 Time splitting of the dataset
- 4.1.3 Grouping orders
• 4.2 Creating customer categories
- 4.2.1 Data encoding
- 4.2.2 Creating categories
- Classifying customers
• 5.1 Support Vector Machine Classifier (SVC)
- 5.1.1 Confusion matrix
- 5.1.2 Learning curves
• 5.2 Logistic regression
• 5.3 k-Nearest Neighbors
• 5.4 Decision Tree
• 5.5 Random Forest
• 5.6 AdaBoost
• 5.7 Gradient Boosting Classifier
• 5.8 Let's vote !
- Testing the predictions
- Conclusion
A project made in collaboration by:
Lavinia Popa
Lorena Bara
Lucian Anton
Mircea Vaman
Rubén Lozano