DSIP-PROJECT

  1. Data Preparation
  2. Exploring the content of variables
    • 2.1 Countries
    • 2.2 Customers and products
      • 2.2.1 Cancelling orders
      • 2.2.2 StockCode
      • 2.2.3 Basket price
  3. Insight on product categories
    • 3.1 Product description
    • 3.2 Defining product categories
      • 3.2.1 Data encoding
      • 3.2.2 Clusters of products
      • 3.2.3 Characterizing the content of clusters
  4. Customer categories
    • 4.1 Formatting data
      • 4.1.1 Grouping products
      • 4.1.2 Time splitting of the dataset
      • 4.1.3 Grouping orders
    • 4.2 Creating customer categories
      • 4.2.1 Data encoding
      • 4.2.2 Creating categories
  5. Classifying customers
    • 5.1 Support Vector Machine Classifier (SVC)
      • 5.1.1 Confusion matrix
      • 5.1.2 Learning curves
    • 5.2 Logistic regression
    • 5.3 k-Nearest Neighbors
    • 5.4 Decision Tree
    • 5.5 Random Forest
    • 5.6 AdaBoost
    • 5.7 Gradient Boosting Classifier
    • 5.8 Let's vote!
  6. Testing the predictions
  7. Conclusion

First checkpoint (17th of November): Chapters 1, 2, 3

Second checkpoint (15th of December)

A project made in collaboration by:

Lavinia Popa

Lorena Bara

Lucian Anton

Mircea Vaman

Rubén Lozano