/Product-recommendation-hackathon

Product recommendation, Capillary ML hackathon hosted on Analytics Vidhya | rank - 27

Primary Language: Jupyter Notebook

Capillary Machine Learning Hackathon

competition link

Mean average precision @ 10 (MAP@10): 0.0303297169

Private Leaderboard Rank: 27
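
For readers unfamiliar with the metric, here is a minimal sketch of how MAP@10 is commonly computed. It assumes the widely used definition in which each user's average precision is normalised by `min(len(actual), k)`; the competition's exact formula may differ. The function names and the toy item IDs are illustrative only and are not taken from the notebooks.

```python
def average_precision_at_k(actual, predicted, k=10):
    """AP@k for one user.

    actual    -- set of items the user really bought
    predicted -- ranked list of recommended items (best first)
    """
    if not actual:
        return 0.0
    hits = 0
    score = 0.0
    seen = set()
    for rank, item in enumerate(predicted[:k], start=1):
        # Count each correct item once, even if it is recommended twice.
        if item in actual and item not in seen:
            hits += 1
            score += hits / rank
            seen.add(item)
    # Assumed normalisation: the best achievable number of hits within k.
    return score / min(len(actual), k)


def mean_average_precision_at_k(actual_by_user, predicted_by_user, k=10):
    """Mean of AP@k over users; both arguments are parallel lists."""
    scores = [
        average_precision_at_k(set(actual), predicted, k)
        for actual, predicted in zip(actual_by_user, predicted_by_user)
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

Because most users buy only a few of the roughly 3,000 products, absolute MAP@10 values in this kind of task tend to be small.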

Analysis and data Preparation Notebook

Final model Notebook

Libraries used:

  • pandas 0.22.0
  • numpy 1.14.6
  • matplotlib 3.0.2
  • seaborn 0.7.1
  • implicit 0.3.8
  • keras 2.2.4
  • opencv-python 3.4.5.20
  • Pillow 4.0.0

Problem Statement | Data Dictionary | Evaluation Metric

Solution

  • In this competition, models are evaluated only on existing customers.
  • There are more than 25,000 users and 3,000 products.
  • The task is to recommend (predict) the top 10 products that each user is going to buy in the last two months.
  • Images of all products and their attributes are provided.

Features

  • Created features for every product from its attribute values: 243 sparse features in total.
  • Used the ImageNet-pretrained DenseNet121 model from Keras to extract features from the product images: 256 features in total.
  • Each product therefore has 499 features in total.
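
The feature construction above can be illustrated with a small sketch. It assumes the sparse attribute features are one-hot indicators of (attribute, value) pairs, which the README does not state explicitly. The README also does not say how the 256-dimensional image vector is derived from DenseNet121, so the image features here are a random placeholder. All names and toy attribute values are hypothetical.

```python
import numpy as np


def one_hot_attributes(attribute_rows):
    """One-hot encode product attributes.

    attribute_rows -- list with one dict per product, mapping
                      attribute name -> attribute value (both strings)
    Returns the (n_products, n_pairs) matrix and the column vocabulary.
    """
    vocab = sorted({pair for row in attribute_rows for pair in row.items()})
    column = {pair: i for i, pair in enumerate(vocab)}
    matrix = np.zeros((len(attribute_rows), len(vocab)), dtype=np.float32)
    for r, row in enumerate(attribute_rows):
        for pair in row.items():
            matrix[r, column[pair]] = 1.0
    return matrix, vocab


def combine_features(attribute_features, image_features):
    """Concatenate attribute and image features column-wise."""
    return np.hstack([attribute_features, image_features])


# Toy example: 2 products, 3 distinct (attribute, value) pairs.
products = [
    {"color": "red", "fit": "slim"},
    {"color": "blue", "fit": "slim"},
]
attr_features, vocab = one_hot_attributes(products)

# Placeholder for CNN image embeddings (4 dimensions instead of 256).
image_features = np.random.default_rng(0).random((2, 4)).astype(np.float32)

product_features = combine_features(attr_features, image_features)
```

With the real data the same concatenation would give 243 + 256 = 499 columns per product.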

Approach / Models

  • Used 2 methods:
  1. Collaborative filtering
  • Predicted a score for every user-product pair using a user-user similarity matrix based on cosine similarity.
  2. Content-based filtering
  • All product attribute features and image features are used.

  • First, a user profile matrix is built so that each user can be compared with all products to find the most similar ones.

  • Each user's profile vector is the weighted average of the product feature vectors, weighted by how many times the user bought each product.

  • The 10 products most similar to the user are then selected by cosine similarity between the user profile vector and every product vector.

  • Content-based filtering scored higher than user-user collaborative filtering.

  • This suggests that users are more likely to buy products similar to those they have already bought.

  • Tried a hybrid approach (combining collaborative and content-based filtering), but it did not work, so I used the content-based filtering model as the final model.
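
The two scoring methods described above can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the code from the final notebook: the function names are hypothetical, `purchase_counts` is assumed to be a dense users x products matrix of purchase counts, and zeroing a user's similarity to themselves in the collaborative variant is an assumption. Whether already purchased products were filtered out of the recommendations is not stated, so no filtering is done here.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-12):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), eps)
    b_unit = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), eps)
    return a_unit @ b_unit.T


def user_profiles(purchase_counts, product_features):
    """Weighted average of product features, weighted by purchase counts."""
    totals = purchase_counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1  # users without purchases keep a zero profile
    return (purchase_counts @ product_features) / totals


def content_based_scores(purchase_counts, product_features):
    """Cosine similarity between each user profile and each product vector."""
    profiles = user_profiles(purchase_counts, product_features)
    return cosine_similarity(profiles, product_features)


def user_user_scores(purchase_counts):
    """User-user collaborative filtering: similarity-weighted purchase counts."""
    sim = cosine_similarity(purchase_counts, purchase_counts)
    np.fill_diagonal(sim, 0.0)  # assumption: ignore a user's own history
    return sim @ purchase_counts


def top_k_products(scores, k=10):
    """Indices of the k highest-scoring products per user, best first."""
    k = min(k, scores.shape[1])
    return np.argsort(-scores, axis=1, kind="stable")[:, :k]
```

A possible usage on toy data, with 2 users, 3 products and 2 features per product:

```python
counts = np.array([[2, 0, 0], [0, 1, 1]])
features = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
recommendations = top_k_products(content_based_scores(counts, features), k=2)
```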