retail-personalization-workshop
Code for the In-Session Personalization Workshop for eCommerce, April 2021, and the MICES Workshop in June 2021.
Overview
This repo hosts the code for the In-Session Personalization workshop at the Machine Learning in Retail Summit and the hands-on workshop at MICES. Please note that the ML notebook is a proper subset of the MICES workshop and is kept in the repo as a reference: new users should just run the MICES version. Both workshops are hands-on sessions on in-session personalization, and both come with slides and this open source repository. Our aim is to implement a sound and readable version of the models found in our research papers, showcasing tried and tested personalization strategies on real e-commerce data.
While the code is heavily commented, please refer to the slides and the references below for the full context behind the product features and some design choices.
How to run the code
Setup
Make sure to install all the required dependencies, as listed in the requirements.txt file. Launch jupyter notebook and run the code as a standard notebook.
Data
The code works out of the box with the real-world dataset shared by Coveo for the SIGIR Data Challenge 2021: download the data, put it in a local folder, and then change the LOCAL_FOLDER variable in the notebook to point to the train folder on your computer. Remember that use of the Coveo dataset implies acceptance of the accompanying T&C.
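Before running the notebook, it can help to confirm that LOCAL_FOLDER really points at the train folder. The snippet below is a minimal sketch, not part of the workshop code: the helper name list_data_files and the example path are hypothetical, and no assumption is made about the file names inside the folder.

```python
from pathlib import Path


def list_data_files(local_folder: str) -> list[str]:
    """Return the sorted file names found in local_folder.

    Raises FileNotFoundError if the folder does not exist, so a wrong
    LOCAL_FOLDER is caught before any model code runs.
    """
    folder = Path(local_folder).expanduser()
    if not folder.is_dir():
        raise FileNotFoundError(f"LOCAL_FOLDER is not a directory: {folder}")
    return sorted(p.name for p in folder.iterdir() if p.is_file())


# Hypothetical location: replace with the train folder on your machine.
LOCAL_FOLDER = "~/data/coveo_data_challenge/train"
```

Calling list_data_files(LOCAL_FOLDER) in the first notebook cell should print the files you downloaded; an error at this point means the path needs fixing.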
If you wish to use your own e-commerce data, all the "ML code" can be kept intact, as long as you replace the functions devoted to preparing session data from the training set.
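As a rough illustration of what such a replacement function might look like, the sketch below groups raw browsing events into sessions, each one a time-ordered list of product identifiers. Everything here is an assumption made for the example: the function name rows_to_sessions, the column names session_id, product_id and timestamp, and the output shape are hypothetical, so check the notebook for the structure its models actually expect.

```python
import csv
from collections import defaultdict
from io import StringIO


def rows_to_sessions(rows, session_key="session_id",
                     item_key="product_id", time_key="timestamp"):
    """Group event rows by session and order each session by time.

    rows is any iterable of dict-like records; the key names are
    hypothetical and should be adapted to your own data.
    """
    by_session = defaultdict(list)
    for row in rows:
        by_session[row[session_key]].append((float(row[time_key]), row[item_key]))
    # Sort events within each session by timestamp, then keep only the items.
    return [
        [item for _, item in sorted(events, key=lambda e: e[0])]
        for events in by_session.values()
    ]


# Tiny in-memory example standing in for your own event log.
sample = StringIO(
    "session_id,product_id,timestamp\n"
    "s1,A,2\n"
    "s1,B,1\n"
    "s2,C,5\n"
)
sessions = rows_to_sessions(csv.DictReader(sample))
print(sessions)  # [['B', 'A'], ['C']]
```

Keeping the output as plain lists of identifiers makes it easy to swap data sources without touching the modelling code.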
Contacts
For questions related to tools and models, or if you wish to organize a similar workshop together, please contact me.
References
The theory of in-session personalization through product spaces and the related use cases were developed, tested and benchmarked at Coveo AI in several research papers during 2020. In particular:
- "An Image is Worth a Thousand Features": Scalable Product Representations for In-Session Type-Ahead Personalization
- Fantastic Embeddings and How to Align Them: Zero-Shot Inference in a Multi-Shop Scenario
- Shopping in the Multiverse: A Counterfactual Approach to In-Session Attribution
- The Embeddings That Came in From the Cold: Improving Vectors for New and Rare Products with Content-Based Inference
- How to Grow a (Product) Tree: Personalized Category Suggestions for eCommerce Type-Ahead
If you find this workshop (and code) useful, please remember to cite our work!
Acknowledgments
Patrick John Chia helped co-author the session and prepare the materials. We also wish to thank our co-authors, who co-developed some of the models and ideas we presented. In particular:
- Ciro Greco - Coveo AI Labs
- Federico Bianchi - Postdoctoral Researcher at Università Bocconi
- Bingqing Yu - Coveo
Finally, the authors wish to thank Coveo for supporting our research, and Luca Bigon for help in data collection and preparation.
How to Cite our Work
If you find this code and dataset useful, please cite our work:
@inproceedings{CoveoSIGIR2021,
author = {Tagliabue, Jacopo and Greco, Ciro and Roy, Jean-Francis and Bianchi, Federico and Cassani, Giovanni and Yu, Bingqing and Chia, Patrick John},
title = {SIGIR 2021 E-Commerce Workshop Data Challenge},
year = {2021},
booktitle = {SIGIR eCom 2021}
}
License
All code is provided "as is" under a standard MIT License.