Reinforcement Learning Resources for Recommender Systems

  • A list of papers, code, datasets, and other resources related to reinforcement learning and recommender systems (some entries include links to the PDF, code, and dataset).

Papers

[P1] Session-aware Item-combination Recommendation with Transformer Network [PDF]

Lin, T. H., & Gao, C. (2021, December). Session-aware Item-combination Recommendation with Transformer Network. In 2021 IEEE International Conference on Big Data (Big Data) (pp. 5708-5713). IEEE.

[P2] RecSim NG: Toward Principled Uncertainty Modeling for Recommender Ecosystems [PDF]

Mladenov, M., Hsu, C., Jain, V., Ie, E., Colby, C., Mayoraz, N., Pham, H., Tran, D., Vendrov, I. & Boutilier, C. (2021). RecSim NG: Toward Principled Uncertainty Modeling for Recommender Ecosystems. arXiv preprint arXiv:2103.08057.

[P3] A Contextual-Bandit Approach to Personalized News Article Recommendation [PDF]

Li, L., Chu, W., Langford, J., & Schapire, R. E. (2010, April). A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th international conference on World wide web (pp. 661-670).

[P4] Partially Observable Reinforcement Learning for Dialog-based Interactive Recommendation [PDF]

Wu, Y., Macdonald, C., & Ounis, I. (2021, September). Partially Observable Reinforcement Learning for Dialog-based Interactive Recommendation. In Fifteenth ACM Conference on Recommender Systems (pp. 241-251).

[P5] Reinforcement Learning over Sentiment-Augmented Knowledge Graphs towards Accurate and Explainable Recommendation WSDM'22 [PDF]

Park, S. J., Chae, D. K., Bae, H. K., Park, S., & Kim, S. W. (2022, February). Reinforcement Learning over Sentiment-Augmented Knowledge Graphs towards Accurate and Explainable Recommendation. In Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining (WSDM'22) (pp. 784-793).

[P6] Improving Daily Deals Recommendation Using Explore-Then-Exploit Strategies [PDF]

Lacerda, A., Santos, R. L., Veloso, A., & Ziviani, N. (2015). Improving daily deals recommendation using explore-then-exploit strategies. Information Retrieval Journal, 18(2), 95-122.

[P7] Scalable explore-exploit collaborative filtering. [PDF]

Guillou, F., Gaudel, R., & Preux, P. (2016). Scalable explore-exploit collaborative filtering. In Pacific Asia Conference on Information Systems (PACIS'16).

[P8] Factorization Bandits for Interactive Recommendation. [PDF]

Wang, H., Wu, Q., & Wang, H. (2017, February). Factorization bandits for interactive recommendation. In Thirty-First AAAI Conference on Artificial Intelligence.

[P9] Bandits and Recommender Systems. [PDF]

Mary, J., Gaudel, R., & Preux, P. (2015, July). Bandits and recommender systems. In International Workshop on Machine Learning, Optimization and Big Data (pp. 325-336). Springer, Cham.

[P10] Adaptive, personalized diversity for visual discovery. [PDF]

Teo, C. H., Nassif, H., Hill, D., Srinivasan, S., Goodman, M., Mohan, V., & Vishwanathan, S. V. N. (2016, September). Adaptive, personalized diversity for visual discovery. In Proceedings of the 10th ACM conference on recommender systems (pp. 35-38).

[P11] Online clustering of bandits. [PDF]

Gentile, C., Li, S., & Zappella, G. (2014, June). Online clustering of bandits. In International Conference on Machine Learning (pp. 757-765). PMLR.

[P12] Learning diverse rankings with multi-armed bandits. [PDF]

Radlinski, F., Kleinberg, R., & Joachims, T. (2008, July). Learning diverse rankings with multi-armed bandits. In Proceedings of the 25th international conference on Machine learning (pp. 784-791).

[P13] A Fast Bandit Algorithm for Recommendations to Users with Heterogeneous Tastes [PDF]

Kohli, P., Salek, M., & Stoddard, G. (2013, June). A fast bandit algorithm for recommendation to users with heterogenous tastes. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 27, No. 1, pp. 1135-1141).

[P14] Contextual combinatorial bandit and its application on diversified online recommendation. [PDF]

Qin, L., Chen, S., & Zhu, X. (2014, April). Contextual combinatorial bandit and its application on diversified online recommendation. In Proceedings of the 2014 SIAM International Conference on Data Mining (pp. 461-469). Society for Industrial and Applied Mathematics.

[P15] A Multiple-Play Bandit Algorithm Applied to Recommender Systems. [PDF]

Louëdec, J., Chevalier, M., Mothe, J., Garivier, A., & Gerchinovitz, S. (2015, April). A multiple-play bandit algorithm applied to recommender systems. In The Twenty-Eighth International Flairs Conference.

[P16] Top-k off-policy correction for a REINFORCE recommender system. [PDF] [YouTube video link]

Chen, M., Beutel, A., Covington, P., Jain, S., Belletti, F., & Chi, E. H. (2019, January). Top-k off-policy correction for a REINFORCE recommender system. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining (pp. 456-464).

[P17] Unified conversational recommendation policy learning via graph-based reinforcement learning [PDF]

Deng, Y., Li, Y., Sun, F., Ding, B., & Lam, W. (2021, July). Unified conversational recommendation policy learning via graph-based reinforcement learning. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1431-1441).

[P18] When and Whom to Collaborate with in a Changing Environment: A Collaborative Dynamic Bandit Solution [PDF]

Li, C., Wu, Q., & Wang, H. (2021, July). When and Whom to Collaborate with in a Changing Environment: A Collaborative Dynamic Bandit Solution. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1410-1419).

[P19] Cluster-Based Bandits: Fast Cold-Start for Recommender System New Users [PDF]

Shams, S., Anderson, D., & Leith, D. (2021, July). Cluster-Based Bandits: Fast Cold-Start for Recommender System New Users. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1613-1616).

[P20] Comparison-based Conversational Recommender System with Relative Bandit Feedback [PDF]

Xie, Z., Yu, T., Zhao, C., & Li, S. (2021, July). Comparison-based Conversational Recommender System with Relative Bandit Feedback. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1400-1409).

[P21] A Survey of Deep Reinforcement Learning in Recommender Systems: A Systematic Review and Future Directions [PDF]

Chen, X., Yao, L., McAuley, J., Zhou, G., & Wang, X. (2021). A survey of deep reinforcement learning in recommender systems: A systematic review and future directions. arXiv preprint arXiv:2109.03540.

[P22] Reinforcement learning based recommender systems: A survey [PDF]

Afsar, M. M., Crump, T., & Far, B. (2021). Reinforcement learning based recommender systems: A survey. arXiv preprint arXiv:2101.06286.

[P23] Online Decision Transformer [PDF]

Zheng, Q., Zhang, A., & Grover, A. (2022). Online Decision Transformer. arXiv preprint arXiv:2202.05607.

[P24] Rethinking Reinforcement Learning for Recommendation: A Prompt Perspective [PDF]

Xin, X., Pimentel, T., Karatzoglou, A., Ren, P., Christakopoulou, K., & Ren, Z. (2022, July). Rethinking Reinforcement Learning for Recommendation: A Prompt Perspective. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1347-1357).

[P25] Self-Supervised Reinforcement Learning for Recommender Systems [PDF]

Xin, X., Karatzoglou, A., Arapakis, I., & Jose, J. M. (2020, July). Self-supervised reinforcement learning for recommender systems. In Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval (pp. 931-940).

Datasets

ContentWise impressions: an industrial dataset with impressions included [dataset repo]

Pérez Maurera, F. B., Ferrari Dacrema, M., Saule, L., Scriminaci, M., & Cremonesi, P. (2020, October). ContentWise impressions: an industrial dataset with impressions included. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (pp. 3093-3100).

Goodreads: meta-data of the books, user-book interactions (users' public shelves) and users' detailed book reviews. [dataset repo]

Wan, M., & McAuley, J. (2018, September). Item recommendation on monotonic behavior chains. In Proceedings of the 12th ACM conference on recommender systems (pp. 86-94).

Wan, M., Misra, R., Nakashole, N., & McAuley, J. (2019). Fine-grained spoiler detection from large-scale review corpora. arXiv preprint arXiv:1905.13416.

Goodreads spoilers [link]

Wan, M., Misra, R., Nakashole, N., & McAuley, J. (2019). Fine-grained spoiler detection from large-scale review corpora. arXiv preprint arXiv:1905.13416.

Amazon Product Reviews (2018) [dataset repo]

He, R., & McAuley, J. (2016, April). Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering. In proceedings of the 25th international conference on world wide web (pp. 507-517).

Pinterest Fashion Compatibility [dataset repo]

Kang, W. C., Kim, E., Leskovec, J., Rosenberg, C., & McAuley, J. (2019). Complete the look: Scene-based complementary product recommendation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10532-10541).

Clothing Fit Data [Modcloth dataset]

Misra, R., Wan, M., & McAuley, J. (2018, September). Decomposing fit semantics for product size recommendation in metric spaces. In Proceedings of the 12th ACM Conference on Recommender Systems (pp. 422-426).

Product Exchange/Bartering Data [dataset repo]

Rappaz, J., Vladarean, M. L., McAuley, J., & Catasta, M. (2017, February). Bartering books to beers: a recommender system for exchange platforms. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining (pp. 505-514).

He, R., & McAuley, J. (2016, February). VBPR: visual bayesian personalized ranking from implicit feedback. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 30, No. 1).

Simulation Environments

RL for online advertising

Books

[L1] Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press [PDF]

[L2] Deep Learning on Graphs [PDF]

Wang, Y., Jin, W., Ma, Y., & Tang, J. (2021). Deep learning on graphs. Cambridge University Press.

Courses

[C1] Full course: Reinforcement Learning and Optimal Control [LINK]

Dimitri P. Bertsekas, 2022