applied-ml
Curated papers, articles, and blogs sharing how data science & machine learning are applied in production. ⚙️
Have a favourite piece you're not seeing here? Want to contribute? Make a pull request! 😄
Table of Contents
- Data Quality
- Data Engineering
- Classification
- Regression
- Recommendation
- Search/Ranking
- Embeddings
- Natural Language Processing
- Sequence Modelling
- Computer Vision
- Reinforcement Learning
- Anomaly Detection
- Graph
- Optimization
- Information Extraction
- Validation and A/B Testing
- Practices
Data Quality
- Monitoring Data Quality at Scale with Statistical Modeling
Uber
- An Approach to Data Quality for Netflix Personalization Systems
Netflix
Data Engineering
- Zipline: Airbnb’s Machine Learning Data Management Platform
Airbnb
- Sputnik: Airbnb’s Apache Spark Framework for Data Engineering
Airbnb
- Feast: Bridging ML Models and Data
Gojek
Classification
- High-precision phrase-based document classification on a modern scale
LinkedIn
- Chimera: Large-Scale Classification using Machine Learning, Rules, and Crowdsourcing
WalmartLabs
- Large-scale Item Categorization for e-Commerce
DianPing, eBay
- Categorizing Products at Scale
Shopify
- Learning to Diagnose with LSTM Recurrent Neural Networks
Google
- Discovering and Classifying In-app Message Intent at Airbnb
Airbnb
- How we built the good first issues feature
GitHub
Regression
- Using Machine Learning to Predict Value of Homes On Airbnb
Airbnb
- Modeling conversion rates and saving millions of dollars using Kaplan-Meier and gamma distributions
Better
- Using machine learning to predict the value of ad requests
Twitter
Recommendation
- Amazon.com Recommendations: Item-to-Item Collaborative Filtering
Amazon
- Recommending Complementary Products in E-Commerce Push Notifications with a Mixture Model Approach
Alibaba
- Behavior Sequence Transformer for E-commerce Recommendation in Alibaba
Alibaba
- Session-based Recommendations with Recurrent Neural Networks
Telefonica
- Deep Neural Networks for YouTube Recommendations
YouTube
- Personalized Recommendations for Experiences Using Deep Learning
TripAdvisor
- E-commerce in Your Inbox: Product Recommendations at Scale
Yahoo
- Powered by AI: Instagram’s Explore recommender system
Facebook
- Artwork Personalization at Netflix
Netflix
- To Be Continued: Helping you find shows to continue watching on Netflix
Netflix
- Learning a Personalized Homepage
Netflix
- Food Discovery with Uber Eats: Using Graph Learning to Power Recommendations (https://eng.uber.com/uber-eats-graph-learning/)
Uber
- Calibrated Recommendations
Netflix
- Explore, Exploit, and Explain: Personalizing Explainable Recommendations with Bandits
Spotify
- For Your Ears Only: Personalizing Spotify Home with Machine Learning
Spotify
- Reach for the Top: How Spotify Built Shortcuts in Just Six Months
Spotify
- The Evolution of Kit: Automating Marketing Using Machine Learning
Shopify
- Using machine learning to predict what file you need next (Part 1)
Dropbox
- Using machine learning to predict what file you need next (Part 2)
Dropbox
- A closer look at the AI behind course recommendations on LinkedIn Learning (Part 1)
LinkedIn
- A closer look at the AI behind course recommendations on LinkedIn Learning (Part 2)
LinkedIn
- A recommender system in 30 lines of Clojure
Findka.com
Search/Ranking
- Amazon Search: The Joy of Ranking Products
Amazon
- How Lazada Ranks Products to Improve Customer Experience and Conversion
Lazada
- Using Deep Learning at Scale in Twitter’s Timelines
Twitter
- Machine Learning-Powered Search Ranking of Airbnb Experiences
Airbnb
- Applying Deep Learning To Airbnb Search
Airbnb
- Ranking Relevance in Yahoo Search
Yahoo
- An Ensemble-Based Approach to Click-Through Rate Prediction for Promoted Listings at Etsy
Etsy
- Why Do People Buy Seemingly Irrelevant Items in Voice Product Search?
Amazon
- The AI Behind LinkedIn Recruiter search and recommendation systems
LinkedIn
- AI at Scale in Bing
Microsoft
- Query Understanding Engine in Traveloka Universal Search
Traveloka
- Search-based User Interest Modeling with Lifelong Sequential Behavior Data for Click-Through Rate Prediction
Alibaba
- The Secret Sauce Behind Search Personalisation
Gojek
Embeddings
- Billion-scale Commodity Embedding for E-commerce Recommendation in Alibaba
Alibaba
- Embeddings@Twitter
Twitter
- Listing Embeddings in Search Ranking (Paper)
Airbnb
Natural Language Processing
- Abusive Language Detection in Online User Content
Yahoo
- How natural language processing helps LinkedIn members get support easily
LinkedIn
- Building Smart Replies for Member Messages
LinkedIn
- Smart Reply: Automated Response Suggestion for Email
Google
- Assistive AI Makes Replying Easier
Microsoft
- AI advances to better detect hate speech
Facebook
- Using Neural Networks to Find Answers in Tables
Google
- A Scalable Approach to Reducing Gender Bias in Google Translate
Google
- A state-of-the-art open source chatbot
Facebook
- Goal-Oriented End-to-End Conversational Models with Profile Features in a Real-World Setting
Amazon
- How Gojek Uses NLP to Name Pickup Locations at Scale
Gojek
Sequence Modelling
- Recommending Complementary Products in E-Commerce Push Notifications with a Mixture Model Approach
Alibaba
- Practice on Long Sequential User Behavior Modeling for Click-Through Rate Prediction
Alibaba
- Learning to Diagnose with LSTM Recurrent Neural Networks
Google
- Deep Learning for Understanding Consumer Histories
Zalando
- Continual Prediction of Notification Attendance with Classical and Deep Network Approaches
Telefonica
Computer Vision
- Categorizing Listing Photos at Airbnb
Airbnb
- Amenity Detection and Beyond — New Frontiers of Computer Vision at Airbnb
Airbnb
- Powered by AI: Advancing product understanding and building new shopping experiences
Facebook
- Creating a Modern OCR Pipeline Using Computer Vision and Deep Learning
Dropbox
- How we improved computer vision metrics by more than 5% only by cleaning labelling errors
Deepomatic
- A Neural Weather Model for Eight-Hour Precipitation Forecasting
Google
- Converting text to images for product discovery
Amazon
Reinforcement Learning
- Deep Reinforcement Learning for Sponsored Search Real-time Bidding
Alibaba
- Dynamic Pricing on E-commerce Platform with Deep Reinforcement Learning
Alibaba
- Budget Constrained Bidding by Model-free Reinforcement Learning in Display Advertising
Alibaba
- Productionizing Deep Reinforcement Learning with Spark and MLflow
Zynga
Anomaly Detection
- Detecting Performance Anomalies in External Firmware Deployments
Netflix
- Detecting and preventing abuse on LinkedIn using isolation forests
LinkedIn
- Uncovering Insurance Fraud Conspiracy with Network Learning
Ant Financial
Graph
Optimization
Information Extraction
- Unsupervised Extraction of Attributes and Their Values from Product Description
Rakuten
- Information Extraction from Receipts with Graph Convolutional Networks
Nanonets
Validation and A/B Testing
- The reusable holdout: Preserving validity in adaptive data analysis
Google
- A/B Testing with Hierarchical Models in Python
Domino
- Detecting interference: An A/B test of A/B tests
LinkedIn
- Building inclusive products through A/B testing
LinkedIn
- Experimenting to solve cramming
Twitter
- Announcing a New Framework for Designing Optimal Experiments with Pyro
Uber
- Enabling 10x More Experiments with Traveloka Experiment Platform
Traveloka
Practices
- Practical Recommendations for Gradient-Based Training of Deep Architectures
Yoshua Bengio
- Machine Learning: The High Interest Credit Card of Technical Debt
Google
- Rules of Machine Learning: Best Practices for ML Engineering
Google
- Hidden Technical Debt in Machine Learning Systems
Google
- On Challenges in Machine Learning Model Management
Amazon
- 150 successful Machine Learning models: 6 lessons learned at Booking.com
Booking.com