# Awesome Search
I've been building e-commerce search applications for almost ten years. Below is a list of (some) publications, conferences and books that inspire me, grouped by topic. If an article fits multiple topics, it appears in each relevant section.
⭐ Star us on GitHub — it helps!
🔔 Start watching the repo or follow the atom feed to receive updates.
- General, fun, philosophy
- Types of search
- Search UX
- Baymard Institute
- Nielsen Norman Group
- Enterprise Knowledge LLC
- Facets
- Accidental Taxonomist
- Other
- Spelling correction
- Suggestions
- Synonyms
- Graphs/Taxonomies/Knowledge Graph
- Integrating Search and Knowledge Graphs (by Enterprise Knowledge)
- Query understanding
- Search Intent
- Query segmentation
- Algorithms
- Relevance Algorithms
- Learning to Rank
- Click models for search
- BERT
- Collocations, common phrases
- Other Algorithms
- Tracking, profiling, GDPR, Analysis
- Testing, metrics, KPIs
- Metrics
- KPIs
- Evaluating Search (by Daniel Tunkelang)
- Measuring Search (by James Rubinstein)
- Three Pillars of Search Relevancy (by Andreas Wagner)
- Blogs and Portals, News
- Conferences
- Books
- Management, Search Team
- Job Interviews
- Industry players
- Personalities and influencers
- Search Engines
- Products and services
- Consulting companies
- Blogposts series
- Search Optimization 101 (by Charlie Hull)
- Query Understanding (by Daniel Tunkelang)
- Grid Dynamics
- Videos
- Usecases
- Tools
- Falsehoods Programmers Believe About Search
- Ethical Search: Designing an irresistible journey with a positive impact
- Humans Search for Things not for Strings
- On Semantic Search
- Feedback debt: what the segway teaches search teams
- Supporting the Searcher’s Journey: When and How
- Shopping is Hard, Let’s go Searching!
- An Introduction to Search Quality
- What is a ‘Relevant’ Search Result?
- On-Site Search Design Patterns for E-Commerce: Schema Structure, Data Driven Ranking & More
- In Search of Recall
- Etsy. Targeting Broad Queries in Search
- How Etsy Uses Thermodynamics to Help You Search for “Geeky”
- Broad and Ambiguous Search Queries
- Deconstructing E-Commerce Search: The 12 Query Types
- Deconstructing E-Commerce Search: The 12 Query Types
- Autodirect or Guide Users to Matching Category
- 13 Design Patterns for Autocomplete Suggestions (27% Get it Wrong)
- E-Commerce Search Needs to Support Users’ Non-Product Search Queries (15% Don’t)
- Search UX: 6 Essential Elements for ‘No Results’ Pages
- Product Thumbnails Should Dynamically Update to Match the Variation Searched For (54% Don’t)
- Faceted Sorting - A New Method for Sorting Search Results
- The Current State of E-Commerce Search
- E-Commerce Sites Need Multiple of These 5 ‘Search Scope’ Features
- E-Commerce Search Field Design and Its Implications
- E-Commerce Sites Should Include Contextual Search Snippets (96% Get it Wrong)
- E-Commerce Search Usability: Report & Benchmark
- Six ‘COVID-19’ Related E-Commerce UX Improvements to Make
- The Love-at-First-Sight Gaze Pattern on Search-Results Pages
- Good Abandonment on Search Results Pages
- Complex Search-Results Pages Change Search Behavior: The Pinball Pattern
- Site Search Suggestions
- Search-Log Analysis: The Most Overlooked Opportunity in Web UX Research
- Scoped Search: Dangerous, but Sometimes Useful
- 3 Guidelines for Search Engine "No Results" Pages
- Facets of Faceted Search
- Coffee, Coffee, Coffee!
- Faceted Search
- How to implement faceted search the right way
- Metadata and Faceted Search
- How Many Facets Should a Taxonomy Have
- When a Taxonomy Should not be Hierarchical
- Customizing Taxonomy Facets
- Learning from Friction to Improve the Search Experience
- Why is it so hard to sort by price?
- Faceted Sorting
- Peter Norvig. "How to Write a Spelling Corrector". Classic publication.
- Daniel Tunkelang. "Spelling Correction"
- A simple spell checker built from word vectors
- A closer look into the spell correction problem: 1, 2, 3, preDict
- Deep Spelling
- Modeling Spelling Correction for Search at Etsy
- Wolf Garbe, author of SymSpell. 1000x Faster Spelling Correction algorithm, SymSpell vs. BK-tree: 100x faster fuzzy string search & spell checking, Fast Word Segmentation of Noisy Text
- Chars2vec: character-based language model for handling real world texts with spelling errors and
- JamSpell, spelling correction that takes surrounding context into account - library, (in Russian) Исправляем опечатки с учётом контекста ("Correcting typos with context in mind")
- Embedding for spelling correction
- What are some algorithms of spelling correction that are used by search engines?
- Moman - the library that powers (or powered?) spell correction/autocorrect in Lucene, Solr and Elasticsearch.
- Query Segmentation and Spelling Correction
- Applying Context Aware Spell Checking in Spark NLP
- Autocorrect in Google, Amazon and Pinterest and how to write your own one
- Boosting the power of Elasticsearch with synonyms
- Real Talk About Synonyms and Search
- Synonyms in Solr I — The good, the bad and the ugly
- Synonyms and Antonyms from WordNet
- Synonyms and Antonyms in Python
- Dive into WordNet with NLTK
- Creating Better Searches Through Automatic Synonym Detection
- Multiword synonyms in search using Querqy
- How to Build a Smart Synonyms Model
Suggestions are also known as autocomplete or search as you type.
- Giovanni Fernandez-Kincade. Bootstrapping Autosuggest, Building an Autosuggest Corpus, Part 1, Building an Autosuggest Corpus, Part 2, Autosuggest Retrieval Data Structures & Algorithms, Autosuggest Ranking
- On two types of suggestions
- Improving Search Suggestions for eCommerce
- Autocomplete Search Best Practices to Increase Conversions
- Why we’ve developed the searchhub smartSuggest module and why it might matter to you
- Nielsen Norman Group: Site Search Suggestions
- 13 Design Patterns for Autocomplete Suggestions
- Autocomplete
- Autocomplete and User Experience
- Implementing a LinkedIn-like Search as You Type with Elasticsearch
- Smart autocomplete best practices: improve search relevance and sales
- Knowledge graphs applied in the retail industry. Knowledge graphs are becoming increasingly popular in tech; the article explores how they can be used in the retail industry to enrich data, widen search results and add value to a retail company.
- Daniel Tunkelang Query Understanding.
- Query Understanding, Divided into Three Parts
- Search for Things not for Strings
- Understanding the Search Query. Part 1, Part 2, Part 3
- Food Discovery with Uber Eats: Building a Query Understanding Engine
- Paper: Unsupervised Query Segmentation Using only Query Logs
- Paper: Towards Semantic Query Segmentation
- Practical BM25: How Shards Affect Relevance Scoring in Elasticsearch, The BM25 Algorithm and its Variables
- The influence of TF-IDF algorithms in eCommerce search
- BM25 The Next Generation of Lucene Relevance
- How is search different than other machine learning problems?
- Reinforcement learning assisted search ranking
- E-commerce Search Re-Ranking as a Reinforcement Learning Problem
- When to use a machine learned vs. score-based search ranker
- What is Learning To Rank?
- Using AI and Machine Learning to Overcome Position Bias within Adobe Stock Search
- Understanding BERT and Search Relevance
- Google is improving web search with BERT – can we use it for enterprise search too?
- Automatically detect common phrases – multi-word expressions / word n-grams – from a stream of sentences.
- The Unreasonable Effectiveness of Collocations
- Locality Sensitive Hashing
- Minhash
- Better than Average: Sort by Best Rating
- How Not To Sort By Average Rating
- One hot encoding
- Keyword Extraction using RAKE
- Yet Another Keyword Extractor (Yake)
- Writing a full-text search engine using Bloom filters
- Anonymisation: managing data protection risk (code of practice)
- The Anonymisation Decision-Making Framework
- 98 personal data points that Facebook uses to target ads to you
- Opportunity Analysis for Search
- A Face Is Exposed for AOL Searcher No. 4417749
- AOL search data leak
- Personal data
- Discounted cumulative gain
- Mean reciprocal rank
- Demystifying nDCG and ERR
- Choosing your search relevance evaluation metric
- How to Implement a Normalized Discounted Cumulative Gain (NDCG) Ranking Quality Scorer in Quepid
- Precision and recall: https://en.wikipedia.org/wiki/Precision_and_recall
- F1 score: https://en.wikipedia.org/wiki/F1_score
- 5 Right Ways to Measure How Search Is Performing
- E-commerce Site-Search KPIs. Part 1 – Customers, Part 2 – Products, Part 3 - Queries
- A/B Testing for Search is Different
- Learning from Friction to Improve the Search Experience
- Behind the Wizardry of a Seamless Search Experience
- Analyzing online search relevance metrics with the Elastic Stack
- How to Gain Insight From Search Analytics
- Statistical and human-centered approaches to search engine improvement
- A Human Approach
- Setting up a relevance evaluation program
- Metrics Matter
- A/B Testing Search: thinking like a scientist
- Query Triage: The Secret Weapon for Search Relevance
- The Launch Review: bringing it all together…
- AI-powered search
- Relevant Search
- Deep Learning for search
- Interactions with search systems
- Embeddings in Natural Language Processing. Theory and Advances in Vector Representation of Meaning
- Search User Interfaces
- Search Patterns
- Search Analytics for Your Site: Conversations with Your Customers
- Search is a Team Sport
- Thoughts about Managing Search Teams
- Building an Effective Search Team: the key to great search & relevancy
- Query Triage: The Secret Weapon for Search Relevance
- The Launch Review: bringing it all together
- The Role of Search Product Owners
- Interview Questions for Search Relevance Engineers, Data Scientists, and Product Managers
- Data Science Interviews: Ranking and search
- How do I know that my search is broken?
- What does it mean if my search is ‘broken’?
- How do you fix a broken search?
- Reducing business risk by optimizing search
Better search through query understanding.
- An Introduction
- Language Identification
- Character Filtering
- Tokenization
- Spelling Correction
- Stemming and Lemmatization
- Query Rewriting: An Overview
- Query Expansion
- Query Relaxation
- Query Segmentation
- Query Scoping
- Entity Recognition
- Taxonomies and Ontologies
- Autocomplete
- Autocomplete and User Experience
- Contextual Query Understanding: An Overview
- Session Context
- Location as Context
- Seasonality
- Personalization
- Search as a Conversation
- Clarification Dialogues
- Relevance Feedback
- Faceted Search
- Search Results Presentation
- Search Result Snippets
- Search Results Clustering
- Question Answering
- Query Understanding and Voice Interfaces
- Query Understanding and Chatbots
- Not your father’s search engine: a brief history of retail search
- Semantic vector search: the new frontier in product discovery
- Boosting product discovery with semantic search
- Semantic query parsing blueprint
- Bing
- Yandex
- Amazon
- eBay
- Algolia
- Vespa
- Elastic
- Solr
- Fess Enterprise Search Server
- OpenSource Connections
- SearchHub.io
- https://sease.io/
- Airbnb - Machine Learning-Powered Search Ranking of Airbnb Experiences
- Airbnb - Listing Embeddings in Search Ranking
- Skyscanner - Learning to Rank for Flight Itinerary Search
- Search at Slack
- How Google Search Ranking Works – Darwinism in Search
- How Bing Ranks Search Results: Core Algorithm & Blue Links
- Discover How Cassini (The eBay Search Engine) Works and Rank
- Amazon SEO Explained: How to Rank Your Products #1 in Amazon Search Results in 2020
- Building a Better Search Engine for Semantic Scholar
- Awesome spaCy - natural language understanding, content enrichment, etc.
- Word2Vec For Phrases — Learning Embeddings For More Than One Word
- Gensim Word2Vec Tutorial
- How to incorporate phrases into Word2Vec – a text mining approach
- Word2Vec — a baby step in Deep Learning but a giant leap towards Natural Language Processing
- How to Develop Word Embeddings in Python with Gensim