/understand-full-text-search

📖 Support examples for learning full-text search with use of PostgreSQL. Ready to run.

Understand full-text search with Postgres

forthebadge

Piotr Lewandowski, @constjs


Table of content

  1. Create demo DB tables
  2. Stemmer — Building documents
  3. Search — Building queries
  4. Performance practises
  5. Setting weight and ranking
  6. Improve search quality
  7. Further reading

When full-text search?

  • Search by content created by people (not programmers)
  • Divide more and less important fragments of document
  • Searching database dumps from WikiLeaks

Why just not RegEx?

  • RegEx is good to find only simple, finite languages

  • Helpless for grammar

  • Slow (Can be improved with Trigram Indexes)

  • Lots of pitfalls even for simple languages like HTML

  • Complicated to maintain, e.g.

    ^(?=[A-Z0-9][A-Z0-9@._%+-]{5,253}$)[A-Z0-9._%+-]{1,64}@(?:(?=[A-Z0-9-]{1,63}\.)[A-Z0-9]+(?:-[A-Z0-9]+)*\.){1,8}[A-Z]{2,63}$
    

Why Postgres?

  • Pretty rich in features
  • Flexible and extensible
  • Maybe you already have it
    • Low entry point
    • If your technology stack is already over-engineered