postgrespro/pg_trgm_pro

Feature request: option to include trigrams containing whitespace or special characters


I'd like to be able to include trigrams that contain whitespace or special characters, so that

similarity('foo bar', 'foo bar') > similarity('foo bar', 'bar foo')

I wish I could make it generate the 'o b' and 'r f' trigrams to accomplish this. Similarly, I'd like

similarity('foo/bar', 'foo/bar') > similarity('foo/bar', 'foo bar')

via the 'o/b' and 'o b' trigrams being generated.

I had implemented my own in-memory trigram index for an app, but I now need to move the fuzzy search into the database.
In my implementation I split the input at symbol/whitespace boundaries and yielded trigrams for each word individually,
then made a second pass over all symbols and whitespace that yielded only trigrams that contained a symbol or that did not begin or end with whitespace.
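
For illustration, here is a rough Python sketch of the two-pass scheme described above. It is a reconstruction under assumptions, not the original app code and not how pg_trgm works: the function names are made up, words are not padded with spaces the way pg_trgm pads them, a "symbol" is taken to be any character that is neither alphanumeric nor whitespace, and similarity is a plain shared/union ratio over the trigram sets.

```python
from itertools import groupby


def words(text):
    """Split text into maximal alphanumeric runs (hypothetical helper)."""
    return ["".join(run) for is_word, run in groupby(text, key=str.isalnum) if is_word]


def boundary_trigrams(text):
    """Pass 2: slide over the whole string and keep only trigrams that
    contain a symbol, or contain whitespace that is not at either end."""
    for i in range(len(text) - 2):
        tri = text[i:i + 3]
        has_symbol = any(not c.isalnum() and not c.isspace() for c in tri)
        has_space = any(c.isspace() for c in tri)
        edge_space = tri[0].isspace() or tri[2].isspace()
        if has_symbol or (has_space and not edge_space):
            yield tri


def trigrams(text):
    """Union of per-word trigrams (pass 1) and boundary trigrams (pass 2)."""
    text = text.lower()
    result = set()
    for word in words(text):
        # Pass 1: unpadded trigrams inside each word (an assumption).
        result.update(word[i:i + 3] for i in range(len(word) - 2))
    result.update(boundary_trigrams(text))
    return result


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (assumed metric)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(sorted(trigrams("foo bar")))
print(similarity("foo bar", "foo bar"), similarity("foo bar", "bar foo"))
print(similarity("foo/bar", "foo/bar"), similarity("foo/bar", "foo bar"))
```

With this sketch, 'foo bar' yields 'o b' in addition to the word trigrams, and 'foo/bar' yields 'oo/', 'o/b' and '/ba', so both inequalities above hold.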