Full text search: unexpected results when text search vector includes a forward slash
Actionb opened this issue · 0 comments
Actionb commented
Example:
Ausgabe object with the name 2020/21-06
cannot be found when searching for 21
or 06
.
Cause:
Postgres recognizes the vector string as a file and uses the file
parser, which does not split the string/token as expected.
SELECT * FROM ts_debug('simple_unaccent'::regconfig, '2020/21-06');
alias | description | token | dictionaries | dictionary | lexemes
-------+-------------------+------------+--------------+------------+--------------
file | File or path name | 2020/21-06 | {simple} | simple | {2020/21-06}
SELECT to_tsvector('2020/21-06');
to_tsvector
----------------
'2020/21-06':1
Possible fixes:
- disable or change the parser. Seems difficult: https://dba.stackexchange.com/questions/271419/how-postgresql-full-text-search-disable-file-parser and https://stackoverflow.com/questions/896752/postgresql-how-to-make-full-text-search-ignore-certain-tokens
- replace the slashes before inserting the vector into the index. Requires overwriting the SQL form the tsvector_field package.
- use a different delimiter, e.g. backslash:
2020\21-06
:SELECT to_tsvector('2020\21-06'); '-06':3 '2020':1 '21':2