postgrespro/rum

Ad hoc search using RUM indexes

rudibroekhuizen opened this issue · 6 comments

I was wondering how to set up ad hoc search using RUM indexes. For example, I have a table with three columns, and I would like to search on combinations of the fields. When I create a single tsvector column from all columns, the contents of the fields are combined, so I cannot restrict a search to one field.

timestamp    species                location
2019-11-04   Buteo buteo            Texel
2019-10-03   Haliaeetus albicilla   Werkendam
2019-10-01   Athene noctua          Zwolle

Should I create a tsvector column for 'species' and a separate tsvector column for 'location', and use a query like this:

SELECT *
FROM mytable
WHERE timestamp BETWEEN '2019-11-01T13:00:00.000Z' AND '2019-11-15T13:00:00.000'
AND tsv_species @@ websearch_to_tsquery('simple','Buteo')
AND tsv_location @@ websearch_to_tsquery('simple','Texel or Zwolle')
ORDER BY timestamp <=| '2019-11-15T13:00:00.000';
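
For reference, here is a minimal sketch of the setup such a query would assume. It is not a confirmed recommendation from the RUM authors. The column names tsv_species and tsv_location come from the query above; the index names are hypothetical, and the timestamp column is assumed to be named as in the sample table. Each index attaches the timestamp to one tsvector column using the `attach`/`to` options shown in the RUM README.

```sql
-- Assumes the RUM extension is available on the server.
CREATE EXTENSION IF NOT EXISTS rum;

-- One tsvector column per searchable field (names taken from the query above).
ALTER TABLE mytable
    ADD COLUMN tsv_species  tsvector,
    ADD COLUMN tsv_location tsvector;

UPDATE mytable
SET tsv_species  = to_tsvector('simple', coalesce(species, '')),
    tsv_location = to_tsvector('simple', coalesce(location, ''));

-- Hypothetical index names. Each index stores the timestamp alongside
-- the lexemes so that ordering by timestamp distance can use the index.
CREATE INDEX mytable_species_rum ON mytable
    USING rum (tsv_species rum_tsvector_addon_ops, "timestamp")
    WITH (attach = 'timestamp', to = 'tsv_species');

CREATE INDEX mytable_location_rum ON mytable
    USING rum (tsv_location rum_tsvector_addon_ops, "timestamp")
    WITH (attach = 'timestamp', to = 'tsv_location');
```

Whether the planner combines both indexes for a single query, or picks one and filters the rest, would need to be checked with EXPLAIN ANALYZE on real data.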

Or are there better options? For example, can I use multiple rum_tsvector_addon_ops arguments when creating a RUM index?

I thought weight labels could only be used with a GIN index; can they also be used with a RUM index?

BTW, unlike in GIN, weights are present in RUM index posting lists and could be used for filtering within the index.
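
As an illustration of the weight-based approach, here is a minimal sketch using standard PostgreSQL functions. The combined column tsv and the index name are hypothetical; label A is assigned to species and label B to location. Whether the weight restriction is actually evaluated inside the index depends on the RUM version and operator class, so treat that part as an assumption to verify.

```sql
-- Hypothetical combined column: species lexemes get weight A, location gets B.
ALTER TABLE mytable ADD COLUMN tsv tsvector;

UPDATE mytable
SET tsv = setweight(to_tsvector('simple', coalesce(species, '')), 'A')
       || setweight(to_tsvector('simple', coalesce(location, '')), 'B');

-- Hypothetical index name; the timestamp is attached for ordered retrieval.
CREATE INDEX mytable_tsv_rum ON mytable
    USING rum (tsv rum_tsvector_addon_ops, "timestamp")
    WITH (attach = 'timestamp', to = 'tsv');

-- Weight labels in the query restrict each term to one field:
-- 'buteo' must occur in species (A), 'texel' or 'zwolle' in location (B).
SELECT *
FROM mytable
WHERE tsv @@ to_tsquery('simple', 'buteo:A & (texel:B | zwolle:B)');
```

Dropping the labels from the query (for example 'buteo & texel') searches all fields at once, so one column can serve both field-specific and any-field searches.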

Thanks. I did some testing with weights, but they don't seem to work with websearch_to_tsquery, only with to_tsquery. Another limitation is that setweight only offers four labels (A, B, C, D), so if I want to search more than four columns separately, it is not possible this way. Maybe create a tsvector column per column? And concatenate them all to search in any field?