English dictionary stem words are different from the example
thechrisoshow opened this issue · 4 comments
thechrisoshow commented
My problem started when I tried replicating the example in the readme:
class BoringTweet < ActiveRecord::Base
  include PgSearch::Model
  pg_search_scope :kinda_matching,
                  against: :text,
                  using: {
                    tsearch: {dictionary: "english"}
                  }
  pg_search_scope :literally_matching,
                  against: :text,
                  using: {
                    tsearch: {dictionary: "simple"}
                  }
end
sleepy = BoringTweet.create! text: "I snoozed my alarm for fourteen hours today. I bet I can beat that tomorrow! #sleepy"
sleeping = BoringTweet.create! text: "You know what I like? Sleeping. That's what. #enjoyment"
sleeper = BoringTweet.create! text: "Have you seen Woody Allen's movie entitled Sleeper? Me neither. #boycott"
BoringTweet.kinda_matching("sleeping") # => [sleepy, sleeping, sleeper]
BoringTweet.literally_matching("sleeping") # => [sleeping]
When I tried this, a 'kinda_matching' search for 'sleeping' returned only the 'sleeping' record. Looking into it, the stems for sleepy, sleeping and sleeper turn out to be different:
select to_tsvector('sleepy');
=> 'sleepi':1
select to_tsvector('sleeping');
=> 'sleep':1
select to_tsvector('sleeper');
=> 'sleeper':1
Are there perhaps different versions of the 'english' dictionary?
I'm running PostgreSQL 14.4 on aarch64-apple-darwin20.6.0, compiled by Apple clang version 12.0.5 (clang-1205.0.22.9), 64-bit
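To make the mismatch concrete, here is a minimal Ruby sketch of why only one record comes back. It assumes a plain tsearch query (no prefix matching) matches a record only when the record's lexeme equals the query's lexeme, and it simplifies each tweet down to its one relevant word. The `STEMS` table and the `kinda_matches` helper are hypothetical names for illustration, not part of pg_search; the lexemes are copied from the to_tsvector output above.

```ruby
# Lexemes copied from the to_tsvector output reported in this issue.
# Each tweet is reduced to its one relevant word (a simplification).
STEMS = {
  "sleepy"   => "sleepi",
  "sleeping" => "sleep",
  "sleeper"  => "sleeper"
}.freeze

# Hypothetical helper: returns the words whose lexeme equals the
# lexeme of the query word, mimicking an exact lexeme comparison.
def kinda_matches(query, stems = STEMS)
  query_stem = stems.fetch(query)
  stems.select { |_word, stem| stem == query_stem }.keys
end

kinda_matches("sleeping") # => ["sleeping"]
```

Because the three words reduce to three different lexemes, no query for one of them can reach the other two under this assumption.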
nertzy commented
Interesting! I wrote these examples over a decade ago and haven't thought to keep checking them. I'm going to see what I can find out.
nertzy commented
Indeed, I also get the same results. I'll update the examples!
nertzy commented
Fixed!
thechrisoshow commented
Thanks! Happy it wasn't just me!