Better tokenizer
ptbrowne commented
"fromOldClient" does not return any result in the docs.cozy.io search, whereas "CozyClient.fromOldClient" does.
IMHO this is caused by the tokenizer: it does not treat the dot as a word separator, so "fromOldClient" is never indexed as a word of its own.
I think we tweaked the tokenizer because we needed doctypes to be returned as is. That is, the dot should not split doctypes like "io.cozy.bills", but it should split "CozyClient.fromOldClient".
Since doctypes can be recognized by having at least two dots and no leading capital letter, we could maybe improve the tokenizer to support both cases.
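To make the idea concrete, here is a rough sketch of such a tokenizer in Python. It is not the tokenizer used by the docs.cozy.io search: the `tokenize` function and the `DOCTYPE_RE` pattern are hypothetical names, and the doctype shape (lowercase segments joined by at least two dots) is only the heuristic described above.

```python
import re

# Assumed doctype shape: lowercase segments joined by at least two dots,
# e.g. "io.cozy.bills". This is the heuristic from the issue text, not the
# actual search configuration.
DOCTYPE_RE = re.compile(r"^[a-z][a-z0-9_]*(?:\.[a-z][a-z0-9_]*){2,}$")


def tokenize(text):
    """Split text into search tokens, keeping doctype-like strings whole."""
    tokens = []
    # First split on anything that is neither a word character nor a dot.
    for chunk in re.split(r"[^\w.]+", text):
        chunk = chunk.strip(".")  # drop leading/trailing dots (end of sentence)
        if not chunk:
            continue
        if DOCTYPE_RE.match(chunk):
            # Looks like a doctype: keep it as a single token.
            tokens.append(chunk)
        else:
            # Anything else: the dot acts as a word separator.
            tokens.extend(part for part in chunk.split(".") if part)
    return tokens
```

With this sketch, `tokenize("CozyClient.fromOldClient")` yields the two tokens `CozyClient` and `fromOldClient`, while `tokenize("io.cozy.bills")` yields the single token `io.cozy.bills`. One known limitation of the heuristic: an all-lowercase hostname such as docs.cozy.io has the same shape as a doctype and would also be kept whole.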
See line 221 in ffda0e0.