meilisearch/charabia

Remove unnecessary iteration in Khmer segmenter

xshadowlegendx opened this issue · 2 comments

Hello, there is a small improvement to make in this function of the Khmer segmenter: it could loop only once and use the iterator directly instead of collecting it beforehand. I can create another PR to update it. What do you think?
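To illustrate the idea, here is a minimal sketch and not the actual charabia code. `breakpoints` is a hypothetical stand-in for whatever yields word-boundary byte offsets (here it simply breaks around ASCII spaces), and `segment_eager` / `segment_lazy` are illustrative names. The eager version collects the offsets and then loops again; the lazy version slices the text while walking the breakpoint iterator once.

```rust
/// Hypothetical stand-in for a word-break source: yields byte offsets of
/// boundaries around each ASCII space, followed by the end of the text.
fn breakpoints(text: &str) -> impl Iterator<Item = usize> + '_ {
    text.char_indices()
        .filter(|&(_, c)| c == ' ')
        .flat_map(|(i, c)| [i, i + c.len_utf8()])
        .chain(std::iter::once(text.len()))
}

/// Eager variant: collects every breakpoint first, then loops a second time
/// to build the slices.
fn segment_eager(text: &str) -> Vec<&str> {
    let points: Vec<usize> = breakpoints(text).collect();
    let mut out = Vec::new();
    let mut start = 0;
    for end in points {
        if end > start {
            out.push(&text[start..end]);
        }
        start = end;
    }
    out
}

/// Lazy variant: a single pass, with no intermediate collection. Each slice
/// is produced only when the caller pulls the next item.
fn segment_lazy<'a>(text: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    let mut start = 0;
    breakpoints(text).filter_map(move |end| {
        let piece = &text[start..end];
        start = end;
        // Skip empty slices produced by adjacent breakpoints.
        (!piece.is_empty()).then_some(piece)
    })
}
```

Both variants yield the same segments; the lazy one just avoids the temporary `Vec` and the second loop, and a caller that still needs a `Vec` can call `.collect()` on it.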

Hello @xshadowlegendx,
Yes, it's a good idea. You can create a PR enhancing the segmenter, and I'll review it.
I know you used the ICU segmenter to segment the text, but did you know that an internal dictionary-based segmenter exists and is fast? If you want to try it, there is an example of its usage in the Thai segmenter.

See you!
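For readers unfamiliar with the term, the following is a rough sketch of what "dictionary-based" segmentation means in general. It is not charabia's internal segmenter and does not use its API: `segment_longest_match` is a hypothetical name, the dictionary is a plain `HashSet` assumed for illustration, and this naive greedy longest-match loop is quadratic in the worst case.

```rust
use std::collections::HashSet;

/// Greedy longest-match segmentation against a word list (illustrative only).
/// At each position, takes the longest dictionary word that starts there,
/// falling back to a single character when nothing matches.
fn segment_longest_match<'a>(text: &'a str, dict: &HashSet<&str>) -> Vec<&'a str> {
    // Byte offsets of every char boundary, including the end of the text.
    let bounds: Vec<usize> = text
        .char_indices()
        .map(|(i, _)| i)
        .chain(std::iter::once(text.len()))
        .collect();

    let mut out = Vec::new();
    let mut i = 0; // index into `bounds`
    while i + 1 < bounds.len() {
        let start = bounds[i];
        // Try the longest candidate first and shrink one char at a time.
        let mut j = bounds.len() - 1;
        while j > i + 1 && !dict.contains(&text[start..bounds[j]]) {
            j -= 1;
        }
        out.push(&text[start..bounds[j]]);
        i = j;
    }
    out
}
```

With a dictionary containing `"ab"`, `"abc"`, and `"d"`, the input `"abcdx"` is split into `"abc"`, `"d"`, and `"x"`: the longest match wins and the unknown trailing character is emitted on its own.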

Hello @ManyTheFish, thank you. I will prepare the PR and also take a look at the internal dictionary-based segmenter.