Remove unnecessary iteration in Khmer segmenter
xshadowlegendx opened this issue · 2 comments
xshadowlegendx commented
Hello, there is a small improvement possible in this function of the Khmer segmenter: it could loop once and use the iterator directly instead of collecting it beforehand. I can create another PR to update it. What do you think?
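As a rough illustration of the idea (a sketch only, not the actual segmenter code): assume the underlying segmenter yields byte-offset breakpoints starting at 0. The function names `slices_collected` and `slices_lazy` are hypothetical, and a plain `Vec<usize>` stands in for the real breakpoint iterator.

```rust
/// "Before" (hypothetical): collect every breakpoint into a Vec first,
/// then walk the Vec a second time to build the slices.
fn slices_collected<'o>(
    text: &'o str,
    breakpoints: impl Iterator<Item = usize>,
) -> Vec<&'o str> {
    let points: Vec<usize> = breakpoints.collect(); // first pass + allocation
    points
        .windows(2) // second pass over the collected offsets
        .map(|w| &text[w[0]..w[1]])
        .collect()
}

/// "After" (hypothetical): consume the breakpoint iterator once, remembering
/// only the previous offset, and yield each slice lazily.
fn slices_lazy<'o>(
    text: &'o str,
    breakpoints: impl Iterator<Item = usize> + 'o,
) -> impl Iterator<Item = &'o str> + 'o {
    let mut start = 0;
    breakpoints.filter_map(move |end| {
        let piece = &text[start..end];
        start = end;
        // The leading breakpoint at 0 produces an empty slice; skip it.
        if piece.is_empty() {
            None
        } else {
            Some(piece)
        }
    })
}
```

For example, with the text "helloworld" and breakpoints 0, 5, 10, both functions produce "hello" and "world", but the lazy version needs no intermediate `Vec` of offsets. One difference to keep in mind: the lazy version drops empty slices, which the collected version would keep if two breakpoints were equal.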
ManyTheFish commented
Hello @xshadowlegendx,
Yes, it's a good idea. You can create a PR enhancing the segmenter, and I'll review it.
I know you used the ICU segmenter to segment the text, but did you know that an internal dictionary-based segmenter exists and is fast? If you want to try it, there is an example of its usage in the Thai segmenter.
See you!
xshadowlegendx commented
Hello @ManyTheFish, thank you. I will prepare the PR and also take a look at the internal dictionary-based segmenter.