duydo/elasticsearch-analysis-vietnamese

Support for ES 8.3?

mofolo opened this issue · 8 comments

Hi, thank you for your plugin.
Is there any support for Elasticsearch 8.3?

duydo commented

@mofolo Thanks for being interested in the plugin.
I'm working on upgrading it to Elasticsearch 8.x.

Great work, thank you.

I managed to build your changes on v8.3.3 by skipping the tests - unfortunately the plugin didn't seem to work.

duydo commented

> I managed to build your changes on v8.3.3 by skipping the tests - unfortunately the plugin didn't seem to work.

This branch contains the code for v8.3.3; you can try it.

Hey @duydo, a question: does the plugin need to have the tokenizer on the same machine for it to work, or is the tokenizer bundled into the plugin JAR itself?
Is there any way to use this plugin on an Elasticsearch deployment created with Elastic Cloud?

duydo commented

@mofolo

> Hey @duydo, a question: does the plugin need to have the tokenizer on the same machine for it to work, or is the tokenizer bundled into the plugin JAR itself?

Yes, you need to have the C++ tokenizer on the same machine running Elasticsearch.

> Is there any way to use this plugin on an Elasticsearch deployment created with Elastic Cloud?

It seems that Elastic Cloud does not support installing custom plugins that load native libraries via JNI.
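
To illustrate why the native tokenizer has to be on the Elasticsearch host: JNI resolves a native library from the local `java.library.path` when the JVM loads it, so a JAR alone is not enough. Below is a minimal sketch of a probe for that condition. It is not code from the plugin; the class name `NativeLibraryProbe` and the library name `example_tokenizer_jni` are hypothetical and used only for illustration.

```java
// Hypothetical probe, not part of the plugin: checks whether a JNI library
// can be resolved on the machine running this JVM.
final class NativeLibraryProbe {

    // Returns true if the named library can be loaded from java.library.path.
    static boolean isLoadable(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            // The library is missing on this host, or loading is not permitted.
            return false;
        }
    }

    public static void main(String[] args) {
        // "example_tokenizer_jni" is an assumed name, not the plugin's real library.
        String name = args.length > 0 ? args[0] : "example_tokenizer_jni";
        System.out.println(name + " loadable: " + isLoadable(name));
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
    }
}
```

On a host where the library is not installed, the probe reports that it is not loadable, which is the situation on a managed service that does not let you place custom binaries on the node.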

Yes, you're right: Elastic Cloud doesn't allow us to load custom binaries onto the machines running Elasticsearch either.

Would it be viable (although not as effective) to fork a version of the repo that utilises the vnTokenizer
(https://github.com/duydo/vn-nlp-libraries) and package everything into the JAR?

duydo commented

@mofolo

> Would it be viable (although not as effective) to fork a version of the repo that utilises the vnTokenizer
> (https://github.com/duydo/vn-nlp-libraries) and package everything into the JAR?

Yes, vnTokenizer is implemented in pure Java, so you can package it into the JAR file. You can refer to this branch: https://github.com/duydo/elasticsearch-analysis-vietnamese/tree/vntokenizer