spinscale/elasticsearch-ingest-langdetect

Exception if the field is an empty string, even though ignore_missing is true

greenx opened this issue · 0 comments

I set ignore_missing to true.
If the field does not exist, everything is fine, but if the field contains an empty string, an exception is thrown.

# curl -s -H 'content-type: application/json; charset=UTF-8' "http://localhost:9205/doc-2019-13/searchtype/2?pretty=true&pipeline=langdetect" -d'{"name":"4","body2":{"txt":""}}'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "exception",
        "reason" : "java.lang.IllegalArgumentException: com.cybozu.labs.langdetect.LangDetectException: no features in text",
        "header" : {
          "processor_type" : "langdetect"
        }
      }
    ],
    "type" : "exception",
    "reason" : "java.lang.IllegalArgumentException: com.cybozu.labs.langdetect.LangDetectException: no features in text",
    "caused_by" : {
      "type" : "illegal_argument_exception",
      "reason" : "com.cybozu.labs.langdetect.LangDetectException: no features in text",
      "caused_by" : {
        "type" : "lang_detect_exception",
        "reason" : "no features in text"
      }
    },
    "header" : {
      "processor_type" : "langdetect"
    }
  },
  "status" : 500
}

However, adding a condition to the processor works around the problem:

      "langdetect": {
        "if": "ctx.body2.txt != \"\"",                                                                                                                                                       
        "field":"body2.txt",
        "target_field": "body2.lang",
        "ignore_missing": true
      }  

But logically, ignore_missing should cover this case too, so an empty string should pass through without an exception.
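
For anyone who needs a workaround in the meantime, here is a slightly more defensive variant of that condition, as a sketch only. It assumes Painless's null-safe `?.` operator is available for ingest conditions in your Elasticsearch version, so the condition does not fail when body2 itself is absent. The surrounding pipeline body and its description are hypothetical; only the langdetect settings are taken from this report.

```json
{
  "description": "hypothetical pipeline body; only the langdetect settings come from this report",
  "processors": [
    {
      "langdetect": {
        "if": "ctx.body2?.txt != null && ctx.body2.txt != ''",
        "field": "body2.txt",
        "target_field": "body2.lang",
        "ignore_missing": true
      }
    }
  ]
}
```

This only guards against a missing field and the empty string. Whether other inputs, such as whitespace-only text, trigger the same "no features in text" exception is not covered here.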