deeplearning4j/deeplearning4j

Importing Keras Tokenizer from Json Exception Java

zax-ma opened this issue · 0 comments

Hello,

I can't import a tokenizer that was created in Python and saved to JSON as recommended for the method KerasTokenizer.fromJson.

Please help me resolve the issue. What could I be doing wrong?

Thank you,

Issue Description

Please describe your issue, along with:

  • expected behavior:
    A tokenizer is created in Python and saved to JSON with:

import io
import json

tokenizer_json = tokenizer.to_json()
with io.open('tokenizer.json', 'w', encoding='utf-8') as f:
    f.write(json.dumps(tokenizer_json, ensure_ascii=False))

It is then imported into the Java project with:

KerasTokenizer.fromJson(TOKEN_PATH);

A tokenizer object should be created as a result.

  • encountered behavior:

The following exception is thrown:

Exception in thread "main" org.nd4j.shade.jackson.databind.exc.MismatchedInputException: Cannot construct instance of java.util.HashMap (although at least one Creator exists): no String-argument constructor/factory method to deserialize from String value ('{"class_name": "Tokenizer", "config": {"num_words": 10, "filters": "!"#$%&()*+,-./:;<=>?@[]^_{|}~tn", "lower": true, "split": " ", "char_level": false, "oov_token": null, "document_count": 23, "word_counts": "{"url1": 12, "url2": 12, "url3": 10, "url4": 16, "url5": 15, "url6": 16}", "word_docs": "{"url1": 12, "url2": 10, "url3": 10, "url4": 16, "url5": 14, "url6": 13}", "index_docs": "{"4": 12, "5": 10, "6": 10, "1": 16, "3": 14, "2": 13}", "index_word": "{"1": "url4", "2": "url6", "3": "url5", "4": "url1", "5": "url2", "6": "url3"}", "word_index": "{"url4": 1, "url6": 2, "url5": 3, "url1": 4, "url2": 5, "url3": 6}"}}') at [Source: (String)""{\"class_name\": \"Tokenizer\", \"config\": {\"num_words\": 10, \"filters\": \"!\"#$%&()*+,-.\/:;<=>?@[]^_{|}~tn", "lower": true, "split": " ", "char_level": false, "oov_token": null, "document_count": 23, "word_counts": "{"url1": 12, "url2": 12, "url3": 10, "url4": 16, "url5": 15, "url6": 16}", "word_docs": "{"url1": 12, "url2": 10, "url3": 10, "url4": 16, "url5": 14, "url6": 13}", "index_docs": "{"4": 12, "5": 10, "6": 10, "1": 16, "3"[truncated 246 chars]; line: 1, column: 1]
at org.nd4j.shade.jackson.databind.exc.MismatchedInputException.from(MismatchedInputException.java:63)
at org.nd4j.shade.jackson.databind.DeserializationContext.reportInputMismatch(DeserializationContext.java:1728)
at org.nd4j.shade.jackson.databind.DeserializationContext.handleMissingInstantiator(DeserializationContext.java:1353)
at org.nd4j.shade.jackson.databind.deser.std.StdDeserializer._deserializeFromString(StdDeserializer.java:311)
at org.nd4j.shade.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:444)
at org.nd4j.shade.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:32)
at org.nd4j.shade.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:323)
at org.nd4j.shade.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4674)
at org.nd4j.shade.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3629)
at org.nd4j.shade.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3612)
at org.deeplearning4j.nn.modelimport.keras.utils.KerasModelUtils.parseJsonString(KerasModelUtils.java:400)
at org.deeplearning4j.nn.modelimport.keras.preprocessing.text.KerasTokenizer.fromJson(KerasTokenizer.java:107)
at com.example.importh5.ImportKerasModel.main(ImportKerasModel.java:40)
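
One possible cause, judging from the exception text: Jackson reports that it received a String value where it expected a map, and the source shown in the message starts with an extra quote followed by escaped quotes. That pattern is what double JSON encoding produces. The sketch below reproduces the effect using only the Python standard library. It assumes that tokenizer.to_json() already returns serialized JSON text; the small dictionary is a hypothetical stand-in for the real tokenizer config, and nothing here has been run against Keras or Deeplearning4j.

```python
import json

# Hypothetical stand-in for the text returned by tokenizer.to_json().
# Assumption: to_json() already returns a serialized JSON string.
tokenizer_json = json.dumps({"class_name": "Tokenizer", "config": {"num_words": 10}})

# What the saving snippet above writes: the string is encoded a second time,
# so the file holds one quoted JSON string instead of a JSON object.
double_encoded = json.dumps(tokenizer_json, ensure_ascii=False)

# What a map-based parser needs: the JSON text itself, written unchanged.
single_encoded = tokenizer_json

parsed_twice = json.loads(double_encoded)  # a str, not a dict
parsed_once = json.loads(single_encoded)   # a dict

print(type(parsed_twice).__name__, type(parsed_once).__name__)
```

If this is the cause, writing the string directly with f.write(tokenizer_json) instead of f.write(json.dumps(tokenizer_json, ensure_ascii=False)) may be enough. This has not been verified against version 1.0.0-M2.1.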

Version Information

Please indicate relevant versions, including, if relevant:

  • Deeplearning4j version
  • Platform information (OS, etc)
  • CUDA version, if used
  • NVIDIA driver version, if in use

Additional Information

Where applicable, please also provide:

Deeplearning4j version: 1.0.0-M2.1

  • Full log or exception stack trace (ideally in a Gist: gist.github.com)
  • pom.xml file or similar (also in a Gist)

Contributing

If you'd like to help us fix the issue by contributing some code, but would
like guidance or help in doing so, please mention it!