Dictionary load error
Closed this issue · 6 comments
Version 0.5.0:
https://drive.google.com/file/d/14Crq8ywyBdfC1YnDjsv7gZOu70ZH_4cd/view?usp=sharing
https://drive.google.com/file/d/1TYDfxr_j0b3A_h99kmqbul1iFKxiGhfq/view?usp=sharing
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 761 out of bounds for length 385
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 942 out of bounds for length 477
The stack trace is:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 942 out of bounds for length 477
at org.dict.zip.DictZipHeader.getPosition(DictZipHeader.java:398)
at org.dict.zip.DictZipInputStream.seek(DictZipInputStream.java:271)
at org.dict.zip.DictZipInputStream.reset(DictZipInputStream.java:154)
at io.github.eb4j.dsl.impl.EntriesLoaderImpl.skipSpaceTabs(EntriesLoaderImpl.java:270)
at io.github.eb4j.dsl.impl.EntriesLoaderImpl.load(EntriesLoaderImpl.java:106)
at io.github.eb4j.dsl.DslDictionaryLoader.load(DslDictionaryLoader.java:98)
at io.github.eb4j.dsl.DslDictionary.loadDictionary(DslDictionary.java:148)
The command was:
// Backslashes in Java string literals must be escaped ("\\").
DslDictionary dslDictionary = DslDictionary.loadDictionary(
        // new File("c:\\github\\JsoupExperiments_tmp\\test4.dsl"));
        // Paths.get("c:\\github\\JsoupExperiments_tmp\\Apresyan\\En-Ru_Apresyan.dsl"),
        // Paths.get("c:\\github\\JsoupExperiments_tmp\\Apresyan\\En-Ru_Apresyan.dsl.idx")
        // Paths.get("c:\\github\\JsoupExperiments_tmp\\mueller\\Mueller (En-Ru)_new.dsl.dz"),
        // Paths.get("c:\\github\\JsoupExperiments_tmp\\mueller\\Mueller (En-Ru)_new.dsl.idx")
        Paths.get("c:\\github\\JsoupExperiments_tmp\\smirnitsky\\Ru-En-Smirnitsky.dsl.dz"),
        Paths.get("c:\\github\\JsoupExperiments_tmp\\smirnitsky\\Ru-En-Smirnitsky.dsl.idx")
);
When I unpack the .dz file and try again, I get the following:
Exception in thread "main" java.lang.NullPointerException
at io.github.eb4j.dsl.index.DslIndex$Builder.setDictionaryName(DslIndex.java:2123)
at io.github.eb4j.dsl.DslDictionaryLoader.buildIndexFile(DslDictionaryLoader.java:175)
at io.github.eb4j.dsl.DslDictionaryLoader.load(DslDictionaryLoader.java:102)
at io.github.eb4j.dsl.DslDictionary.loadDictionary(DslDictionary.java:148)
Could you place the former files in the project's src/test/resources/content directory and run the tests from source with ./gradlew test?
I think I've already tested it in the class src/test/java/io/github/eb4j/dsl/DslProprietaryTest, and it passed: https://github.com/eb4j/dsl4j/blob/main/src/test/java/io/github/eb4j/dsl/DslProprietaryTest.java#L19-L23
Thank you for the report.
- ArrayIndexOutOfBoundsException is caused by a bug in dictzip v0.12.0 and v0.12.1. The position pointer is incremented when the readFully() method is called. That pointer is then used by the is.reset() method, which throws the exception. It will be fixed in the next dictzip release.
- NullPointerException is caused because your data is in the format "Big-endian UTF-16 Unicode text, with very long lines, with CRLF line terminators". DSL4j supports standard UTF-16 Little-Endian (see the README matrix) and does not recognize BE. DSL4j decides the data is not UTF-16LE, so it tries to parse it as UTF-8 and then fails to build a correct index.
In short, IMHO, your files are in the wrong format. See the README.
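If you want to keep using such files, one option is to re-encode them before loading. The following is only a minimal sketch using the Java standard library, not part of DSL4j: the class name Utf16Normalizer and its methods are hypothetical, it assumes the unpacked .dsl file starts with a UTF-16 byte-order mark, and it assumes that writing a little-endian BOM in the output is acceptable to the loader.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of DSL4j: re-encodes UTF-16BE text as UTF-16LE.
public class Utf16Normalizer {

    /** Returns the charset indicated by a UTF-16 byte-order mark, or null if there is none. */
    static Charset detectUtf16Bom(byte[] data) {
        if (data.length >= 2) {
            int b0 = data[0] & 0xFF;
            int b1 = data[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) {
                return StandardCharsets.UTF_16BE; // BOM bytes FE FF
            }
            if (b0 == 0xFF && b1 == 0xFE) {
                return StandardCharsets.UTF_16LE; // BOM bytes FF FE
            }
        }
        return null;
    }

    /** Converts BOM-marked UTF-16BE data to UTF-16LE with a BOM; other input is returned unchanged. */
    static byte[] toUtf16Le(byte[] data) {
        if (!StandardCharsets.UTF_16BE.equals(detectUtf16Bom(data))) {
            return data;
        }
        // Skip the 2-byte big-endian BOM and decode the rest of the data.
        String text = new String(data, 2, data.length - 2, StandardCharsets.UTF_16BE);
        byte[] body = text.getBytes(StandardCharsets.UTF_16LE); // writes no BOM by itself
        byte[] out = new byte[body.length + 2];
        out[0] = (byte) 0xFF; // little-endian BOM (assumed to be wanted by the reader)
        out[1] = (byte) 0xFE;
        System.arraycopy(body, 0, out, 2, body.length);
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path in = Paths.get(args[0]);  // unpacked .dsl file, possibly UTF-16BE
        Path out = Paths.get(args[1]); // destination for the UTF-16LE copy
        Files.write(out, toUtf16Le(Files.readAllBytes(in)));
    }
}
```

It would be run as java Utf16Normalizer input.dsl output.dsl on the unpacked file. It reads the whole file into memory, which is simple but may be too much for very large dictionaries; a streaming Reader/Writer pair would be the alternative.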
I see, thank you. On the other hand, I picked random files from the internet. They are quite old, and I think people made them and used them, so they are "almost OK". Anyway, I've integrated them into my reading app, thank you: https://4pda.to/forum/index.php?s=&showtopic=995536&view=findpost&p=113572389
resolved.