FreeLanguageTools/vocabsieve

Import from KOReader Vocab Builder cannot find the files it needs

Closed · 3 comments

billyc commented

Describe the bug

Import from KOReader Vocab Builder fails upon selecting the folder with all of my KOReader files. The error dialog appears: "Check if you've picked the right directory. It should be a folder containing both all your books and KOReader settings."

The folder I select is the folder containing all of the KOReader data files synced from my Android KOReader installation. This includes a "books" folder, which includes .epub files and a subfolder for each book with the appropriate .sdr folder within it. The folder also contains the KOReader settings folder which contains the vocabulary_builder.sqlite3 database.
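To double-check the layout described above, here is a minimal Python sketch that lists the vocab builder database and the `.sdr` folders found under the synced folder. The function name `summarize_koreader_folder` and the example path are hypothetical, and this is not how VocabSieve itself validates the directory; it only confirms that the files are present.

```python
from pathlib import Path


def summarize_koreader_folder(root):
    """Return (databases, sdr_dirs) found anywhere under root.

    Hypothetical helper for manual checking only; VocabSieve's own
    directory validation may look for different things.
    """
    root = Path(root)
    # The vocab builder database lives in the KOReader settings folder.
    databases = sorted(
        p for p in root.rglob("vocabulary_builder.sqlite3") if p.is_file()
    )
    # Each book has a matching .sdr folder holding its metadata.
    sdr_dirs = sorted(p for p in root.rglob("*.sdr") if p.is_dir())
    return databases, sdr_dirs


if __name__ == "__main__":
    # Assumed location of the synced folder; adjust as needed.
    dbs, sdrs = summarize_koreader_folder(Path.home() / "koreader")
    print("databases:", [str(p) for p in dbs])
    print(".sdr folders:", [str(p) for p in sdrs])
```

If both lists come back non-empty, the folder contents are not the cause of the error.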

To Reproduce

Steps to reproduce the behavior:

  1. Copy or sync all files from the KOReader device to a folder on a desktop computer (macOS in this case)
  2. Click on Import > KOReader Vocab Builder
  3. Select the folder containing the synced KOReader files
  4. See error: the error dialog attached below appears.

Expected behavior

The import should work, since all books, `.sdr` folders, and the vocab builder sqlite3 file are present in the synced folder.

Screenshots

image

Desktop (please complete the following information):

  • OS: macOS 12.6.4 "Monterey"
  • VocabSieve version: v0.10.1, installed from the .dmg file on the GitHub Releases page

Additional context

KOReader has a "cloud sync" option which syncs just the sqlite3 file with the vocab builder data, but apparently that is not sufficient. The error dialog says the books (and I assume the .sdr folders) must also be present. So, I have copied the entire KOReader folder instead using "SyncThing", and I have verified that all of the files are present in the synced folder on my Mac desktop.

Happy to provide more details or do testing/triage.

1over137 commented

Why would there be .srt files for books? Having extra files shouldn't affect the import in any way, though. The vocab builder uses the metadata that resides in the corresponding .sdr folders. Can you show the folder structure in detail?

billyc commented

@1over137 I meant the .sdr folder, not .srt. My apologies, I've updated the original bug description. Just a typo.

Here is a screenshot of the synced folder. You can see that the entire koreader tree is there, and I've drilled down into koreader/books/Çinar Ağacı.sdr/ so you can see .epub and the .sdr folder contents:

image

and here is the content of the settings folder:

image

Here is a .zip file with the entire contents of the synced koreader folder, in case that helps us debug what's going on...

https://tubcloud.tu-berlin.de/s/RJwXk5T8XXLYWkk

billyc commented

I found the problem: it is related to books whose language code includes a region suffix, such as en-US. Any book whose language is set to the correct main language but with an additional region code fails to get parsed.

In my case I set my target language in VocabSieve to Turkish (language code "tr"), but all of my ePubs have their language code set to "tr-TR", and thus the import fails.

The PR #73 linked above fixes this.
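For illustration, here is a minimal sketch of the kind of normalization that avoids this mismatch: compare only the primary language part of each code, so "tr-TR" matches "tr". The function names `primary_language` and `same_language` are hypothetical, and this is not necessarily the code in the PR.

```python
def primary_language(code):
    """Return the lowercase primary part of a language code.

    Hypothetical helper: "tr-TR" -> "tr", "en_US" -> "en", "tr" -> "tr".
    """
    # Treat "_" and "-" the same, then keep only the part before the region.
    return code.strip().replace("_", "-").split("-", 1)[0].lower()


def same_language(book_lang, target_lang):
    """True if a book's language matches the target, ignoring any region code."""
    return primary_language(book_lang) == primary_language(target_lang)


print(same_language("tr-TR", "tr"))  # True
print(same_language("en-US", "tr"))  # False
```

A plain string comparison of "tr-TR" against "tr" fails, which matches the behavior I saw during the import.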