Windows, Error opening data file C:\Program Files\Tesseract-OCR\tessdata/cse.traineddata
KirandebDivad opened this issue · 2 comments
Current Behavior
On Windows, using another language does not work in any version from tesseract-ocr-setup-3.05.02-20180621 through tesseract-ocr-w64-setup-5.4.0.20240606. It fails with the error
Error opening data file C:\Program Files\Tesseract-OCR/tessdata/cse.traineddata
or with TESSDATA_PREFIX=C:\Program Files\Tesseract-OCR\tessdata
C:\Program Files\Tesseract-OCR\tessdata/cse.traineddata
The '/ ' backslash is wrong, and on Windows this path will not work.
Expected Behavior
No response
Suggested Fix
Correct the separator in the appended path
from
/any_language.traineddata
to
\any_language.traineddata
tesseract -v
from
tesseract-ocr-setup-3.05.02-20180621
to
tesseract-ocr-w64-setup-5.4.0.20240606
Operating System
Windows 11
Other Operating System
No response
uname -a
No response
Compiler
https://digi.bib.uni-mannheim.de/tesseract/
CPU
No response
Virtualization / Containers
No response
Other Information
No response
The '/ ' backslash is wrong, and on Windows this path will not work.
Learn the system you try to use:
- '/' is a forward slash, and '\' is a backslash
- '/' has worked without problems for years:
> dir "C:\Program Files\Tesseract-OCR/tessdata"
Volume in drive C is OS
Volume Serial Number is 8AA5-2E4A
Directory of C:\Program Files\Tesseract-OCR\tessdata
26.11.2023 15:06 <DIR> .
26.11.2023 15:06 <DIR> ..
26.11.2023 15:06 <DIR> configs
05.10.2023 21:11 4 113 088 eng.traineddata
16.01.2019 22:53 33 eng.user-patterns
16.01.2019 22:53 27 eng.user-words
05.10.2023 21:14 128 076 jaxb-api-2.3.1.jar
05.10.2023 21:11 10 562 727 osd.traineddata
05.10.2023 21:36 572 pdf.ttf
05.10.2023 21:14 125 187 piccolo2d-core-3.0.1.jar
05.10.2023 21:14 149 558 piccolo2d-extras-3.0.1.jar
26.11.2023 15:06 <DIR> script
05.10.2023 21:14 26 376 ScrollView.jar
26.11.2023 15:06 <DIR> tessconfigs
9 File(s) 15 105 644 bytes
5 Dir(s) 16 740 601 856 bytes free
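The same point can be checked without a Windows machine. The following is a minimal sketch using Python's `ntpath` module, which models Windows path rules on any platform; it illustrates how Windows-style parsing treats both separators and is not a statement about Tesseract's own path handling.

```python
import ntpath  # Windows path semantics, importable on any OS

# The exact path from the error message, with mixed separators.
mixed = r"C:\Program Files\Tesseract-OCR/tessdata/cse.traineddata"

# normpath rewrites '/' to '\' under Windows rules; both spellings
# therefore name the same file.
normalized = ntpath.normpath(mixed)
print(normalized)

# Splitting also recognizes '/' as a separator.
directory, filename = ntpath.split(mixed)
print(filename)
```

If the mixed separators were invalid, the split would not isolate `cse.traineddata` as the file name.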
Learn the program you try to "support":
The error:
Error opening data file C:\Program Files\Tesseract-OCR/tessdata/cse.traineddata
is generated by Tesseract because the file cse.traineddata does not exist.
The correct language code is ces, and the file is ces.traineddata.
It would be nicer to change the error message for this case to:
Error: language data file does not exist: C:\Program Files\Tesseract-OCR/tessdata/cse.traineddata
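The diagnosis above (a mistyped language code rather than a bad separator) can be checked before calling Tesseract. This is a rough sketch only: `explain_missing_language` is a hypothetical helper, not part of Tesseract or any wrapper library, and it assumes the models sit directly in the tessdata directory as `<lang>.traineddata`.

```python
import difflib
from pathlib import Path


def explain_missing_language(tessdata_dir, lang):
    """Return a readable diagnosis for a language code (hypothetical helper)."""
    tessdata = Path(tessdata_dir)
    model = tessdata / f"{lang}.traineddata"
    if model.is_file():
        return f"Found {model}"

    # Collect the language codes that are actually installed.
    available = sorted(p.stem for p in tessdata.glob("*.traineddata"))

    # Suggest installed codes that look like a typo of the requested one.
    close = difflib.get_close_matches(lang, available, n=3)

    message = f"Language data file does not exist: {model}"
    if close:
        message += f" (did you mean: {', '.join(close)}?)"
    return message
```

Called with the tessdata directory and `"cse"`, the helper would suggest `ces` when `ces.traineddata` is installed. The installed languages can also be listed with `tesseract --list-langs`.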
@KirandebDivad, Microsoft has supported '/' as a path separator since its first versions of MS-DOS and still does so in its latest Windows versions.
Users should normally not set TESSDATA_PREFIX. If your cse.traineddata exists at the given path, it might be broken. A very common user error is using a wrong download URL which results in an HTML file instead of a Tesseract model.
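A quick way to test for the HTML-download mistake is to look at the first bytes of the file. The sketch below is a heuristic under stated assumptions: `looks_like_html` is a hypothetical helper, it only detects files that begin with an HTML doctype or `<html` tag, and it does not validate the traineddata format itself.

```python
def looks_like_html(path, probe_size=512):
    """Heuristic: True if the file starts like an HTML page (hypothetical helper)."""
    with open(path, "rb") as handle:
        head = handle.read(probe_size)

    # Drop an optional UTF-8 byte order mark and leading whitespace,
    # then compare case-insensitively.
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    head = head.lstrip().lower()
    return head.startswith((b"<!doctype html", b"<html"))
```

If this returns `True` for a `.traineddata` file, the download most likely saved a web page rather than the model, and the file should be fetched again from the raw file URL.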
And please note that there exists no support team for Tesseract. You cannot expect that a few volunteers support the whole world. Therefore questions (and your issue is a question!) should be asked in the user forum.