tesseract-ocr/tesseract

Windows, Error opening data file C:\Program Files\Tesseract-OCR\tessdata/cse.traineddata

KirandebDivad opened this issue · 2 comments

Current Behavior

On Windows, using another language does not work in any version from tesseract-ocr-setup-3.05.02-20180621 to tesseract-ocr-w64-setup-5.4.0.20240606. It fails with the error
Error opening data file C:\Program Files\Tesseract-OCR/tessdata/cse.traineddata
or with TESSDATA_PREFIX=C:\Program Files\Tesseract-OCR\tessdata
C:\Program Files\Tesseract-OCR\tessdata/cse.traineddata

The '/ ' backslash is wrong, and on Windows this path will not work.

Expected Behavior

No response

Suggested Fix

Correct the appended path separator
from
/any_language.traineddata
to
\any_language.traineddata

tesseract -v

from
tesseract-ocr-setup-3.05.02-20180621
to
tesseract-ocr-w64-setup-5.4.0.20240606

Operating System

Windows 11

Other Operating System

No response

uname -a

No response

Compiler

https://digi.bib.uni-mannheim.de/tesseract/

CPU

No response

Virtualization / Containers

No response

Other Information

No response

The '/ ' backslash is wrong, and on Windows this path will not work.

Learn the system you are trying to use:

  1. '/' is forward slash, and '\' is backslash
  2. '/' has worked without problems for years:
> dir "C:\Program Files\Tesseract-OCR/tessdata"
 Volume in drive C is OS
 Volume Serial Number is 8AA5-2E4A

 Directory of C:\Program Files\Tesseract-OCR\tessdata

26.11.2023  15:06    <DIR>          .
26.11.2023  15:06    <DIR>          ..
26.11.2023  15:06    <DIR>          configs
05.10.2023  21:11         4 113 088 eng.traineddata
16.01.2019  22:53                33 eng.user-patterns
16.01.2019  22:53                27 eng.user-words
05.10.2023  21:14           128 076 jaxb-api-2.3.1.jar
05.10.2023  21:11        10 562 727 osd.traineddata
05.10.2023  21:36               572 pdf.ttf
05.10.2023  21:14           125 187 piccolo2d-core-3.0.1.jar
05.10.2023  21:14           149 558 piccolo2d-extras-3.0.1.jar
26.11.2023  15:06    <DIR>          script
05.10.2023  21:14            26 376 ScrollView.jar
26.11.2023  15:06    <DIR>          tessconfigs
               9 File(s)     15 105 644 bytes
               5 Dir(s)  16 740 601 856 bytes free
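
The same point can be checked without a Windows machine. The following is a minimal sketch using Python's `pathlib.PureWindowsPath`, which models Windows path rules on any host OS; it only parses the two spellings from this issue and does not touch the file system or call Tesseract.

```python
from pathlib import PureWindowsPath

# The path exactly as it appears in the error message (mixed separators).
mixed = PureWindowsPath(r"C:\Program Files\Tesseract-OCR/tessdata/cse.traineddata")

# The same path written with backslashes only.
backslashes = PureWindowsPath(r"C:\Program Files\Tesseract-OCR\tessdata\cse.traineddata")

# Under Windows path rules both '/' and '\' are separators,
# so the two spellings parse to identical components.
print(mixed == backslashes)
print(mixed.parts)
```

If the comparison prints `True`, the mixed separator in the error message is not what makes the file unreachable.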

Learn the program you are trying to "support":

The error:
Error opening data file C:\Program Files\Tesseract-OCR/tessdata/cse.traineddata
is generated by Tesseract because cse.traineddata does not exist.

The correct language code is ces, and the file is ces.traineddata.

Be nicer, and suggest improving the error message for this case to:
Error: language data file does not exist: C:\Program Files\Tesseract-OCR/tessdata/cse.traineddata
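
Until the message changes, a caller can run a similar check before invoking Tesseract. This is only a sketch: `find_traineddata` is a hypothetical helper, not part of Tesseract, and the "did you mean" hint can only suggest languages whose files are already present in the directory. (`tesseract --list-langs` reports the installed languages from the command line.)

```python
import difflib
from pathlib import Path


def find_traineddata(tessdata_dir, lang):
    """Return the path to <lang>.traineddata, or raise with a clearer message.

    Hypothetical helper for illustration; Tesseract itself does not provide it.
    """
    tessdata = Path(tessdata_dir)
    candidate = tessdata / f"{lang}.traineddata"
    if candidate.is_file():
        return candidate

    # Language codes that are actually installed, taken from the file names.
    installed = sorted(p.stem for p in tessdata.glob("*.traineddata"))

    # Suggest installed codes that look similar to the requested one.
    hints = difflib.get_close_matches(lang, installed, n=3)

    message = f"Language data file does not exist: {candidate}"
    if hints:
        message += f" (did you mean: {', '.join(hints)}?)"
    raise FileNotFoundError(message)
```

With `ces.traineddata` installed, asking for `cse` would raise an error that names the missing file and suggests `ces`.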

@KirandebDivad, Microsoft has supported '/' as a path separator since early versions of MS-DOS and still does so in its latest Windows versions.

Users should normally not set TESSDATA_PREFIX. If your cse.traineddata exists at the given path, it might be broken. A very common user error is using a wrong download URL which results in an HTML file instead of a Tesseract model.
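
A quick way to rule out the HTML case is to look at the first bytes of the downloaded file. The sketch below is a heuristic under stated assumptions: `looks_like_html` is a hypothetical helper, it only recognizes files that begin with an HTML doctype or `<html` tag, and a `False` result does not prove the file is a valid Tesseract model.

```python
def looks_like_html(path, probe_size=512):
    """Heuristically detect an HTML page saved under a .traineddata name.

    Hypothetical helper for illustration; it inspects only the start of the file.
    """
    with open(path, "rb") as fh:
        head = fh.read(probe_size)

    # Drop an optional UTF-8 byte order mark and leading whitespace,
    # then compare case-insensitively against common HTML openings.
    head = head.lstrip(b"\xef\xbb\xbf").lstrip().lower()
    return head.startswith((b"<!doctype html", b"<html"))
```

If this returns `True` for a file in the tessdata directory, the download most likely fetched a web page rather than the raw model file.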

And please note that there exists no support team for Tesseract. You cannot expect that a few volunteers support the whole world. Therefore questions (and your issue is a question!) should be asked in the user forum.