UB-Mannheim/tesseract

windows 64bit, 'tesseract.exe has stopped working'

Closed this issue · 5 comments

Environment

  • Tesseract Version: tesseract-ocr-w32-setup-v5.0.0-alpha.20191010.exe
  • Commit Number:
  • Platform: Windows 7 64-bit, Windows Server 2008 64-bit

Current Behavior:

Run the following command:

  tesseract c:\t\wz123.jpg c:\t\wz.txt

The command runs successfully on all physical machines.
Under VMware, it succeeds on Windows 10 but fails on Windows 7 and on Windows Server 2008 (two virtual machines). The error message is the same in every failing case:
tesseract.exe has stopped working.
Problem detail:

Problem signature:
  Problem Event Name:	APPCRASH
  Application Name:	tesseract.exe
  Application Version:	0.0.0.0
  Application Timestamp:	5db94754
  Fault Module Name:	libtesseract-5.dll
  Fault Module Version:	0.0.0.0
  Fault Module Timestamp:	5db94750
  Exception Code:	c000001d
  Exception Offset:	001ffa09
  OS Version:	6.1.7600.2.0.0.256.4
  Locale ID:	2052
  Additional Information 1:	0a9e
  Additional Information 2:	0a9e372d3b4ad19135b953a78882e789
  Additional Information 3:	0a9e
  Additional Information 4:	0a9e372d3b4ad19135b953a78882e789
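
For readers decoding the report above: the exception code c000001d is the NTSTATUS value STATUS_ILLEGAL_INSTRUCTION, which means the (virtual) CPU rejected an instruction that the program tried to execute. A minimal Python sketch of that lookup follows. The helper name `describe_exception_code` and the small table are illustrative assumptions, not part of Tesseract or of any Windows tool, and the table covers only a few common codes.

```python
# A few well-known NTSTATUS exception codes (deliberately incomplete).
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe_exception_code(code_hex):
    """Map the hex 'Exception Code' field of an APPCRASH report to a name."""
    code = int(code_hex, 16)
    return NTSTATUS_NAMES.get(code, "unknown (0x%08X)" % code)


# Exception code taken from the problem signature in this report.
print(describe_exception_code("c000001d"))
```

An illegal-instruction fault usually points at the CPU feature set rather than at a missing DLL, so it helps to compare `tesseract -v` on working and failing machines.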

Expected Behavior:

The command should generate the OCR results.

Suggested Fix:

Is a dependency missing? If not, how should I run Tesseract to get OCR results?
Thank you!

What is the output from tesseract -v on the machines where it is not working?

The output is as follows:

C:\Program Files (x86)\Tesseract-OCR>tesseract -v
tesseract v5.0.0-alpha.20191010
 leptonica-1.78.0
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE
 Found libarchive 3.3.2 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5
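
The "Found ..." lines in this output list the SIMD instruction sets that Tesseract detected on the machine. When comparing working and failing machines, they can be pulled out of captured `tesseract -v` output with a short script. This is only a sketch: the function name `detected_simd_features` is hypothetical, and the pattern assumes that each feature appears on its own line as a single upper-case token, as it does above.

```python
import re
from typing import List


def detected_simd_features(version_output: str) -> List[str]:
    """Return the CPU features reported on 'Found X' lines of `tesseract -v`.

    Lines such as 'Found libarchive 3.3.2 ...' are skipped because the
    pattern only accepts a single upper-case token after 'Found'.
    """
    pattern = r"^[ \t]*Found ([A-Z0-9.]+)[ \t]*$"
    return re.findall(pattern, version_output, flags=re.MULTILINE)


# Sample text copied from the output reported in this issue.
sample = """tesseract v5.0.0-alpha.20191010
 leptonica-1.78.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE
 Found libarchive 3.3.2 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5
"""

print(detected_simd_features(sample))
```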

Please try running Tesseract with parameter -c dotproduct=sse.
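
The `-c dotproduct=sse` setting asks Tesseract to use its SSE implementation of the dot product instead of the one it would select automatically. For scripted runs, the command line can be assembled as in the sketch below. The helper names `build_tesseract_command` and `run_ocr` are hypothetical, the paths are the ones from this report, and the executable is assumed to be on `PATH`.

```python
import subprocess


def build_tesseract_command(image, output_base, dotproduct="sse",
                            lang=None, exe="tesseract"):
    """Build the argument list for a Tesseract run with a dotproduct override."""
    cmd = [exe]
    if lang:
        cmd += ["-l", lang]
    cmd += [image, output_base, "-c", "dotproduct=" + dotproduct]
    return cmd


def run_ocr(image, output_base, **kwargs):
    """Run Tesseract; raises CalledProcessError on a non-zero exit code."""
    cmd = build_tesseract_command(image, output_base, **kwargs)
    return subprocess.run(cmd, check=True)


# Only print the command here; call run_ocr(...) on a machine with Tesseract.
print(" ".join(build_tesseract_command(r"c:\t\wz123.jpg", r"c:\t\wz.txt")))
```

Passing the arguments as a list avoids shell quoting problems with Windows paths.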

Thank you very much for your reply. With that parameter, Tesseract runs without crashing.
However, the OCR results still have some problems. My tests were as follows:


  • The content in wz123.jpg is:
My Name is houji
My Name is houji
My Name is houji
My Name is houji
  • Test 1:
tesseract c:\t\wz123.jpg c:\t\wz.txt -c dotproduct=sse

Results:

§Q$»&«Y«/»«&«&«Y/»M»§L_L&_&X°Qj‘\Q7“(“‘&°7«¥/
Z§7#&_&2?‘7?«0“Y$»/»p§$§/»}«p®2§«»J,¥»2§VA
Q2%&H?&«™2é4_«9Y©&x«2N§«§€»«§»§»§«&§
/«3°«»«»«2»#Swx°“‘V#»(’2%]2X/3¥/&/#=(2#=4»S
  • Test 2 (using eng.traineddata from tessdata_best):
tesseract c:\t\wz123.jpg c:\t\wz.txt -c dotproduct=sse

Results:

My Name is houji
My Name is houji
My Name is houji
My Name is houji
  • Test 3 (using chi_sim.traineddata from tessdata_fast):
tesseract -l chi_sim c:\t\wz123.jpg c:\t\wz.txt -c dotproduct=sse

Results:

M^fUM0^MqM'?kw‰\'_〈q"※″_{〈@′※@_※E※E〈E‰(E〈E=〇〗″'M〇'′「E〗「“
M'〉?「w'K「GQ_'?!V、《〗=〇V「“〗'〗!〇,′〉V〇′″」「※〇w′」zQ″「Gw'
x〗〗〇」「'\「※i“M「dMi」〇」〇'「'《′‖′〉M′。″M〗'2'…^!^
w'p「《〔「〔※「'「]″〗〇〉″〗″w※w'w'?'〗?w〗〇『^y〇″y
  • Test 4 (using chi_sim.traineddata from tessdata_best):
tesseract -l chi_sim c:\t\wz123.jpg c:\t\wz.txt -c dotproduct=sse

Results:

My Name is houji
My Name is houji
My Name is houji
My Name is houji

Is it possible to set parameters so that chi_sim.traineddata from tessdata_fast also produces correct results? Thank you!

Today I found the information about the '-c dotproduct=' option in tesseract-ocr#2098. The problem has been solved, thank you!