Fails to OCR file that CLI Tesseract handles perfectly
noahsilverman opened this issue · 2 comments
noahsilverman commented
Summary
- I'm attempting to OCR some digits from a small image. Calling Tesseract from the CLI works perfectly. However, if I use the gosseract library within Go, it returns an empty string.
Reproducibility
tesseract --psm 13 img.jpg -
1234567
package main

import (
	"fmt"

	"github.com/otiai10/gosseract/v2"
)

func main() {
	client := gosseract.NewClient()
	defer client.Close()

	err := client.SetLanguage("eng")
	if err != nil {
		fmt.Println(err)
	}
	// PSM 13: treat the image as a single raw text line (same as --psm 13 on the CLI).
	err = client.SetPageSegMode(13)
	if err != nil {
		fmt.Println(err)
	}
	err = client.SetImage("img.jpg")
	if err != nil {
		fmt.Println(err)
	}
	text, err := client.Text()
	if err != nil {
		fmt.Println(err)
	}
	fmt.Println(text)
}
Environment
uname -a
Darwin MacBook-Pro.local 22.5.0 Darwin Kernel Version 22.5.0
go version
go version go1.20.7 darwin/arm64
tesseract --version
tesseract 5.3.2
leptonica-1.82.0
libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 2.1.5.1) : libpng 1.6.40 : libtiff 4.5.1 : zlib 1.2.11 : libwebp 1.3.1 : libopenjp2 2.5.0
Found NEON
Found libarchive 3.6.2 zlib/1.2.11 liblzma/5.4.1 bz2lib/1.0.8 liblz4/1.9.4 libzstd/1.5.4
Found libcurl/7.88.1 SecureTransport (LibreSSL/3.3.6) zlib/1.2.11 nghttp2/1.51.0
otiai10 commented
@noahsilverman Is it possible to share the image file with me?
rubiojr commented
I faced a similar problem. It turned out to be image orientation in my case: rotating the image fixed the issue for me. Is the tesseract CLI able to detect orientation automatically and auto-rotate the image for us?