otiai10/gosseract

Fails to OCR file that CLI Tesseract handles perfectly

noahsilverman opened this issue · 2 comments

Summary

  • I'm attempting to OCR some digits from a small image. Calling Tesseract from the CLI works perfectly. However, if I use the gosseract library within Go, it returns an empty string.

Reproducibility

tesseract --psm 13 img.jpg -
1234567

package main

import (
	"fmt"
	"github.com/otiai10/gosseract/v2"
)

func main() {
	client := gosseract.NewClient()
	defer client.Close()
	err := client.SetLanguage("eng")
	if err != nil {
		fmt.Println(err)
	}

	err = client.SetPageSegMode(13)
	if err != nil {
		fmt.Println(err)
	}

	err = client.SetImage("img.jpg")
	if err != nil {
		fmt.Println(err)
	}

	text, err := client.Text()
	if err != nil {
		fmt.Println(err)
	}

	fmt.Println(text)
}

Environment

uname -a
Darwin MacBook-Pro.local 22.5.0 Darwin Kernel Version 22.5.0
go version
go version go1.20.7 darwin/arm64
tesseract --version
tesseract 5.3.2
 leptonica-1.82.0
  libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 2.1.5.1) : libpng 1.6.40 : libtiff 4.5.1 : zlib 1.2.11 : libwebp 1.3.1 : libopenjp2 2.5.0
 Found NEON
 Found libarchive 3.6.2 zlib/1.2.11 liblzma/5.4.1 bz2lib/1.0.8 liblz4/1.9.4 libzstd/1.5.4
 Found libcurl/7.88.1 SecureTransport (LibreSSL/3.3.6) zlib/1.2.11 nghttp2/1.51.0

@noahsilverman Is it possible to share the image file with me?

I faced a similar problem. In my case it turned out to be image orientation: rotating the image fixed it. Does the tesseract CLI detect orientation automatically and rotate the image for us?