Losing stderr when running multiple clients at once
uniwisejohannes opened this issue · 1 comments
Sorry if this is silly, but we wanted to hear the developer's thoughts on this.
In our project we initialise 4 different gosseract.Client instances one at a time using gosseract.NewClient, and then for each of these clients we start a goroutine in which we call SetImageFromBytes and GetBoundingBoxes on that client. Each of these goroutines consumes a lot of documents, processing them one at a time (so up to 4 documents are in flight at once).
This seemingly corrupts stderr so that we lose all the logs in our system.
Is it just not possible to run 4 separate TessBaseAPI instances at once?
We are using ENV OMP_THREAD_LIMIT=1 in our Dockerfile and our pod has 4 cores.
We also have RUN CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build -o /build/bin/service main.go
Our image base is debian:12
That's because gosseract currently hijacks stderr.
This was a workaround when we implemented gosseract, and we don't believe this is the best way.
We need to identify a better approach. In the meantime, I'm thinking about adding a way to opt out of the stderr hijack.
What do you think?