Segfault during recognizing text
Hi @straussmaximilian, thanks a lot for writing this package! I'm using ocrmac to build a small tool to recognize video subtitles. While I'm able to use the text_from_image function successfully for many images, it fails at seemingly random moments. Here's the script that I'm running, together with a collection of images (frames.zip):
import PIL.Image
from pathlib import Path
from ocrmac import ocrmac

for loc in sorted(Path("frames/").glob("*.png")):
    print(loc.stem)
    image = PIL.Image.open(loc)
    annotations = ocrmac.text_from_image(
        image,
        recognition_level="accurate",
        language_preference=["zh-Hans"],
    )
print("Success!")
This script completes successfully for 5 out of 10 runs. The other five times, it segfaults, each time for a different image.
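Since the crash hits a different image each time, one way to record which frames fail without losing the whole run is to process each frame in its own child interpreter. The following is only a sketch under assumptions: `run_in_child`, `describe_exit`, and `OCR_SNIPPET` are hypothetical names made up for illustration, and the negative-return-code convention applies to POSIX systems such as macOS.

```python
import signal
import subprocess
import sys
from pathlib import Path


def run_in_child(snippet: str, *args: str) -> subprocess.CompletedProcess:
    """Run a Python snippet in a fresh interpreter so a crash cannot kill the caller."""
    return subprocess.run(
        [sys.executable, "-c", snippet, *args],
        capture_output=True,
        text=True,
    )


def describe_exit(returncode: int) -> str:
    """On POSIX, a negative return code means the child was killed by that signal."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exit code {returncode}"


# Hypothetical per-frame snippet; the ocrmac call mirrors the script above.
OCR_SNIPPET = """
import sys
import PIL.Image
from ocrmac import ocrmac

ocrmac.text_from_image(
    PIL.Image.open(sys.argv[1]),
    recognition_level="accurate",
    language_preference=["zh-Hans"],
)
"""

if __name__ == "__main__":
    for loc in sorted(Path("frames/").glob("*.png")):
        result = run_in_child(OCR_SNIPPET, str(loc))
        print(loc.stem, describe_exit(result.returncode))
```

Starting a new interpreter per frame is slow, so this is meant for narrowing down the failure rather than as a workaround.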
I tried to look into this a bit more with the help of GPT, which told me to attach lldb to the Python process. I did that, and when the segfault occurs I get the following result:
Process 1115 resuming
Process 1115 stopped
* thread #11, queue = 'com.apple.VNRecognizeTextRequestRevision3', stop reason = EXC_BAD_ACCESS (code=1, address=0xcc020df5720)
    frame #0: 0x0000000186507ff0 libobjc.A.dylib`objc_release + 16
libobjc.A.dylib`objc_release:
->  0x186507ff0 <+16>: ldr x17, [x2, #0x20]
    0x186507ff4 <+20>: tbz w17, #0x2, 0x186508058 ; <+120>
    0x186507ff8 <+24>: tbz w16, #0x0, 0x186508074 ; <+148>
    0x186507ffc <+28>: lsr x17, x16, #55
I got stuck here as I don't know much about the internals. Do you have any ideas on what's going on?
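A Python-level complement to lldb, which was not tried in this thread and is offered only as a sketch, is the standard-library `faulthandler` module. It dumps the Python traceback of every Python thread when the process receives a fatal signal such as SIGSEGV. The log file name `ocr_crash.log` is an arbitrary choice for illustration. Because the crashing thread here belongs to a Vision dispatch queue and not to Python, the dump would most likely only show which Python call was in flight, not the native cause.

```python
import faulthandler

# Keep the file object alive for the lifetime of the process:
# faulthandler writes to its file descriptor when a fatal signal arrives.
crash_log = open("ocr_crash.log", "w")

# Dump tracebacks of all Python threads on SIGSEGV, SIGBUS, SIGABRT, etc.
faulthandler.enable(file=crash_log, all_threads=True)

# ... the loop over the frames from the script above would follow here ...
```

The same handler can be switched on without editing the script by running it with `python -X faulthandler`, in which case the traceback goes to stderr.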
I tested this and could reproduce the issue. It may stem from pyobjc; I just opened an issue there!
Should be fixed now. Feel free to reopen if the issue persists!
Thanks a lot!