This is an open-source macOS-based Objective-C wrapper for the OCR library Tesseract.
As far as I know, mine is the only working version ported to macOS.
Star this repo if you find it helpful, fork this repo if you want to experiment with it. :)
For those of you who wish to first test out the OCR capabilities, I have created a demo application that lets you do just this.
First build the Xcode project included in this repository. This will generate an application through which you can take a screenshot, as shown in the following gif.
In the Xcode log you will find the corresponding text Tesseract detected for this screenshot.
- Clone this project.
- Copy the include, lib, and tessdata folders to your project.
- Add these folders to your project in Xcode. Make sure include and lib are added as groups and tessdata is added as a folder reference. The location of this setting is shown in the following image:
- Copy the files SLTesseract.mm and SLTesseract.h to your code directory.
- Verify that the file SLTesseract.mm is added to Targets > Build Phases > Compile Sources. Additionally, verify that all the static libraries are also added to Targets > Build Phases > Link Binary With Libraries. (This should happen automatically.)
- You are now ready to use Tesseract in your macOS project. (See Example Usage for code syntax.)
None so far.
At the top of the file include the header file
#import "SLTesseract.h"
And then
SLTesseract *ocr = [[SLTesseract alloc] init];
will initialize an instance of the SLTesseract class.
(optional) ocr.language = @"eng";
(optional) ocr.charWhitelist = @"abcdefghijklmnopqrstuvwxyz";
(optional) ocr.charBlacklist = @"1234567890";
Finally, assuming you already have the image that you wish to perform OCR on in NSImage form, you can call
NSString *text = [ocr recognize:image];
to recognize the image in question and get the corresponding text.
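Put together, the steps above might look like the following minimal sketch. It uses only the SLTesseract calls shown in this section. The helper name RecognizeTextInImageAtPath and the image path are hypothetical and exist only for illustration. The sketch also assumes ARC is enabled and that the file is compiled inside a project set up as described in the installation steps.

```objc
#import <Cocoa/Cocoa.h>
#import "SLTesseract.h"

// Hypothetical helper (not part of SLTesseract): loads an image from disk
// and returns the text recognized in it, or nil if the image cannot be loaded.
static NSString *RecognizeTextInImageAtPath(NSString *path) {
    // -initWithContentsOfFile: returns nil when the file is missing or unreadable.
    NSImage *image = [[NSImage alloc] initWithContentsOfFile:path];
    if (image == nil) {
        NSLog(@"Could not load image at %@", path);
        return nil;
    }

    SLTesseract *ocr = [[SLTesseract alloc] init];

    // All three settings are optional, as noted above.
    ocr.language = @"eng";
    ocr.charWhitelist = @"abcdefghijklmnopqrstuvwxyz";
    ocr.charBlacklist = @"1234567890";

    return [ocr recognize:image];
}
```

A call such as `NSString *text = RecognizeTextInImageAtPath(@"/path/to/screenshot.png");` (a placeholder path) would then return the recognized text, which you can log or display as needed.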
- Tesseract (v3.05.01)
- Leptonica (v1.75.3)
- LibPNG (v1.6.34)
- LibTIFF (v4.0.9)
- LibJPEG (v9c)
- LibZ (v1.2.11)
My project, Tesseract macOS, is distributed under the MIT license (see LICENSE).
Keep in mind that its main dependency, Tesseract, is distributed under the Apache 2.0 license.
You may reach me at Tesseract-macOS@scott-liu.com to inquire about this project.