Additional features of this fork:
- Bumped libs
- PoC for parallelization. Spoiler: it does not speed things up
- HTTP upload server as separate entry point
- Assuming the OCR output contains medical content: upload it as a DocumentReference to a FHIR R4 server: http://hapi.fhir.org/baseR4
Temporarily removed feature:
- Date extraction via Natty parser
The two main classes may be started directly from the IntelliJ IDE. GraalVM was used to run it locally on Mac.
Search for uploaded content on the public FHIR server, e.g. http://hapi.fhir.org/baseR4/DocumentReference/1845301/_history/1
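The upload and lookup against the public server can be sketched from the shell. This is a minimal sketch, not the fork's actual implementation: the function names `build_document_reference`, `upload_document_reference`, and `fetch_document_reference` are hypothetical, and it assumes the OCR text is sent as a base64-encoded `text/plain` attachment, which may differ from what the fork really does.

```shell
#!/bin/sh
# Hypothetical helpers; the fork's real upload code may differ.

FHIR_BASE='http://hapi.fhir.org/baseR4'

# Wrap OCR text in a minimal FHIR R4 DocumentReference (assumed shape:
# a status plus one base64-encoded text/plain attachment).
build_document_reference() {
  # tr removes the line wraps that some base64 implementations insert
  data=$(printf '%s' "$1" | base64 | tr -d '\n')
  printf '{"resourceType":"DocumentReference","status":"current","content":[{"attachment":{"contentType":"text/plain","data":"%s"}}]}\n' "$data"
}

# POST the resource to the server (needs network access).
upload_document_reference() {
  build_document_reference "$1" |
    curl -s -X POST -H 'Content-Type: application/fhir+json' \
      --data-binary @- "$FHIR_BASE/DocumentReference"
}

# Read a stored resource back by its server-assigned id.
fetch_document_reference() {
  curl -s -H 'Accept: application/fhir+json' "$FHIR_BASE/DocumentReference/$1"
}
```

Calling `build_document_reference 'some OCR text'` only prints the JSON payload, so it can be inspected offline. The id that the server assigns on upload can then be passed to `fetch_document_reference`.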
- OCR with Akka, Tesseract, and JavaCV | Part 1
- OCR with Akka, Tesseract, and JavaCV | Part 2
brew install tesseract
And then I set the environment variable LC_ALL=C in the IntelliJ Run Configuration. I didn't need this env var in VSCode.
sbt run
Open http://localhost:8080 and upload an image.
Or, using cURL:
curl -X POST -F 'fileUpload=@/Users/duanebester/Documents/input.jpg' 'http://localhost:8080/image/ocr'