Azure-Samples/cognitive-services-REST-api-samples

Output searchable PDF or hOCR

Closed this issue · 4 comments

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

The Microsoft Read API does not provide hOCR output or a searchable PDF.

Any log messages given by the failure

Expected/desired behavior

Output should be a searchable PDF.

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

v2.0, v2.1

Mention any other details that might be useful


Thanks! We'll be in touch soon.

Thanks @prashantguleria, sorry for the delay. Which sample was this for? If you still need an OCR sample, we added one recently: https://github.com/Azure-Samples/cognitive-services-REST-api-samples/blob/master/nodejs/Vision/RecognizeText.js
If you need this in another language, let us know, thanks.

And now, can Azure output a searchable PDF? (@wiazur, your link is dead.)

Hi @WiliTest, I'm not with Microsoft anymore, but here's the OCR sample to replace the dead link. I do believe OCR has the ability to output to PDF, but I'd confirm that with the Cognitive Services Azure support team.
https://github.com/Azure-Samples/cognitive-services-REST-api-samples/blob/master/nodejs/Vision/ComputerVisionOCR.js

I would like to bump this issue because none of the samples produce either a searchable PDF or an hOCR result. We still want a way to get either of them.
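
Until there is an official sample, one possible workaround is to build the hOCR yourself from the line and word bounding boxes that the OCR call returns. Below is a minimal sketch in Python, not an official sample. It assumes each page has already been parsed into a dict with `width`, `height`, and `lines`, where every line and word carries `text` and an 8-number `boundingBox` quadrilateral in pixels; check these field names against the API version you call. `quad_to_bbox` and `page_to_hocr` are hypothetical helper names.

```python
from html import escape


def quad_to_bbox(quad):
    """Collapse [x1, y1, ..., x4, y4] into an axis-aligned (x0, y0, x1, y1)."""
    xs, ys = quad[0::2], quad[1::2]
    return tuple(int(round(v)) for v in (min(xs), min(ys), max(xs), max(ys)))


def page_to_hocr(page):
    """Render one parsed page (assumed shape, see above) as an hOCR fragment."""
    width, height = int(page["width"]), int(page["height"])
    out = [f"<div class='ocr_page' title='bbox 0 0 {width} {height}'>"]
    for line in page["lines"]:
        line_box = " ".join(map(str, quad_to_bbox(line["boundingBox"])))
        out.append(f" <span class='ocr_line' title='bbox {line_box}'>")
        for word in line["words"]:
            word_box = " ".join(map(str, quad_to_bbox(word["boundingBox"])))
            text = escape(word["text"])  # escape &, <, > for valid markup
            out.append(
                f"  <span class='ocrx_word' title='bbox {word_box}'>{text}</span>"
            )
        out.append(" </span>")
    out.append("</div>")
    return "\n".join(out)


if __name__ == "__main__":
    # Hand-written sample data in the assumed shape, not real API output.
    sample_page = {
        "width": 200,
        "height": 100,
        "lines": [
            {
                "boundingBox": [10, 10, 90, 10, 90, 30, 10, 30],
                "text": "Hello world",
                "words": [
                    {"boundingBox": [10, 10, 45, 10, 45, 30, 10, 30], "text": "Hello"},
                    {"boundingBox": [50, 10, 90, 10, 90, 30, 50, 30], "text": "world"},
                ],
            }
        ],
    }
    print(page_to_hocr(sample_page))
```

This only produces the page-level fragment. A complete hOCR file also needs the surrounding HTML document and metadata, and if the service reports coordinates in inches (as it may for PDF input), they would have to be converted to pixels first. Turning the hOCR plus the page image into a searchable PDF is a separate step that this sketch does not cover.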