[new resource]: OCR with Google Vision API and Tesseract
Closed this issue · 2 comments
charlottejmc commented
Title of the resource
OCR with Google Vision API and Tesseract
Resource type
External Resource
Authors, editors and contributors
Isabelle Gribomont, Liz Fischer, Ryan Cordell, Clemens Neudecker
Topics (keywords)
DH, Open Education, Open Access, Python
Learning outcomes
After completing this lesson, you will be able to:
- Combine Google Vision’s character recognition with Tesseract’s layout detection to generate high-quality OCR outputs for a wide range of documents
- Accurately convert PDF files into plain text
- Understand a variety of considerations to keep in mind when converting a PDF to plain text
Abstract
Google Vision and Tesseract are both popular and powerful OCR tools, but they each have their weaknesses. In this lesson, you will learn how to combine the two to make the most of their individual strengths and achieve even more accurate OCR results.
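For anyone reviewing this entry, the core idea of the combination can be illustrated with a minimal sketch. It is not the lesson's own code and makes no calls to either service. It assumes that word-level bounding boxes (as Google Vision would supply) and layout-region bounding boxes (as Tesseract would supply) have already been extracted into plain Python objects. The names `Box`, `Word`, and `assign_words_to_regions` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper)."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass
class Word:
    """A recognised word plus its bounding box (assumed to come from the OCR engine)."""
    text: str
    box: Box


def center(box: Box) -> tuple[float, float]:
    return ((box.left + box.right) / 2, (box.top + box.bottom) / 2)


def assign_words_to_regions(words: list[Word], regions: list[Box]) -> list[str]:
    """Group words by the first layout region containing their centre point.

    Regions are assumed to be listed in reading order. Words that fall
    outside every region are dropped in this simplified sketch.
    """
    grouped: list[list[Word]] = [[] for _ in regions]
    for word in words:
        cx, cy = center(word.box)
        for i, region in enumerate(regions):
            if region.contains(cx, cy):
                grouped[i].append(word)
                break
    texts = []
    for group in grouped:
        # Crude reading order: top to bottom, then left to right.
        group.sort(key=lambda w: (w.box.top, w.box.left))
        texts.append(" ".join(w.text for w in group))
    return texts


# Invented example: a two-column page whose words arrive interleaved across columns.
regions = [Box(0, 0, 100, 200), Box(110, 0, 210, 200)]
words = [
    Word("Left", Box(10, 10, 40, 20)),
    Word("Right", Box(120, 10, 150, 20)),
    Word("column", Box(45, 10, 90, 20)),
    Word("column", Box(155, 10, 200, 20)),
]
columns = assign_words_to_regions(words, regions)
```

With the invented data above, each column's words are collected separately instead of being read straight across the page. A real pipeline would also need to handle skewed lines, overlapping regions, and words outside any region.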
VickyGarnett commented
Hi Charlotte - these people are all in the system, so you can continue to draft the resource :)
VickyGarnett commented
resource published, closing issue