pdftotext
There are 62 repositories under the pdftotext topic.
lu4p/cat
Extract text from plaintext, .docx, .odt and .rtf files. Pure Go.
zetahernandez/pdf-to-text
Read PDF files in JavaScript
iron-software/Iron-OCR-Image-to-Text-in-CSharp
Image to Text Tutorial in C# - See https://ironsoftware.com/csharp/ocr/tutorials/how-to-read-text-from-an-image-in-csharp-net/
ashutoshvarma/pyxpdf
Fast and memory-efficient Python PDF Parser based on xpdf sources
shine-jayakumar/Extract-Data-From-PDF-In-Python
Batch-convert PDF to text and extract data from PDF in Python
amenezes/aiopytesseract
A Python asyncio wrapper for Tesseract-OCR.
icaropires/pdf2dataset
Converts a whole subdirectory with a big (or small) volume of PDF documents to a dataset (pandas DataFrame) with error tracking and choice of features
Anish-M-code/pdftotext
A simple pdftotext conversion tool for Windows 8.1/10/11 and Fedora/Ubuntu/Debian/Arch-based Linux distros using poppler-utils and Google's tesseract-ocr.
yedhink/covid19-kerala-api-deprecated
Deprecated - A fast API service for retrieving day-to-day stats about the Coronavirus (COVID-19, SARS-CoV-2) outbreak in Kerala (India).
raul23/convert-to-txt
Convert documents (pdf, djvu, epub, word) to txt
tecosaur/pdftotext.el
A mirror of https://git.tecosaur.net/tec/pdftotext.el
flyingeek/scriptable-pdfjs
A PDF to text converter for Scriptable App (iOS) working offline
andrealenzi11/py-poppleract
Python library and Web service based on Poppler Pdftotext utility and Tesseract OCR for extracting text from PDF documents
ExceptedPrism3/PDFToAudio
"PDF To Audio" is a Python tool that transforms PDF documents into audio files using OCR and Text-to-Speech technology. Ideal for accessibility and auditory learning, it supports multiple languages, parallel processing, and smart rate limit handling.
dotcode-moscow/pdf-api
Extract text from a PDF (PDF to text). API for PHP/JS/Python and others.
ChanMo/docker-poppler
A simple RESTful API service for poppler
euyogi/Projeto-Anceu-CS50
My CS50 course project: a PDF analyzer that processes the scores of candidates admitted through Acesso Enem and organizes everything. Now in C++
tahaygun/PDF-to-MongoDB
This project converts books from PDF to proper JSON objects by separating title and content. Once you have the output, you can easily insert the JSON file into a database.
boettner/pdf2sandwich-pdf
Convert a scanned PDF into a text-embedded PDF.
tmsincomb/ImageToCSV
Converts an image to a CSV. This exists because Chorus 3.0 is bat-shit and only shows images for vital metadata.
amitsuthar69/pdf2text
A pdf to text extractor web service written in Go.
DataKind-BLR/covid19bharat_scrapers
All scrapers for covid19
pramodsbaviskar7/PDF2WORD
Computer application built in Python to open, edit and convert a PDF document to Microsoft Word format. The GUI is designed using Tkinter. Opening, converting and reading PDF files is handled by a Python library called PyPDF2
Zeeshanahmad4/NLP-Pdf-Minning-Extracting-text-from-pdf
NLP PDF mining: extracting text from PDF
deardurham/ciprs-reader
Python library for reading CIPRS PDFs
DrMcCoy/pdftextorizer
Interactively extract text from multi-column PDFs
farhan0167/BankAIAgent
A tool to convert bank statements into Excel files
views63/pdf2text
pdf to text
ACMCMC/usc-grades-parser
Get grade statistics for the USC
CO18347/Newspaper-Mining
Newspaper mining and analysis of the results using Python. Cleaning the text using OCR.
dosadczuk/go-pdftotext
Wrapper for Xpdf command line tool `pdftotext`
kiarashrahmani/PDF-to-text
This Python script utilizes the PyPDF2 library to convert PDF documents into plain text.
leonardyeoxl/PDF-to-Text-Using-OCR-Tesseract
A containerised tool to extract text from a PDF file using Tesseract OCR
pradeepbatchu/paddleocr
Image to Text with Flask application
semihucann/the_professor
The Professor (Converter from PDF to Sound)
torviswesley/legoeso-pdf-manager
A simple WordPress PDF document manager.