pdf2text

There are 26 repositories under the pdf2text topic.

  • modesty/pdf2json

    Converts binary PDF to JSON and text for server-side PDF processing and command-line use. Zero dependencies.

    Language:Java
  • seinecle/nocodefunctions-web-app

    The code base of the front-end of nocodefunctions.com

    Language:Java
  • yakovypg/Ypdf

    We present Ypdf, a PDF document processing application that combines the best features of existing solutions and provides the most popular and requested functionality for free to its users.

    Language:C#
  • TheLime1/CheatoMate

    A collection of scripts to "help" you with your programming exams and assignments.

    Language:Python
  • chiraag-kakar/PyAutomation

    Simple and useful automation tools built with Python modules published on PyPI.

    Language:Python
  • andrealenzi11/py-poppleract

    Python library and web service based on the Poppler pdftotext utility and Tesseract OCR for extracting text from PDF documents.

    Language:Python
  • worldbank/wb-nlp-tools

    Natural language processing tools developed by the World Bank's DECAT unit. A suite of text preprocessing and cleaning algorithms for NLP analysis and modeling.

    Language:Python
  • StephanyBatista/ExtractOcrApi

    An API in .NET Core for extracting text from documents via OCR, using several Linux libraries.

    Language:C#
  • AzozzALFiras/Pdf-OCR

    A simple, free tool for extracting text from scanned PDFs and images using OCR, and converting images to PDFs. It processes files locally in the browser, ensuring privacy and security while enabling users to effortlessly convert documents and images into editable text or PDF format.

    Language:HTML
  • TanishqChamoli/Newspaper_Mining

    Newspaper mining and analysis of the results using Python. Cleaning the text using OCR.

    Language:Python
  • imesut/PdfReg

    PdfReg is a web tool that extracts text from selected regions of a PDF document.

    Language:JavaScript
  • DrMcCoy/pdftextorizer

    Interactively extract text from multi-column PDFs

    Language:Python
  • views63/pdf2text

    pdf to text

    Language:Rust
  • FastPDFTeam/pdf-to-word-converter

    Fast PDF to Word Converter is a fast batch PDF converter that converts PDFs to fully editable Office Word, Text, RTF, HTML, and more.

  • Isaccseven/pdf2text

    Extract text from PDF using OCR

    Language:Python
  • johbar/go-poppler

    Limited, yet memory-leak-free Go wrapper for the Poppler PDF library

    Language:Go
  • sahil352005/ChatWithPdf-Images

    A Streamlit-based app that allows users to upload PDFs or images, extract text, and engage in interactive Q&A. Using Google Generative AI, this app enables insightful conversations based on document contents. Ideal for those seeking quick answers from their files in a simple, intuitive interface.

    Language:Python
  • seinecle/nocodefunctions-io

    io for nocodefunctions: csv, txt, pdf, and xlsx so far

    Language:Java
  • senavs/pdfto

    A Python Flask API to manage PDF files.

    Language:Python
  • BinhQuocLy/Pdf2Quiz

    A Pdf2Quiz NLP model.

    Language:Python
  • ChrisCraddock/DC-Advanced-Walkthrough

    Data Center Advanced Walkthrough. Insert data from a PDF file into a MySQL database.

    Language:Python
  • SeeligA/OCRstream

    Building an OCR pipeline for PDF to TXT

    Language:Python
  • 1994nikunj/textify-pdf

    Textify-PDF: Extracting Text from PDF Files

    Language:Python
  • davibusanello/pdf2txt

    A simple CLI to convert PDF files into TXT using OCR

    Language:Python
  • fer-aguirre/pdf-2-ner

    Web application for information extraction and named entity recognition for PDF files (work-in-progress).

    Language:Jupyter Notebook
  • zhangshi0512/DevTools

    A lightweight Python-based Software Package for daily use

    Language:Python