pdf-document-processor
There are 256 repositories under the pdf-document-processor topic.
wmjordan/PDFPatcher
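PDFPatcher (PDF补丁丁): a PDF toolbox that can edit bookmarks, crop and rotate pages, remove restrictions, extract or merge documents, inspect document structure, extract images, convert pages to images, and more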
pdf2htmlEX/pdf2htmlEX
Convert PDF to HTML without losing text or format.
qpdf/qpdf
qpdf: A content-preserving PDF document transformer
run-llama/llama_parse
Parse files for optimal RAG
unidoc/unipdf
Golang PDF library for creating and processing PDF files (pure go)
UglyToad/PdfPig
Read and extract text and other content from PDFs in C# (port of PDFBox)
GowenGit/docnet
DocNET is a fast PDF editing and reading library for modern .NET applications
abarker/pdfCropMargins
pdfCropMargins -- a program to crop the margins of PDF files
sailist/chatgpt-enhancement-extension
An all-in-one plugin to improve your ChatGPT experience!
hellerbarde/stapler
A small utility making use of the pypdf library to provide a (somewhat) lighter alternative to pdftk
michaelrsweet/pdfio
PDFio is a simple C library for reading and writing PDF files.
Dtronix/PDFiumCore
.NET Standard P/Invoke bindings for PDFium.
svenssonaxel/pdf-sign
A tool to sign PDF files. With Linux support.
houking-can/CCKS2019-Task5
CCKS 2019 Evaluation Task 5: information extraction from public company announcements (3rd place)
uroesch/pdftools
A collection of PDF command line tools and wrappers for Linux
naiveHobo/pdfviewer
PDFViewer is a GUI tool, written using python3 and tkinter, which lets you view PDF documents.
lovasoa/pagelabels-py
Python library to manipulate PDF page labels
GURPREETKAURJETHRA/Multi-PDFs_ChatApp_AI-Agent
Meet MultiPDF 📚 Chat AI App! 🚀 Chat seamlessly with Multiple PDFs using Langchain, Google Gemini Pro & FAISS Vector DB with Seamless Streamlit Deployment. Get instant, accurate responses from Awesome Google Gemini OpenSource language Model. 📚💬 Transform your PDF experience now! 🔥✨
OnedocLabs/onedoc
The first developer-oriented document platform. Generate, host and track PDFs with a single API, beautifully.
sidphbot/Auto-Research
Generate a custom, detailed survey paper with topic-clustered sections and proper citations from a single query in under 30 minutes
pankajr141/pdf2jpg
Utility to convert PDF into JPG files
KalyanM45/DocGenius-Revolutionizing-PDFs-with-AI
This is a Python application that allows you to load a PDF and ask questions about it using natural language. The application uses an LLM to generate a response about your PDF. The LLM will not answer questions unrelated to the document.
SiddhantSadangi/pdf-workdesk
A Streamlit-powered application that provides a user-friendly interface for editing PDF documents.
papercast-dev/papercast
A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GROBID, LangChain, listen as podcast. Customize your own pipelines.
praj2408/Realtime-Document-Chat-System
In this project, we used Langchain to create a ChatGPT for your PDF using Streamlit. We built an application that allows you to ask questions about a PDF document and get answers directly from an LLM (Large Language Model), like OpenAI's ChatGPT.
opendocument-app/pdf2htmlEX-Android
pdf2htmlEX library port for Android - Convert PDF to HTML without losing text or format
taseikyo/backup-utils
A batch of useful code/scripts: run commands automatically, finish repetitive stupid operations, perform format conversions, etc.
StabRise/spark-pdf
PDF DataSource for Apache Spark
BobLd/PdfPigMLNetBlockClassifier
Proof of concept of training a simple Region Classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block in a pdf document page as either title, text, list, table and image.
datalogics/pdf-rest-api-samples
pdfRest API Toolkit is a REST API service for processing PDF documents, made by developers, for developers. Rapidly integrate PDF workflows with your existing projects and applications, simply and seamlessly. Get started for free in seconds.
sfneal/pdfconduit
Prepare documents for distribution
eiceblue/Spire.PDF-for-Java
Spire.PDF for Java is a PDF component that enables Java applications to read, write, print, and convert PDF documents without using Adobe Acrobat.
ptyadana/Python-Projects-Dojo
Collections of Python projects, including machine learning projects, image and PDF processing, password checkers, sending emails and SMS, web scraping, Flask web apps, Selenium automation testing, etc.
ayushwattal/PDF-ChatGpt
Create a ChatGPT for uploaded PDFs using Langchain
hoehermann/pypdf_strreplace
Search and replace text in PDF files with PyPDF.