pdf-to-json
There are 17 repositories under the pdf-to-json topic.
DS4SD/docling
Get your documents ready for gen AI
Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
run-llama/llama_parse
Parse files for optimal RAG
awesome-yasin/PDF-Verse
PDF Verse is a powerful web-based PDF editor with tools for editing, converting, and manipulating PDFs. Merge, compress, add or remove pages, or extract text using OCR. Convert PDF to DOC, Excel, PPT, JPG, PNG, text, and many other formats, and vice versa. PDF Verse also has a user-friendly interface and a wide range of features.
NanoNets/ocr-python
OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF.
electrovir/statement-parser
Parse bank and credit card statements
HoangTran0410/saoke_yagi
Bank statements of the Vietnam Fatherland Front (MTTQ) on support for people affected by Typhoon Yagi
graphlit/graphlit
Graphlit Platform
Clearedge-AI/clearedge
Build a RAG preprocessing pipeline
clarekang/form-pdf2json
Node.js library to convert JSON to PDF or vice versa
bytescout/pdf-extractor-sdk-samples
ByteScout PDF Extractor SDK source code samples
graphlit/graphlit-client-python
Python client library for Graphlit Platform
tahaygun/PDF-to-MongoDB
This project converts books from PDF to proper JSON objects by separating title and content. Once you have the output, you can easily insert the JSON file into a database.
ajaycode/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Aniket965/ipuresult-cli
🛠️ ipuresult-cli is a tool for creating JSON files from PDF result files 📚 of GGSIPU results
hparreao/doclingconverter
Quick way to convert files (PDF, DOCX, HTML, PPTX, Images) to (MD, JSON, YAML) using Docling and Streamlit
graphlit/graphlit-client-typescript
TypeScript client for Graphlit Platform