pymupdf-fitz

There are 30 repositories under the pymupdf-fitz topic.

malavika-suresh/multiple_pdf_comparison
This Python-based tool allows for efficient comparison of two or more PDF documents, highlighting the differences between them. It extracts and compares the words in the PDFs, ignoring whitespace differences, and highlights the changed, added, or missing words.
Language:Python
pawankumar94/graphscribe-table-extractor
Graphscribe is an intelligent, LLM-powered document understanding system designed to extract structured insights from complex visual content such as statistical diagrams, charts, and graphs.
Language:Python
vickypandey14/Convert-PDF-into-Image-By-Python
This Python script converts each page of a PDF document into separate image files. It utilizes the PyMuPDF library (fitz) to handle PDF operations and the Python Imaging Library (PIL) for image processing.
Language:Python
das-amlan/PDF_Image_Extractor_Web_App
This is a simple web app that allows users to upload a PDF file, extract images from the PDF, and display the images in the web app.
Language:Python
devbm7/QGen
Question Generator System
Language:Python
ifte110/Serach_all_pdfs_by_string
Search through all PDF files in a folder for a specific keyword or string of keywords.
Language:Python
atthharvva/PDF-Form-Reader
This Python script extracts information from PDF forms using OCR (Optical Character Recognition) and saves the extracted data into an Excel file. It is particularly designed for processing forms with checkboxes and textual fields. The script can handle variations in form structure and allows for easy customization to accommodate other PDF form type
Language:Python
Kurama-90/GUI-PDF-to-Excel
PyQt5-based GUI application that allows users to convert PDF files into Excel files. The application provides multiple options for extracting data from PDFs, including tables, text, and OCR (Optical Character Recognition).
Language:Python
mcagriaksoy/diff_merge_pdf
A tool to compare and merge PDFs, display the differences between them, and run OCR on them.
Language:Python
Sazizi2025/PDF-Founder
Are you short on time?! Can't you search all the PDFs one by one for the content you want?! Well, PDF-Founder is here...
Language:Python
FrancisLauriano/chatsoftex
A platform developed in Python that aims to automate and speed up the evaluation of technological innovation projects, using artificial intelligence and standardized criteria based on the Lei do Bem.
Language:Python
IglesiasT/comparador-pdfs
Language:Python
kalyaninagaraj/NFHS5
Python code to read, retrieve, analyze, and plot district-level findings from official (PDF) publications of the 5th National Family Health Survey of India.
Language:Jupyter Notebook
OtenMoten/pdf-alchemist
It's designed for transmuting PDFs into HTML. Harness the power of OCR, image processing, and web technologies to unlock the secrets within your PDF documents.
Language:Python
ParthaPRay/pdf_text_extraction_json_section_subsection
This repo contains code for extracting PDF text to JSON, capturing each section's number, title, and body content, along with footnotes.
Language:Python
RomyJr/PDF_TXT_Word_research
This application simplifies PDF keyword searches, allowing users to easily find specific terms in files or folders. Results are displayed clearly, and the history feature enables quick review and filtering of past searches. Users can click on document links in the history to open them directly in the default PDF viewer.
Language:Python
RomyJr/Retrocession_Detector
This application facilitates the comparison of two PDF files. Differences are presented in a table, color-coded as red (deletions), green (additions), and orange (moved text). Users can save the results in Excel format. It is designed to check whether annotations have been taken into account during the comparison process.
Language:Python
ashutosh6500/Resume-Parser-AWS-Event-Driven-Workflow
This is a simple event-driven mini project built on AWS services such as Lambda, EC2, DynamoDB, S3, and SNS.
Language:Python
bilalhameed248/PDF-Document-Extraction
Python PDF-to-HTML Converter: Transforming PDF Documents into Structured HTML Tags. - Feb 2022 - Jun 2023
Language:Python
Deepcoders30/AI-CHATPDF
ChatPDF is a web application that lets users upload PDFs and ask questions about their content.
Language:TypeScript
gyan007/AI-Tutor-Platform
The AI Tutor Platform is an intelligent educational application built with FastAPI, Streamlit, LangChain, and Groq. It provides users with an AI-powered conversational tutor, auto-generated quizzes, and a file-based doubt solver. The platform includes user authentication and progress tracking, with all data persistently stored in a PostgreSQL DB.
Language:Python
helgesander02/TKFruitMG
An ERP system built with customtkinter for the GUI, a PostgreSQL database, and reportlab, win32print, and pymupdf-fitz.
Language:Python
Jatin-s16/Resume-check-portal-for-candidates
A Streamlit-based application that enables job seekers to evaluate and enhance their resumes by analyzing alignment with specific job descriptions, providing actionable insights for improvement.
Language:Jupyter Notebook
Lazarokaua/Organiza-pasta-obsidian
File organization for my Obsidian
Language:Python
Madhu-1106/ResumeGenie
Resume Coach
Language:Python
MelinaNorton/journal-vetter
Python CLI & library for automated journal vetting — GPT‑4.1 summarization, YAML configuration, reproducible analysis.
Language:Python
micheldpd24/rag_aph_hippocrate
RAG / AI chatbot on the Aphorisms of Hippocrates
Language:Jupyter Notebook
nngel/PDF-thumbnail-service
A production-ready FastAPI microservice that functions as a PDF thumbnail generator, converting the first page of PDF files to optimized PNG thumbnails.
Language:Python
raju-2003/KSP-DATATHON-24
Data Privacy in Law Enforcement - KSP DATATHON - 2024 - FIR Redactor
Language:Python
RishavKumarSinha/adobe-hackathon-solution
Solution for the Adobe India Hackathon 2025, Team - Codient (Team Leader - Gopal Ranjan, Team Members - Rishav Kumar Sinha)
Language:Python
