pdf-miner
There are 8 repositories under the pdf-miner topic.
ktaaaki/paper2html
Converts a single- or double-column PDF paper into an HTML page that shows both the original view and a paragraph view extracted from the paper, so the text can be translated in the browser.
luke-cha/diff-pdf
Compares PDF documents using PDF Miner and outputs the differences as HTML documents.
swainshashwat/Flock
Craft custom Large Language Models (LLMs) effortlessly using Flock. Build LLMs for specific domains like a pro, supported by wizardlm, bloom, falcon, and llama. Extract insights from text and images seamlessly. Powered by Python, pdfMiner, langChain, and streamLit. Unlock domain-specific intelligence with Flock! 🚀
department-of-veterans-affairs/DAPM-PFAS-PACT-ACT
Scrapes hazardous waste data from a website and a PDF file for the PACT Act. Cleans the data to prepare it for mapping.
plain-jane-gray/PFAS-web-and-PDF-scrape
Scrapes hazardous waste data from a website and PDF file. Cleans and analyzes the data. Prepares the data for mapping.
TheurgicDuke771/pdf_compare
Compares PDF documents using PDF Miner and outputs the differences as HTML documents.
MyreLab/python_filereader
Data management automation tool. Uses PyPDF2 to read a unique identifier from each file, then uses Python's os module to rename the file in place with that identifier.
ritikkanswal/resume-filter
Filters resumes according to the recruiter's keywords. It is hosted on Heroku.