pdfplumber

There are 53 repositories under the pdfplumber topic.

  • jaspreetsidhu3/text_to_mp3-audiobook

    Convert PDF into an audiobook.

    Language:Python14111
  • avr2002/CV-JD-Matching

    Extracting details from resumes (CVs) and matching them with job descriptions (JDs) using a pretrained model such as DistilBERT, then ranking them by cosine similarity.

    Language:Jupyter Notebook7101
  • jwest951227/extractorChinese

    NLP model for extracting Chinese data from documents.

    Language:Python510
  • renan-siqueira/python-pdf-tool

    This project facilitates the extraction of text from PDF files using various Python libraries. It is designed to be flexible, allowing a choice among different text extraction libraries and supporting both a single PDF file and a directory containing multiple PDF files.

    Language:Python5101
  • vikrantRajan/python-projects

    This is my exploration of a variety of Python 🐍 libraries. I have built geospatial data analytics systems from CSV files and image and video processing tools such as face detection and motion detection. I also built a website with Flask (and three.js) and apps that connect to several types of databases, created a simple budgeting app that reads, writes, and updates .txt files, and created a simple graphical user interface for Mac.

    Language:HTML4200
  • eli64s/pypdf

    Common Python PDF parsing utilities 📑

    Language:Python3100
  • AAC-Open-Source-Pool/Text-Summarization-and-information-extraction

    Interface developed to extract information from the web through scraping and to summarize the given data.

    Language:Python2100
  • AnnaMihailovna/pdfplumber

    PDF to MP3 file converter.

    Language:Python210
  • MoinDalvs/Resume_Classification

    Business objective: the document classification solution should significantly reduce manual human effort in HRM. It should achieve a higher level of accuracy and automation with minimal human intervention.

    Language:Jupyter Notebook210
  • Paymir121/RreportTheoClio

    The program parses several PDF reports, finds the required information about the series and vials, compiles a report, and creates an Excel file containing it.

    Language:Python2100
  • Pevicsanch/project-data-of-the-territorial-division-of-Barcelona

    Collecting data from the Barcelona City Hall Open Data Service on socioeconomic indicators of the territorial division of the city of Barcelona.

    Language:Jupyter Notebook2101
  • plain-jane-gray/scraping-tables-from-PDF

    Scrapes data tables from a PDF file.

    Language:Jupyter Notebook2100
  • QQlesQ/PDF-to-MP3-audio-book-convertor

    PDF to MP3 audiobook converter.

    Language:Python2100
  • serhatci/data-extraction-from-pdf

    A sample script that extracts text data from a PDF file, converts it to a pandas DataFrame, and saves it to a CSV file.

    Language:Python2200
  • YuCheng21/score-analyse

    Student program lookup system.

    Language:CSS2100
  • department-of-veterans-affairs/DAPM-PFAS-PACT-ACT

    Scrapes hazardous waste data from a website and PDF file for PACT Act. Cleans the data to prepare it for mapping.

    Language:Jupyter Notebook1202
  • Laith-Alayassa/Odyssey-helper

    Flask app that recognizes overdue items by parsing an automated PDF from the checkout system and generates emails asking late users to bring the items back.

    Language:Python110
  • marciohssilveira/observatorio

    Organizes news excerpts from PDF files.

    Language:Jupyter Notebook1102
  • MDule/parse-pdf-gui

    GUI app for parsing specific PDF files (data from the standardized vehicle registration smart card of the Republic of Serbia) and generating a data file for a specific use case.

    Language:Python1100
  • plain-jane-gray/PFAS-web-and-PDF-scrape

    Scrapes hazardous waste data from a website and PDF file. Cleans and analyzes the data. Prepares the data for mapping.

    Language:Jupyter Notebook1100
  • praveen2410-pk/PDF_Comparsion

    This repository contains a Python script for comparing PDF files between a local source folder and a remote server. The script logs results, highlighting identical and non-identical files based on size and page count. It employs "pdfplumber" for PDF handling and "paramiko" for SSH connections.

    Language:Python1210
  • VaibhavDongre1311/End_to_end_Resume_Classification__project

    Business objective: the document classification solution should significantly reduce manual human effort in HRM. It should achieve a higher level of accuracy and automation with minimal human intervention.

    Language:Jupyter Notebook1100
  • VCTR09/PDF-to-mp3_converter

    Converts PDF files to MP3. Uses a graphical interface.

    Language:Python1100
  • wolfsbane9513/knowledgegraph

    Creates a knowledge graph from PDFs.

    Language:Jupyter Notebook1201
  • yyhsong/iPyLibs

    A collection of commonly used third-party Python libraries.

    Language:Python1201
  • Anshuman7t/Form-16-PDF-DataExtract

    A Python script to extract and parse tax details from a Form 16 PDF file using pdfplumber and regular expressions.

    Language:Python0100
  • CmilAmaya/flight-dashboard-app

    This project is an application that processes attached PDF documents containing flight information and extracts relevant data. The data is stored in a PostgreSQL database and visualized on a dynamic dashboard using Streamlit.

    Language:Python00
  • davidepaci/linea138

    Web app to query the Cosenza bus timetable.

    Language:EJS00
  • irini-git/French_people_outside_France

    French people living outside France

    Language:Python00
  • marceloakalopes/Timetables-Analysis

    The goal of this project was to analyze timesheet data from various programs at Sault College to determine the peak hours when the most programs have classes on campus simultaneously.

    Language:Python00
  • pinktaty/EpidemiologicLibrary

    This project was completed over 12 weeks under the continued supervision and mentorship of the “Carreras Con Impacto” program.

    Language:HTML0100
  • Saherpathan/invoicify-ai-cohere

    A Flask application that extracts invoice details from uploaded PDFs and images using an LLM inference API.

    Language:Python00
  • Aaryan015/Document-reference-tracker

    Brief document reference tracker

    Language:Python
  • Adii2202/LIfE

    Life Inclusion For Everyone

    Language:Dart00
  • mattfarnan/CTFS_PDF_Extract

    This script parses Canadian Tire Financial Services (ctfs.com) PDF statement files and extracts the relevant transaction information, which is then categorized and appended to an Excel file.

    Language:Python10
  • RustamovAkrom/HR-question

    🔴 This is the assignment given to me by the recruiter.

    Language:Python