document-analysis
There are 85 repositories under the document-analysis topic.
ethanhezhao/MetaLDA
The code for MetaLDA in ICDM 2017
soduco/paper-ner-bench-das22
All the material (paper, code, dataset, results) of our DAS 2022 paper (OCR+NER benchmark)
AymurAI/backend
This repository contains the backend API and machine learning models of AymurAI
georgeretsi/SegFreeKWS
Segmentation Free Keyword Spotting
TUWien/ReadFramework
The Core Framework for CVL/READ Modules
ZeroBone/OfficialEye
An advanced AI-powered generic document-analysis tool
dev-luckymhz/AIVisionText-invoice-OCR-typescript
AIVisionText is an advanced document analysis platform that harnesses the power of artificial intelligence (AI) to revolutionize the way you manage and extract insights from documents.
MBAigner/GraphConverter
A tool for creating a graph representation out of the content of PDFs or images.
nicolasfeyer/KWS-SIFT
Python code to perform keyword spotting using SIFT features
omni-us/research-ContentDistillation-HTR
Source code for ICFHR20 "Distilling Content from Style for Handwritten Word Recognition"
abdur75648/urdu-synth
High-quality synthetic text data generation for Urdu Text Recognition
faizan1041/doc-understanding-gpt-langchain
Document understanding with GPT 3.5 integrated with Telegram
Schlafenhase/Document-Analyzer
CE-5505. Company document analysis w/ natural language processing for sensitive data detection. #Isaac
sohaib023/T-Truth
Labeling tool for Table Structures in Document Images.
qurator-spk/sbb_column_classifier
Get the number of columns for a document image
sohaib023/Truth-Py
Python module for handling XML files labelled using the T-Truth tool.
aquilu/muisca
Muisca: Modelo Unificado de Inteligencia Supervisada para la Computación y Aplicación (Unified Model of Supervised Intelligence for Computation and Application). A Streamlit tool for summarizing and asking questions about PDF and TXT documents using state-of-the-art language models.
baharsateli/Dissertation_Supplementary_Materials
Datasets, tools and results from my doctoral dissertation
chulwoopack/Zone2OCR
Mapping a set of zones generated by a segmentation algorithm to the regions generated by an OCR engine
fredrikwahlberg/das2018
Code for the paper "Gaussian Process Classification as Metric Learning for Forensic Writer Identification", published at DAS 2018
HamzaGbada/direct-neighbor-vrd
This repository contains the official implementation of the paper "Information Extraction from Visually Rich Documents Using Directed Weighted Graph Neural Network", presented at the 18th International Conference on Document Analysis and Recognition (ICDAR 2024).
JuanCarlosMartinezSevilla/MuRET-UserTool-deprecated
This repository aims to give MuRET users a simple way to train deep learning models for an efficient transcription process.
LATIS-DocumentAI-Team/DocumentAI-std
DocumentAI-std is a Python library designed to facilitate and standardize document analysis and processing tasks. It offers functionality for handling document elements, performing optical character recognition (OCR), and managing document datasets.
miku/grobidclient
A Go (golang) client for GROBID.
MILE-IISc/DegradedWordsKannada
Benchmarking dataset of degraded word images (with character splits) in Kannada along with their associated ground truth Unicode text
MILE-IISc/MergedSymbolsKannada
Benchmarking dataset of merged symbols in Kannada along with their associated ground truth Unicode text
moured/Document-Graphics-Digitization
Official repo for the ICDAR 2023 paper "Line Graphics Digitization: A Step Towards Full Automation"
billyotieno/haki-tech
This is a repository of legal tech startup activities and projects.
Leg0shii/smart-documents
A web application that enables users to upload documents and utilize AI techniques like semantic search and text summarization for efficient analysis. Built with Python, FastAPI, Svelte, PostgreSQL, and LangChain.
teohsinyee/resume-parsing
A record of the process of building a pipeline for resume parsing.
x1ao4/doc-merger
Merge two relatively incomplete documents into one complete document via a Python script
yasuhiroinoue/Gemini_Discordbot_VertexAI
A Discord bot powered by Google Gemini Pro, capable of text generation, image analysis, audio transcription, and more.
marmiskarian/Documents-Timeline-Generation
Event-based Timeline Generation Tool for Document Analysis based on OpenAI APIs.
AlinaBaber/Data-Science-and-Insight-Agent-RAG-LLama3-Lava-LLM-Django-Api
Data-Science-and-Insight-Agent-RAG-LLama3-Lava-LLM-Django-WebApplication is an advanced AI-driven chatbot designed to assist in data science, document analysis, and image interpretation. This repository contains the Django-based REST APIs of this project.
AlinaBaber/Document-Analysis-Pipeline-with-RAG-Vector-Database-and-Mistral-LLM
This pipeline is a comprehensive document analysis system, designed to automate the processing and analysis of documents from acquisition to consumption. It integrates advanced machine learning & AI models like RAG (Retrieval Augmented Generation) & Mistral LLM to efficiently extract, match, enrich, process document
sourceduty/Opinionated_Analysis_Report
📄 Create an opinionated analysis report of a document, influenced by personality traits.