qinqiang2000's Stars
getomni-ai/zerox
Zero-shot PDF OCR with gpt-4o-mini
VikParuchuri/tabled
Detect and extract tables to markdown and csv
VikParuchuri/surya
OCR, layout analysis, reading order, table recognition in 90+ languages
Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
gildas-lormeau/single-file-cli
CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)
gildas-lormeau/SingleFile
Web Extension for saving a faithful copy of a complete web page in a single HTML file
coolwanglu/pdf2htmlEX
Convert PDF to HTML without losing text or format.
benbalter/word-to-markdown-js
Convert Word documents to beautiful Markdown. Via command line or in your browser.
AlrasheedA/st-link-analysis
A custom Streamlit component for link analysis, built with Cytoscape.js and Streamlit.
ChrisDelClea/streamlit-agraph
A Streamlit Graph Vis
JonathanLink/PDFLayoutTextStripper
Converts a pdf file into a text file while keeping the layout of the original pdf. Useful to extract the content from a table in a pdf file for instance. This is a subclass of PDFTextStripper class (from the Apache PDFBox library).
opendatalab/labelU
Data annotation toolbox that supports image, audio, and video data.
opendatalab/MinerU
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
dataease/dataease
🔥 An open-source BI tool that anyone can use; an open-source alternative to Tableau and FanRuan.
Skyvern-AI/skyvern
Automate browser-based workflows with LLMs and Computer Vision
lfoppiano/streamlit-pdf-viewer
Streamlit PDF viewer
B4PT0R/streamlit_pdf_reader
Streamlit pdf reader component
aghasemi/streamlit_js_eval
A custom Streamlit component to evaluate arbitrary Javascript expressions
clovaai/donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Loyalsoldier/clash-rules
🦄️ 🎃 👻 Clash Premium rule sets (RULE-SET), compatible with ClashX Pro, Clash for Windows, and other clients based on the Clash Premium core.
PablocFonseca/streamlit-aggrid-examples
Examples Repository for streamlit-aggrid
JensWalter/my-receipts
my personal receipts collected all over the world
katanaml/sparrow-donut
Data extraction with Donut ML model
katanaml/sparrow
Data processing with ML and LLM
andfanilo/streamlit-drawable-canvas
Do you like Quick, Draw? Well what if you could train/predict doodles drawn inside Streamlit? Also draws lines, circles and boxes over background images for annotation.
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
gauravjakhar/pdfchat
kinfey/SemanticKernelCookBook
Semantic Kernel's cook book
microsoft/semantic-kernel
Integrate cutting-edge LLM technology quickly and easily into your apps
agiresearch/AIOS
AIOS: LLM Agent Operating System