Pinned Repositories
community
Open source libraries and APIs for building custom preprocessing pipelines for labeling, training, or production machine learning workflows.
pipeline-sec-filings
Preprocessing pipeline notebooks and API supporting text extraction from SEC documents
UNS-MCP
unstructured
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.
unstructured-api
unstructured-inference
unstructured-ingest
unstructured-js-client
A JavaScript/TypeScript client for the Unstructured Platform API
unstructured-python-client
A Python client for the Unstructured Platform API
unstructured.PaddleOCR
Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that recognizes 80+ languages, provides data annotation and synthesis tools, and supports training and deployment on server, mobile, embedded, and IoT devices)
Unstructured's Repositories
Unstructured-IO/community
Open source libraries and APIs for building custom preprocessing pipelines for labeling, training, or production machine learning workflows.
Unstructured-IO/irs-manual-demo
Unstructured-IO/chat-isw-reports
Unstructured-IO/pipeline-receipts
Preprocessing pipeline notebooks and API supporting text extraction from receipt images
Unstructured-IO/unstructured.Paddle
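PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)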
Unstructured-IO/langchainjs
Unstructured-IO/prometheus-community-helm-charts
Prometheus community Helm charts