Pinned Repositories
dehyphen
📜 Dehyphenation of broken text (mainly German), e.g., text extracted from a PDF
pd3-flair
Flair's language models without unnecessary dependencies
pd3f
🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based
pd3f-core
📑 Python package to reconstruct the original continuous text from PDFs with language models
pd3f-dataset-bmjv
Dataset of (mostly German) PDFs used to develop pd3f
pd3f-results
Results with pd3f on some PDF datasets
pd3f.com
📝 Website to advertise & document pd3f
pd3f's Repositories
pd3f/pd3f
🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based
pd3f/dehyphen
📜 Dehyphenation of broken text (mainly German), e.g., text extracted from a PDF
pd3f/pd3f-core
📑 Python package to reconstruct the original continuous text from PDFs with language models
pd3f/pd3-flair
Flair's language models without unnecessary dependencies
pd3f/pd3f-dataset-bmjv
Dataset of (mostly German) PDFs used to develop pd3f
pd3f/pd3f-results
Results with pd3f on some PDF datasets
pd3f/pd3f.com
📝 Website to advertise & document pd3f