/pdf-text-extraction-benchmark

A benchmark that evaluates how well existing PDF extraction tools semantically extract body text from PDF documents, especially scientific articles.

Primary language: TeX. License: MIT.
