/trickypdf

Turn PDF documents into simple annotated XML for further processing in a corpus preparation pipeline.

Primary Language: R
