ispras/dedoc
Dedoc is a library (and service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser
Python · Apache-2.0
Stargazers
- AigizK
- alexpdev (California)
- AndrianoTurner
- anfedotoff (@yandex-cloud)
- apach301 (Prague)
- avtomatik (Moscow, Russia)
- bmefder
- capncodewash (Glasgow)
- Cool1097
- DaryaPopova
- de3 (Jakarta)
- dimas3452 (Russia)
- dyscarnate
- enzet (Institute for System Programming RAS)
- epodak
- gromnero
- homocomputeris
- inexyomg
- irina-is
- ivanstepanovftw (Turkey)
- johnnypea (Webikon)
- kirilltobola (ISDCT SB RAS)
- m4gshm (Raiffeisen Bank)
- murashov-a (Russia, St. Petersburg)
- NastyBoget (ISP RAS)
- nirname
- Pattriarch (Russia)
- SecNotice (SecNotice)
- shcheklein (Iterative.ai)
- socloseeee (СБЕР)
- sunveil (ISDCT SB RAS)
- SweetVishnya (@yandex-cloud)
- Travvy88 (ISPRAS)
- turdakov (ISPRAS)
- uco-physics
- vsagelimit