BMKEG/lapdftext
LA-PDFText is a system for extracting accurate text from PDF-based research articles, with an interface for improving performance where needed. The system is open-source and provides a simple baseline function for extracting text from primary research articles using rules that developers can customize. It works quite well for most applications, though it may occasionally make mistakes and extract the wrong text; when it does, you can always 'hack' your own rules to improve performance.
Java · GPL-3.0
Issues
- 0
Executable file cannot be downloaded
#38 opened by brillience - 0
Is lapdftext dead? Is there an alive fork?
#37 opened by MartinThoma - 0
java null pointer error
#36 opened by aswaz - 9
Unable to build latest version
#29 opened by oersted - 1
Skipping last page
#35 opened by jacobmyers-codeninja - 1
Build succeeded but unable to run commands
#32 opened by joshgavinhong - 0
separated sections into text file
#34 opened by naimavahab - 3
- 4
how use lapdftext?
#12 opened by juanbits - 7
some classes are missing
#33 opened by naimavahab - 0
lapdftext as a library; Play framework
#31 opened by vibhor-varshney - 0
Missing DebugLapdfFeatures
#30 opened by dwightkelly - 3
installers page not working
#22 opened by pats - 7
Compile fails
#8 opened by JohannesBuchner - 0
- 0
Regarding support for Chinese text
#25 opened by joanne-tseng - 0
- 3
Errors while extracting several files from a folder
#19 opened by jonzl - 2
BlockifyClassify a three column document
#21 opened by kinow - 1
Major.minor version error
#20 opened by vdavez - 0
- 0
Simplify the RuleBasedParser.buildChunkBlocks subroutine to speed up system
#15 opened by GullyAPCBurns - 1
Null pointer during parsing
#5 opened by GullyAPCBurns - 3
blockifyClassify doesn't produce any output
#9 opened by rahuljha - 3
problem to Classify pdf blocks
#13 opened by juanbits - 7
- 4
mvn test failure
#11 opened by juanbits - 1
colons and parentheses are removed
#7 opened by GullyAPCBurns - 0
- 1
Access through an API?
#3 opened by liar666