AttributeError: 'NoneType' object has no attribute 'encode' with load_file
#390 opened by umaplehurst - 1
Does not install on Python 3.11+
#381 opened by atompkins - 1
Py PDF Parser tests are distributed in PyPI wheels
#383 opened by aiden2480 - 2
ValueError on empty ElementList.after()
#386 opened by aiden2480 - 0
Switch to trusted publishing on PyPI
#370 opened by jstockwin - 1
Standardizing contains and equals
#373 opened by AndersWoodruff - 2
Filter text ignoring case
#371 opened by ARandomPerson07 - 2
Unable to create a PDFDocument object
#369 opened by papstchaka - 1
How to run the lint process?
#351 opened by dantehemerson - 2
ElementList filter on visualise function does not work
#255 opened by mcrts - 9
Document regular expression font mapping
#237 opened by Aceto1 - 1
Add more tests for the visualise tool
#219 opened by jstockwin - 9
keep getting an error when trying to visualise
#204 opened by rannndom - 2
Use of Visualize
#122 opened by dpieski - 3
Release v0.8.0?
#200 opened by AldenPeterson - 2
Element extraction in original order
#190 opened by zheyaf - 0
[loaders] Loads accept LTTextLines as top level pdfminer elements, which breaks things
#154 opened by jstockwin - 3
Unable to install with pip3
#123 opened by chookity-pokk - 0
Ensure CI runs on PRs
#118 opened by jstockwin - 9
Consider using sorted sets?
#114 opened by jstockwin - 0
Update when to use py-pdf-parser documentation
#108 opened by jstockwin - 0
What is the prefix:'PCAGML' of font:"PCAGML+SourceHanSerifCN-Regular,16.0"?
#100 opened by forhonourlx - 0
Too large a tolerance causes an error
#102 opened by jstockwin - 0
Add code coverage checks to CI
#104 opened by jstockwin - 0
Include text which is within figures
#98 opened by jstockwin - 0
Allow different element orderings
#94 opened by jstockwin - 0
Finish the info screen on visualise tool
#93 opened by jstockwin - 0
Use LTChar.size to extract the font size
#92 opened by jstockwin - 0
Add __repr__ to section class
#63 opened by jstockwin - 2
Add feature to remove duplicate header rows
#76 opened by jstockwin - 4
`create_section` should throw a better error if it isn't passed a `PDFElement`
#75 opened by jstockwin - 1
[performance] Disable advanced layout analysis
#50 opened by jstockwin - 0
Publish to PyPI
#79 opened by jstockwin - 1
[tests] Create some tests which use real PDFs
#56 opened by jstockwin - 0
Add some examples to the documentation
#48 opened by jstockwin - 0
Filtering by fonts is broken
#77 opened by jstockwin - 3
Cache filtering by font
#64 opened by jstockwin - 0
Better visualisations of sections
#66 opened by jstockwin - 0
Extract simple table could be more efficient
#62 opened by jstockwin - 0
Change font sizes to floats
#59 opened by jstockwin - 0
Run tests on GitHub Actions
#47 opened by jstockwin