numpy.ndarray has no pop
Closed this issue · 2 comments
I tried out master, and it seems I have a numpy install issue.
python extracttab.py -i pdf.pdf -p 1
Traceback (most recent call last):
  File "extracttab.py", line 482, in <module>
    cells.extend(process_page(pgs))
  File "extracttab.py", line 270, in process_page
    vd.pop(i)
AttributeError: 'numpy.ndarray' object has no attribute 'pop'
Any ideas how to fix this?
It's true that numpy arrays don't support pop:
import numpy; numpy.zeros(10).pop()
AttributeError: 'numpy.ndarray' object has no attribute 'pop'
I guess that means lines 268 to 282 of the code were never exercised by the test cases (both vd and hd are numpy arrays). Do you have large black borders in your PDF? The maxdiv=10 limit is being exceeded, which suggests the code found 'dividers' that are thicker than 10 pixels.
numpy provides 'delete' in place of pop (note that numpy.delete returns a new array rather than modifying the original). Could you update your repository and try the code again? This change (4a29df1) will still remove thick dividers, but it might also remove some table cells. If that happens, you need to change the 'maxdiv' value to something larger so the dividers are not ignored.
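For anyone hitting the same error elsewhere, here is a minimal sketch of swapping a list-style pop for numpy.delete. It is not the code from 4a29df1; the values in vd are made up for illustration, and only the name vd is taken from the traceback.

```python
import numpy as np

# Hypothetical stand-in for the divider positions held in `vd`.
vd = np.array([12, 13, 14, 80, 150])

# `vd.pop(1)` raises AttributeError on an ndarray. np.delete returns a
# new array without the given index, so the result must be assigned back.
vd = np.delete(vd, 1)

# Several positions can be dropped in one call, which avoids the index
# shifting that occurs when elements are removed one at a time in a loop.
trimmed = np.delete(vd, [0, 1])

# If list semantics are really wanted, convert first and pop as usual.
as_list = vd.tolist()
last = as_list.pop()
```

Because np.delete copies the array on every call, collecting the indices first and deleting them in a single call is usually the cheaper pattern inside a loop.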
Thanks, I get output now. I will check what the result looks like.