Parameters to specify number of columns and also each column's area
premadh opened this issue · 1 comments
This is a suggested code or documentation change, improvement to the code, or feature request
The package is great and works in most conditions (many thanks for this), but it has also made me lazy: I don't want to wrangle misread PDF pages by hand. Hence the request below.
Provide a parameter or method to specify the number of columns and the start and end coordinates of each column, so that the table is extracted properly. For some PDFs, I have found that the columns are misaligned.
I would also support this improvement. Using the 'columns' argument in extract_tables does not always work well when some columns are populated with blank values in the initial rows.
One approach would be to use the area function, but applied to each column on a PDF page.
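To make the idea concrete, here is a rough, library-agnostic sketch in Python. It is not this package's API: the function `assign_to_columns`, its parameters, and the sample coordinates are all hypothetical. It assumes the text on a page is already available as (x, y, text) tuples, and it bins each piece of text into a column by explicit start and end x coordinates, so a column that is blank in the first rows still keeps its slot.

```python
from collections import defaultdict


def assign_to_columns(words, column_bounds, row_tolerance=2.0):
    """Bin positioned text into explicit column ranges (hypothetical helper).

    words:          list of (x, y, text) tuples, assumed already extracted
    column_bounds:  list of (x_start, x_end) pairs, one per column
    row_tolerance:  y values are grouped into rows by round(y / row_tolerance)
    """
    # Every row starts with one empty cell per column, so blank cells survive.
    rows = defaultdict(lambda: [""] * len(column_bounds))
    for x, y, text in words:
        row_key = round(y / row_tolerance)
        for i, (start, end) in enumerate(column_bounds):
            if start <= x < end:
                cell = rows[row_key][i]
                rows[row_key][i] = f"{cell} {text}".strip()
                break  # text outside every column range is simply dropped
    return [rows[key] for key in sorted(rows)]


# Made-up coordinates: the middle column is blank in the first row.
words = [
    (50, 100, "Alice"), (300, 100, "42"),
    (50, 120, "Bob"), (180, 120, "NYC"), (300, 120, "37"),
]
bounds = [(0, 150), (150, 250), (250, 400)]
table = assign_to_columns(words, bounds)
```

With explicit bounds, the first row comes out with an empty middle cell instead of having "42" shifted left, which is the misalignment described above.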
I hope this enhancement can be incorporated into what is a really useful and effective package.