Merge_cell call does not work as expected
sawasume opened this issue · 2 comments
Hello,
I am using this library to parse the JSON generated by Textract. Many of my tables contain merged cells, and I need to read the text of those cells and then write each table out as a CSV file.
Below is the code I am using:
import json
from trp import Document  # assuming the amazon-textract-response-parser package

with open(table_case_3, 'r') as pfz_doc:
    pfz_textract_json = json.load(pfz_doc)
tdoc = Document(pfz_textract_json)
When I use this call to print the contents of a merged cell, some of the text is omitted:
table.rows[i].cells[j].mergedText
Below is an image of a cell-by-cell comparison for a table where the text of interest was "3 or 4 days", but the above call extracted only "3 4".
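For reference, a comparison like the one in the image can be produced with something along these lines. This is only a sketch: `compare_cells` is a hypothetical helper name, not part of the library, and the stand-in objects below merely mimic a table that exposes `rows`, `cells`, `text` and `mergedText`, so the snippet runs without a Textract response. The stand-in values follow the 'supplied by' case from this report (the `.text` of the first cell is assumed), and the indices printed are zero-based.

```python
from types import SimpleNamespace


def compare_cells(table):
    """Collect (row, col, text, mergedText) for every cell of a table-like object."""
    results = []
    for i, row in enumerate(table.rows):
        for j, cell in enumerate(row.cells):
            text = (cell.text or "").strip()
            # Fall back to an empty string if the attribute is missing or None.
            merged = (getattr(cell, "mergedText", "") or "").strip()
            results.append((i, j, text, merged))
    return results


# Stand-in objects (assumed shape), mirroring the behaviour reported in this issue.
demo_table = SimpleNamespace(rows=[
    SimpleNamespace(cells=[
        SimpleNamespace(text="supplied ", mergedText="supplied "),
        SimpleNamespace(text="by ", mergedText="supplied "),
    ])
])

for i, j, text, merged in compare_cells(demo_table):
    print(f"row {i} cell {j}: text={text!r} mergedText={merged!r}")
```

With a real document, the same helper could be called on each table of each page instead of `demo_table`.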
Another example is a merged cell whose text was 'supplied by'.
As you can see in the image, both cell 1 and cell 2 of row 0 display the same word, 'supplied', when using the mergedText call,
whereas row 0, cell 2 displays the word 'by' when using the .text call.
My expectation is that both cell 1 and cell 2 of row 0 should display 'supplied by' when using the mergedText call.
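To make the end goal concrete, here is a rough sketch of the CSV export I have in mind, assuming each cell exposes `text` and `mergedText` as in the call above. `table_to_csv` is a hypothetical helper of my own, not a library function, and the stand-in table holds the values I would expect to get back, not what the library currently returns.

```python
import csv
import io
from types import SimpleNamespace


def table_to_csv(table):
    """Serialize a table-like object to CSV text, preferring mergedText over text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in table.rows:
        writer.writerow([
            # Use mergedText when it is present and non-empty, else plain text.
            (getattr(cell, "mergedText", None) or cell.text or "").strip()
            for cell in row.cells
        ])
    return buffer.getvalue()


# Stand-in table (assumed shape) holding the expected mergedText values.
expected_table = SimpleNamespace(rows=[
    SimpleNamespace(cells=[
        SimpleNamespace(text="supplied ", mergedText="supplied by "),
        SimpleNamespace(text="by ", mergedText="supplied by "),
    ])
])

csv_text = table_to_csv(expected_table)
print(csv_text)
```

Repeating the merged text in every spanned column keeps the CSV rectangular, which is why I expect both cells to return the full string.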