aws-samples/amazon-textract-textractor
Analyze documents with Amazon Textract and generate output in multiple formats.
Jupyter Notebook · Apache-2.0
Issues
`get_text_from_layout_json` throws `'NoneType' object is not subscriptable` for a specific PDF
#411 opened by neil-sola - 1
Textractor doesn't detect the INVOICE_RECEIPT_ID, but the AWS Textract Demo can
#408 opened by arsher-b - 2
Is search_words() broken?
#371 opened by ttruong-gilead - 0
error: Textractor.detect_document_text() got an unexpected keyword argument 's3_output_path'
#409 opened by elbbub - 8
KeyError: 'Relationships'
#406 opened by lucio-xelda - 1
The invoice number won’t be detected if there is no space between the label and the value
#397 opened by arsher-b - 1
lambda layers builds are broken
#399 opened by gauravthadani - 1
[textractprettyprinter] List contents are duplicated when generating text output using `get_text_from_layout_json`
#391 opened by adityachandak287 - 8
Trouble replicating markdown output
#384 opened by bvbg1 - 3
Incorrect order of text layouts due to compare_bounding_box() used in group_elements_horizontally()
#389 opened by keitaf - 0
Incorrect table cell word and line order
#369 opened by wessens - 4
issue regarding .to_markdown() method
#380 opened by red-sky17 - 1
Detected in EXPENSE_ROW but not as ITEM
#385 opened by arsher-b - 0
InvalidParameterException: Request has invalid parameters when using startDocumentAnalysis
#383 opened by arunsingh28 - 1
Lambda layers for Python 3.12 PDF raising an exception on missing libpng16.so.16
#373 opened by Viajante80 - 3
Save image doesn't work with S3 path - TypeError: Invalid input type 'bytearray'
#382 opened by steffeng - 3
Empty expense_documents on analyze_expense
#370 opened by arsher-b - 0
'NoneType' object has no attribute 'spatial_object' on Expense Analysis results
#368 opened by HarryTSaban - 2
feature request: add query alias parameter
#361 opened by parad0x96 - 2
cell content extraction error
#355 opened by Larbo53 - 1
Table cell, incorrectly, does not pick up the cell text/words. Page--> Line picks up the words as in the textract output
#358 opened by raidken - 2
Access Non-Axis-Aligned Bounding Boxes
#359 opened by zkalson - 1
Cryptic CLI error in SageMaker Studio (and probably other role-based environments?)
#352 opened by athewsey - 1
[Feature Request] Simplified batch processing CLI
#353 opened by athewsey - 0
Python Support for Column Headers
#351 opened by Belval - 1
Exporting text+tables while maintaining layout
#347 opened by austinmw - 0
KeyError in get_lines_string
#348 opened by sbui-dev - 0
Proper way of getting cell content?
#336 opened by ttruong-gilead - 1
Textractor import error
#338 opened by umaaaaaaaaa - 4
Missing CITATION.cff file for repo
#331 opened by mhucka - 0
Large PDF response processing is slow
#337 opened by Belval - 2
Queries ordering is not preserved after parsing
#328 opened by Belval - 0
Query entity is not linearizable
#327 opened by Belval - 1
Caller: allow early return when job incomplete
#326 opened by symroe