aws-samples/amazon-textract-textractor
Analyze documents with Amazon Textract and generate output in multiple formats.
Jupyter Notebook · Apache-2.0
Issues
- 2
feature request: add query alias parameter
#361 opened by parad0x96 - 2
cell content extraction error
#355 opened by Larbo53 - 1
Table cell, incorrectly, does not pick up the cell text/words. Page--> Line picks up the words as in the textract output
#358 opened by raidken - 2
Access Non-Axis-Aligned Bounding Boxes
#359 opened by zkalson - 1
Cryptic CLI error in SageMaker Studio (and probably other role-based environments?)
#352 opened by athewsey - 1
[Feature Request] Simplified batch processing CLI
#353 opened by athewsey - 0
Python Support for Column Headers
#351 opened by Belval - 1
Exporting text+tables while maintaining layout
#347 opened by austinmw - 0
KeyError in get_lines_string
#348 opened by sbui-dev - 5
Proper way of getting cell content?
#336 opened by ttruong-gilead - 1
Textractor import error
#338 opened by umaaaaaaaaa - 1
Missing CITATION.cff file for repo
#331 opened by mhucka - 0
Large PDF response processing is slow
#337 opened by Belval - 1
Queries ordering is not preserved after parsing
#328 opened by Belval - 0
Query entity is not linearizable
#327 opened by Belval - 2
Add python 3.12 support
#288 opened by tb102122 - 1
Caller: allow early return when job incomplete
#326 opened by symroe - 2
Mistake a text field above a table as table title
#318 opened by oonisim - 4
start_document_analysis high memory usage
#316 opened by ttruong-gilead - 1
Export document object to json
#309 opened by rnschmidt - 2
Issue with multipage PDFs on s3 without extension
#307 opened by lvieirajr - 3
Assertion error for larger PDF documents
#306 opened by rnschmidt - 2
In textract-pretty-printer tables to markdown conversion (sometimes) injects wrong table
#291 opened by dzmitry-kankalovich - 4
heuristic_line_break_threshold, along with other heuristic constants not doing anything
#294 opened by kostabasis - 1
Not able to fetch Handwritten Text from Document
#300 opened by naconcirrus - 0
two line import fix for get_layout_csv_from_trp2
#299 opened by scott-norm - 1
Issue with Markdown output (textractprettyprinter)
#274 opened by jpbalarini - 5
linearize_table False doesn't exclude table
#293 opened by eilam-stream - 2
Confidence score output
#295 opened by kostabasis - 4
PrettyPrinter error with csv output
#284 opened by noefingway - 2
Change of return type broke my script
#282 opened by aka-rabbi-inv - 4
Parse specific pages only
#276 opened by austinmw - 1
Analyze a document with multiple pages
#281 opened by alexandruvesa - 3
I would like to be able to use textractor without a profile by passing access_key, secret_access_key instead
#280 opened by mariame-m - 0
Generate layout.csv
#275 opened by schadem