increase coverage
bertsky opened this issue · 0 comments
bertsky commented
- `Confidence` (unfortunately, this conflates Coords and Text@conf)
- `TextType` (`HANDWRITING` → `@production=handwritten-printscript|handwritten-cursive`, `PRINTED` → `@production=printed`)
- support tables:
  - top-level `TableRegion` for `TABLE` block
  - recursive `TextRegion` for `CELL` block (i.e. `ColumnIndex` → `Roles/TableCellRole/@columnIndex`, `RowIndex` → `Roles/TableCellRole/@rowIndex`)
  - recursive `TextRegion` for `MERGED_CELL` block (i.e. `ColumnSpan` → `Roles/TableCellRole/@colSpan`, `RowSpan` → `Roles/TableCellRole/@rowSpan`); diverging recursion between Textract and PAGE?
  - recursive `TextRegion` for `TABLE_TITLE` and `TABLE_FOOTER` block (i.e. `Roles/TableCellRole/@header` ... or via ReadingOrder)
  - `EntityTypes`: `STRUCTURED_TABLE|SEMI_STRUCTURED_TABLE` (unclear how to represent in PAGE), `TABLE_TITLE|TABLE_SECTION_TITLE|TABLE_FOOTER|TABLE_SUMMARY|COLUMN_HEADER` (unclear how this looks and compares with the actual recursive `BlockType`)?
  - also via ordered groups in ReadingOrder?
  - unclear: `LineItemGroup` and `LineItems`
- top-level `PageClassification/PageType` (unclear, but probably `Page/@type`)
- support forms
  - `BlockType=KEY_VALUE_SET` and `EntityTypes=KEY|VALUE` → unclear how to represent: TableRegion or recursive TextRegion? Labels/Label?
- support checkboxes within tables or forms
  - `BlockType=SELECTION_ELEMENT` and `SelectionStatus=SELECTED|NOT_SELECTED` → unclear how to represent
- ignore query type
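
To make the `TextType` and table-cell mappings above more concrete, here is a rough sketch using only the standard library. It is not the converter's actual code: the function names, the plain dict input, and the un-namespaced elements are assumptions for illustration. It also assumes Textract's `RowIndex`/`ColumnIndex` are 1-based while PAGE's `TableCellRole` indices are 0-based (worth checking against the schema), and it arbitrarily picks `handwritten-cursive` for `HANDWRITING`, since Textract does not distinguish the two handwritten variants.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from Textract TextType to PAGE @production.
# HANDWRITING could equally be "handwritten-printscript"; the choice is arbitrary.
PRODUCTION_BY_TEXT_TYPE = {
    "PRINTED": "printed",
    "HANDWRITING": "handwritten-cursive",
}


def word_production(block):
    """Return a PAGE @production value for a Textract WORD block, or None."""
    return PRODUCTION_BY_TEXT_TYPE.get(block.get("TextType"))


def cell_region(block):
    """Build a bare TextRegion with a TableCellRole from a CELL/MERGED_CELL block.

    Coords, namespaces and nested content are omitted, so the result
    is not schema-valid PAGE on its own.
    """
    region = ET.Element("TextRegion", {"id": "region_" + block["Id"]})
    roles = ET.SubElement(region, "Roles")
    entity_types = block.get("EntityTypes") or []
    ET.SubElement(roles, "TableCellRole", {
        # Assumption: shift 1-based Textract indices to 0-based PAGE indices.
        "rowIndex": str(block["RowIndex"] - 1),
        "columnIndex": str(block["ColumnIndex"] - 1),
        "rowSpan": str(block.get("RowSpan", 1)),
        "colSpan": str(block.get("ColumnSpan", 1)),
        "header": str("COLUMN_HEADER" in entity_types).lower(),
    })
    return region


# Hypothetical CELL block, shaped like an entry of a Textract "Blocks" list.
cell = {
    "Id": "c1",
    "BlockType": "CELL",
    "RowIndex": 1,
    "ColumnIndex": 2,
    "RowSpan": 1,
    "ColumnSpan": 1,
    "EntityTypes": ["COLUMN_HEADER"],
}
role = cell_region(cell).find("Roles/TableCellRole")
print(role.attrib)
```

This leaves the open questions from the list untouched: where `TABLE_TITLE`/`TABLE_FOOTER` go, and how `KEY_VALUE_SET` and `SELECTION_ELEMENT` should be represented.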