google-research-datasets/screen_annotation
The Screen Annotation dataset consists of pairs of mobile screenshots and their annotations. The annotations are in text format and describe the UI elements present on the screen: their type, location, OCR text, and a short description. The dataset was introduced in the paper `ScreenAI: A Vision-Language Model for UI and Infographics Understanding`.
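As a rough illustration of the four pieces of information each annotation carries (type, location, OCR text, short description), the sketch below models one UI element in Python. It is not the dataset's actual serialization or an official parser: the class names `BoundingBox` and `UIElement`, all field names, the corner-based box layout, and the example values (including the `"BUTTON"` type label) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BoundingBox:
    # Hypothetical layout: two opposite corners. The coordinate order and
    # units actually used by the dataset are not stated in the description.
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        """Return the box area, or zero for a degenerate (inverted) box."""
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


@dataclass(frozen=True)
class UIElement:
    # One field per item listed in the dataset description.
    element_type: str                   # kind of UI element
    box: BoundingBox                    # location on the screen
    ocr_text: Optional[str] = None      # text read from the element, if any
    description: Optional[str] = None   # short description, if any


# Illustrative values only; not taken from the dataset.
example = UIElement(
    element_type="BUTTON",
    box=BoundingBox(x_min=10, y_min=20, x_max=110, y_max=60),
    ocr_text="Sign in",
    description="a button that opens the sign-in page",
)
print(example.element_type, example.box.area())
```

Keeping the location in its own small type makes it straightforward to add geometric helpers later (for example, overlap between a predicted and a reference box) without touching the element record itself.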
Issues
- How to get F1 score @ IoU=0.1? (#3 opened by luyy12)
- coordinates meaning (#2 opened by SivanDoveh)
- How can I parse the annotation file? (#1 opened by hancheolcho)