
This repository contains the dataset associated with a UIST'23 submission (ID 2823).

The repository contains:

  1. All the original video frames (from 21 accessibility-related videos, divided into 81 segments)
  2. Ground-truth annotations for all frames of 31 video segments
  3. A list of accessibility-related objects
  4. Outputs of two VQA models (GPV-1 and BLIP) and their comparison with the ground truth, where available