CAMMA-public/cholectriplet2021

Issue with data loader

Closed · 2 comments

In the Data Loading and Visualization cells there is this line:

for line in reader: line = np.array(line, np.uint8)

uint8 ranges from 0 to 255, so any frame index above 255 wraps around (modulo 256) during the cast to uint8, e.g.

in: np.array([256], np.uint8)
out: [0]
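
To make the wraparound concrete, here is a minimal sketch. The frame indices are made up for illustration. It uses `astype` because that cast wraps silently, whereas newer NumPy releases may raise an error when an out-of-range Python integer is passed directly to `np.array(..., np.uint8)`.

```python
import numpy as np

# Hypothetical frame indices, chosen only to illustrate the wraparound.
frame_ids = np.array([0, 1, 255, 256, 257, 512])

# Casting to uint8 keeps only the low 8 bits, i.e. the value modulo 256.
wrapped = frame_ids.astype(np.uint8)

print(wrapped.tolist())  # [0, 1, 255, 0, 1, 0]
```

Frames 0, 256 and 512 all collapse to index 0, which is how distinct frames end up pointing at the same image.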

This results in a dataset of mostly repeating frames with the wrong labels. It is not a problem for the sample dataset, whose maximum frame index is below 255, but anyone who adapts this code for the competition training set will run into it.
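
One possible fix, sketched under assumptions: the row layout (frame index in the first column, followed by binary labels) and the in-memory `sample` data are hypothetical stand-ins for the real label file, and this is not necessarily the change the maintainers made. The idea is simply to parse each row into a wider integer type.

```python
import csv
import io

import numpy as np

# Hypothetical label rows: frame index first, then binary labels (assumed layout).
sample = io.StringIO("254,0,1\n255,1,0\n256,0,0\n300,1,1\n")

rows = []
for line in csv.reader(sample):
    # int64 instead of uint8, so frame indices above 255 stay intact.
    rows.append(np.array([int(value) for value in line], dtype=np.int64))

labels = np.stack(rows)
frame_ids = labels[:, 0]

print(frame_ids.tolist())  # [254, 255, 256, 300]
```

If memory matters, the label columns could still be stored as uint8 separately, as long as the frame index column keeps a wider type.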

Side note: three videos (78, 79 and 80) have a resolution of 1080x1920.

Hey, nice catch. We have edited the notebook to avoid any potential confusion.
About the video resolutions, we provide the videos at the acquired resolutions and allow participants to use the data however they see fit.

Thanks for your quick response and edit!