joaanna/something_else

Which frame_list do you use when you train I3D and STRG?

Mihuadong opened this issue · 2 comments

First of all, thank you for your excellent work!
I noticed that there are two different frame lists, coord_frame_list and frames_list, and that they are obtained with different sampling strategies. Which one do you use to obtain the frames for I3D and STRG? Thank you!!

Hi, we sample frames using frames_list and coordinates using coord_frame_list. That is, the appearance models (I3D or STRG) use the frames sampled via frames_list, and the STIN networks use the bbox information from the frames sampled via coord_frame_list. We found that the appearance models benefit from uniformly sampled frames, as obtained by frames_list.

Thanks for the clarification!
I also wonder whether there is any specific reason that coord_frame_list is half the length of frames_list.
The default argument num_frames is 4, and sample_rate (line 26 in data_loader_frames.py) is 2, which means only 4 frames would be selected for frames_list (appearance models) and only 2 would be selected for coord_frame_list. Are there any suggestions for the choice of num_frames?
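
To make the arithmetic in the question concrete, here is a minimal sketch of how the two list lengths relate under those defaults. It is not the repository's code: the helper names `uniform_indices` and `sample_lists` are hypothetical, and the sketch assumes that coord_frame_list holds `num_frames // sample_rate` indices and that both lists are sampled uniformly. The actual loader may use a different strategy for coord_frame_list.

```python
def uniform_indices(n_video_frames, n_samples):
    """Return n_samples indices spread evenly over range(n_video_frames).

    Hypothetical helper: the video is split into n_samples equal segments
    and the centre of each segment is taken.
    """
    if n_video_frames <= 0 or n_samples <= 0:
        return []
    segment = n_video_frames / n_samples
    return [int(segment * i + segment / 2) for i in range(n_samples)]


def sample_lists(n_video_frames, num_frames=4, sample_rate=2):
    """Build both lists under the assumptions stated above.

    frames_list      -> num_frames indices (appearance models: I3D, STRG)
    coord_frame_list -> num_frames // sample_rate indices (STIN, bbox input)
    """
    frames_list = uniform_indices(n_video_frames, num_frames)
    # Assumption: the coordinate list is sampled independently and is shorter
    # by a factor of sample_rate.
    coord_frame_list = uniform_indices(n_video_frames, num_frames // sample_rate)
    return frames_list, coord_frame_list


frames_list, coord_frame_list = sample_lists(n_video_frames=40)
print(len(frames_list), len(coord_frame_list))  # 4 2
```

With the defaults quoted above (num_frames = 4, sample_rate = 2), the sketch gives 4 appearance frames and 2 coordinate frames, which matches the counts described in the question.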