apple/ml-hypersim

Is `camera_keyframe_frame_indices.hdf5` just a continuous array from 0 to the number of frames?


Hi Mike, thank you very much for your great work!

I have a small question: what is camera_keyframe_frame_indices.hdf5 used for? I read and printed ai_003_010/_detail/cam_00/camera_keyframe_frame_indices.hdf5 and found that it looks like this:

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
       51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
       68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
       85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])

I am puzzled: is camera_keyframe_frame_indices.hdf5 just a continuous array from 0 to the number of frames? If so, why is it saved? If not, what is it?
Thanks in advance!

Hi! Great question. You are correct that camera_keyframe_frame_indices.hdf5 just stores range(100), so in the case of our public dataset, this file is redundant and does not need to be stored explicitly.
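
If you want to confirm this for another camera trajectory, a check along the following lines should work. This is only a sketch: `is_trivial_keyframe_indices` is a hypothetical helper name, not something from our codebase, and it assumes you have already loaded the file contents into a sequence of integers (e.g., with h5py).

```python
def is_trivial_keyframe_indices(frame_indices):
    """Return True if frame_indices is exactly 0, 1, ..., N-1.

    Hypothetical helper (not part of the Hypersim code): accepts any
    sequence of integers, e.g. the array loaded from
    camera_keyframe_frame_indices.hdf5.
    """
    return all(value == i for i, value in enumerate(frame_indices))


# The array printed above is range(100), so the file carries no extra information.
print(is_trivial_keyframe_indices(range(100)))
# A sparse keyframe list would need interpolation between keyframes.
print(is_trivial_keyframe_indices([0, 10, 30, 1000]))
```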

So why do we bother storing this data? We do so because of how the rest of our rendering pipeline works. Our pipeline is capable of rendering more images than camera pose keyframes, and therefore requires the user to specify a frame ID for each camera pose. For example, if camera_keyframe_frame_indices contained the values [0, 10, 30, 1000], our pipeline would expect 4 camera poses, and would render a total of 1000 frames. Frames 0-10 would linearly interpolate between camera poses 0 and 1; frames 10-30 would linearly interpolate between camera poses 1 and 2; and frames 30-1000 would linearly interpolate between camera poses 2 and 3.
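
To make the keyframe idea concrete, here is a minimal sketch of linear interpolation between keyframes. It is an illustration rather than the pipeline's actual code: `interpolate_position` and the example positions are made up, and it only handles camera positions as (x, y, z) tuples. How camera orientations are interpolated is not covered here.

```python
from bisect import bisect_right


def interpolate_position(keyframe_frames, keyframe_positions, frame):
    """Linearly interpolate a camera position at `frame`.

    Hypothetical helper for illustration only.
    keyframe_frames: strictly increasing frame indices, one per keyframe.
    keyframe_positions: one (x, y, z) tuple per keyframe.
    """
    if len(keyframe_frames) != len(keyframe_positions):
        raise ValueError("need exactly one frame index per camera pose")
    if not keyframe_frames[0] <= frame <= keyframe_frames[-1]:
        raise ValueError("frame lies outside the keyframe range")
    if len(keyframe_frames) == 1:
        return tuple(keyframe_positions[0])

    # Find the segment [keyframe i, keyframe i+1] that contains `frame`;
    # the last keyframe is treated as the end of the final segment.
    i = min(bisect_right(keyframe_frames, frame) - 1, len(keyframe_frames) - 2)
    f0, f1 = keyframe_frames[i], keyframe_frames[i + 1]
    t = (frame - f0) / (f1 - f0)  # 0.0 at keyframe i, 1.0 at keyframe i+1
    p0, p1 = keyframe_positions[i], keyframe_positions[i + 1]
    return tuple((1 - t) * a + t * b for a, b in zip(p0, p1))


# Made-up example: 4 camera poses pinned to frames 0, 10, 30, and 1000.
frames = [0, 10, 30, 1000]
positions = [(0, 0, 0), (10, 0, 0), (10, 20, 0), (10, 20, 970)]

print(interpolate_position(frames, positions, 5))     # halfway between poses 0 and 1
print(interpolate_position(frames, positions, 10))    # exactly pose 1
print(interpolate_position(frames, positions, 515))   # halfway between poses 2 and 3
```

With frame indices set to range(N), every frame lands exactly on a keyframe, so no interpolation takes place.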

We ultimately decided that we didn't want to linearly interpolate camera poses in our public dataset, i.e., we wanted every camera pose to be stored explicitly. But we kept the interpolation functionality in our code because we think it's useful. And of course it's totally optional. If you don't want interpolation, you can always supply N camera poses and set camera_keyframe_frame_indices to range(N).

All of that being said, just because our code requires the user to specify a potentially trivial camera_keyframe_frame_indices file, that doesn't mean that we needed to include it in our public release. But we chose to do so to make it as easy as possible for anyone who wants to re-run our code.