google-deepmind/tapnet

BootsTAP Training Dataset

Closed · 1 comment

Hello,

Thanks for the great work on BootsTAP and for releasing the PyTorch version!

Are there any plans to release the training dataset of real videos used in BootsTAP?

Thanks in advance.

Unfortunately, for legal/policy reasons, we aren't able to release the video dataset.

However, it's worth pointing out that we did very little tuning of our choice of videos; they were originally scraped for a different purpose and remained unchanged for most of the project. The main guiding principles were:

1. We wanted real-world data with interesting motion.
2. We wanted to avoid cuts, since we expect temporal continuity is a useful prior for point tracking.
3. We wanted to avoid text and other overlays, as they tend to be semi-transparent and violate the assumption that a query point unambiguously identifies a point in the scene.

There are many ways to assemble such a dataset, and I don't expect our method to be very sensitive to the exact choice of data. In practice, we find that training on just 1% of our dataset (~150K clips) gives results very close to our original ones, so hopefully you won't find it terribly difficult to generate an equivalent dataset (see the sketch below).
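For anyone putting together a similar dataset, here's a minimal sketch of what a clip filter along these lines could look like. To be clear, this is not the pipeline we used: it assumes PySceneDetect for cut detection, and the directory layout, file extension, and detector threshold are all illustrative placeholders. An overlay/text filter (e.g. an OCR pass) would slot in alongside the cut check.

```python
# Hypothetical clip-filtering sketch, not the actual BootsTAP pipeline.
# Uses PySceneDetect's content-based detector to drop clips with hard cuts;
# paths and the threshold value are illustrative assumptions.
from pathlib import Path

from scenedetect import detect, ContentDetector


def has_cuts(video_path: str, threshold: float = 27.0) -> bool:
    """Return True if the content-based detector finds more than one scene."""
    scenes = detect(video_path, ContentDetector(threshold=threshold))
    # A single detected scene (or none) means no hard cuts were found.
    return len(scenes) > 1


def filter_clips(video_dir: str) -> list[str]:
    """Keep only clips with no detected cuts.

    A text/overlay filter (e.g. an OCR check on sampled frames) could be
    added here as a second predicate.
    """
    return [
        str(path)
        for path in Path(video_dir).glob("*.mp4")
        if not has_cuts(str(path))
    ]


if __name__ == "__main__":
    for clip in filter_clips("raw_clips/"):  # placeholder directory
        print(clip)
```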