showlab/EgoVLP

Two questions on EgoVLP and EPIC-Kitchens 100

vineetparikh opened this issue · 2 comments

Hi there, great work! I'm trying to use the video backbone of EgoVLP alone to extract intermediate feature maps (for a downstream task) on EPIC-Kitchens 100 videos. Two questions:

  • Is there any demo code for loading just the video weights and extracting embeddings, without dealing with text? I only have the videos to start with.
  • How is LOCAL_RANK set? When running python -m run.test_epic -r pretrained/egovlp_ek100_zs.pth -d 0, I find that LOCAL_RANK isn't actually set even though it's supposed to be. What parameters might I be missing? (The guide indicates I only need to run python run/test_epic, but that runs into package import problems.)
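
On the LOCAL_RANK question, one common cause (not confirmed for this repository) is that the variable is exported by a distributed launcher such as torchrun, or python -m torch.distributed.launch with --use_env, and not by a plain python invocation. Older launcher versions pass a --local_rank argument instead. A minimal sketch of a fallback that handles both cases and defaults to rank 0 for single-process runs is below; resolve_local_rank is a hypothetical helper, not part of EgoVLP.

```python
import argparse
import os


def resolve_local_rank(argv=None, env=None):
    """Return the local rank from a CLI flag, the environment, or 0.

    Hypothetical helper for illustration; it is not EgoVLP code.
    """
    env = os.environ if env is None else env

    # Older torch.distributed.launch versions pass --local_rank (newer ones
    # may pass --local-rank); all other arguments are ignored here.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--local_rank", "--local-rank", type=int, default=None)
    args, _unknown = parser.parse_known_args(argv)
    if args.local_rank is not None:
        return args.local_rank

    # torchrun exports LOCAL_RANK; a plain `python -m ...` run does not,
    # so fall back to 0 (single process).
    return int(env.get("LOCAL_RANK", 0))
```

For a quick single-GPU test, setting the variable by hand in the shell (LOCAL_RANK=0 before the python command) may also be enough, assuming the script only reads it from the environment.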

Edit: I replaced my original second question because I solved it by getting captions from https://github.com/mwray/Joint-Part-of-Speech-Embeddings

@vineetparikh did you find a solution to your questions? I have the same problem.

@vineetparikh If you are still figuring this out, I think keeping --subsample video will give only the video embeddings, with no need for the text.
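
For loading only the video weights, a rough sketch of filtering a checkpoint's state dict down to one branch follows. It assumes the video branch's keys start with "video_model." (optionally behind a DataParallel "module." prefix); that prefix is an assumption, so inspect the checkpoint's actual keys first. select_submodule_weights is a hypothetical helper, and the toy dict stands in for the real weights.

```python
def select_submodule_weights(state_dict, prefix="video_model."):
    """Keep only entries under `prefix`, with the prefix stripped.

    Hypothetical helper; the "video_model." prefix is an assumption about
    the checkpoint layout, not something confirmed in this thread.
    """
    selected = {}
    for key, value in state_dict.items():
        # DataParallel / DistributedDataParallel checkpoints prepend "module.".
        if key.startswith("module."):
            key = key[len("module."):]
        if key.startswith(prefix):
            selected[key[len(prefix):]] = value
    return selected


# Toy stand-in for a real checkpoint. With PyTorch one would typically start
# from something like torch.load(path, map_location="cpu")["state_dict"],
# where the "state_dict" key is also an assumption.
toy_checkpoint = {
    "module.video_model.patch_embed.weight": [0.1, 0.2],
    "module.text_model.embeddings.weight": [0.3],
    "module.vid_proj.weight": [0.4],
}
video_only = select_submodule_weights(toy_checkpoint)
print(sorted(video_only))
```

The filtered dict could then be passed to load_state_dict on a standalone video encoder with matching parameter names.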