facebookresearch/LaViLa
Code release for "Learning Video Representations from Large Language Models"
Python · MIT license
Issues
About Ego4d dataset
#37 opened by lwpyh - 1
Checkpoints may be repeatedly loaded.
#32 opened by yyvhang - 1
EGTEA reproduce
#30 opened by jong980812 - 3
What is the source of WIT video dataset?
#35 opened by rixejzvdl649 - 1
Checkpoint of the pre-trained dual-encoder.
#36 opened by AlbertHuyb - 2
Run locally on multiple GPUs
#29 opened by maximotus - 0
Base narrator model
#33 opened by sarisel - 3
Cannot use Huggingface demo
#31 opened by fgvfgfg564 - 1
About the demo
#17 opened by aa7784171 - 0
Extracting spatial feature maps from LaViLa
#27 opened by vineetparikh - 2
Preprocessing of Ego4D for pretraining
#24 opened by chuyishang - 1
The pretraining weights of TSF-B/L (visual only) on EPIC-KITCHENS-100 and EGTEA.
#18 opened by daiguangzhao - 3
Pretrained weight of HowTo100M
#13 opened by HYUNJS - 2
Training/fine-tuning the narrator
#23 opened by tobyperrett - 3
LaViLa as feature extractor
#22 opened by deepsurbhi8 - 2
Question about preprocessing Ego4D
#16 opened by hyojinie - 1
Normalization values for CLIP models
#15 opened by Jazzcharles - 2
Narrator Training
#12 opened by Flaick - 1
Reproducing zero-shot eval results on EK100-MIR
#11 opened by melongua - 1
Training Time
#3 opened by mmaaz60 - 9
Segmentation fault when launching demo_narrator [was: Keys remapping seems not to work]
#4 opened by amessina71 - 3
Resized Version of EK100
#7 opened by SJTUwxz - 1
Training narrations for downloading
#5 opened by melongua - 4
Add models/demo to Hugging Face Hub
#1 opened by nateraw - 1
Git clone failing in Colab
#2 opened by nateraw