Issues
Non-Configurable GPU Count via Arguments
#47 opened by willyfh - 6
How to only input text feature or video feature
#40 opened by tingchihc - 2
Zero score (every output is None) on evaluation captioning with pretrained model
#46 opened by Borntowarn - 12
Weights from pretrained model not used in UniVL in evaluation: visual_pytorch_model.bin, cross_pytorch_model.bin, and decoder_pytorch_model.bin are missing from visual-base, cross-base, and decoder-base
#5 opened by lokeaichirou - 3
end-to-end video file captioning process
#36 opened by mhyeonsoo - 1
Issues about Freezing some additional layers instead of meanP in CLIP4Clip
#43 opened by celestialxevermore - 1
Estimate of zero-shot performance
#45 opened by bpiyush - 0
Error message (torch.distributed.elastic.multiprocessing.errors.ChildFailedError:)
#44 opened by tingchihc - 1
Is there a code for Finetune on CMU-MOSI here?
#41 opened by sen0902 - 2
video only test for youcook
#39 opened by mhyeonsoo - 4
How can I create my video feature pickle
#38 opened by tingchihc - 6
feature & data shape
#37 opened by mhyeonsoo - 3
Unable to run video captioning code
#34 opened by Davidyao99 - 3
Can you share your HowTo100M.csv file?
#30 opened by ShinJQ - 6
caption using features extracted from my raw video
#11 opened by dawnlh - 1
Run Without Distributed
#26 opened by Maddy12 - 1
What does 'step_size=5' mean in modeling.py?
#23 opened by saicoco - 1
CrossTask and COIN dataset code
#22 opened by TXH-mercury - 1
Joint loss in pretraining
#21 opened by zhangliang-04 - 1
About auto mixed precision training
#19 opened by zhangliang-04 - 1
Where does the visual token come from?
#20 opened by renmada - 2
Hyper-parameter in pretraining
#18 opened by zhangliang-04 - 1
About msrvtt retrieval results
#17 opened by zhangliang-04 - 2
The BERT used by UniVL is very different from Hugging Face's (or PyTorch's) BERT
#15 opened by butterluo - 7
What's the role of the parameter coef_lr?
#14 opened by forence - 10
About multi-gpu loss calculation
#13 opened by forence - 8
How should I set the value in youcookii_videos_features.pickle when fine-tuning with single transcript as input?
#9 opened by lokeaichirou - 8
Are the provided weights based on pre-training on the HowTo100M dataset?
#7 opened by lokeaichirou - 3
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
#6 opened by lokeaichirou - 1
When will you release the code of 'Action Step Localization' and 'Action Segmentation' tasks?
#3 opened by butterluo - 1
When will you release the pre-trained model?
#2 opened by menggehe - 3
Expected data format?
#1 opened by mckinziebrandon