microsoft/UniVL

An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"

Python · MIT

Issues

Non-Configurable GPU Count via Arguments
#47 opened a year ago by willyfh
0
TypeError: bad operand type for unary -: 'list'
#27 opened 3 years ago by jxrloveyou
6
How to only input text feature or video feature
#40 opened 2 years ago by tingchihc
2
where to get transcript to generate youcookii_data.pickle
#35 opened 3 years ago by zhaoying9105
2
Zero score (every output is None) on evaluation captioning with pretrained model
#46 opened 2 years ago by Borntowarn
1
Weights from pretrained model not used in UniVL in evaluation. In EVALUATION, there is lack of visual_pytorch_model.bin, cross_pytorch_model.bin, decoder_pytorch_model.bin in visual-base, cross-base, decoder-base
#5 opened 4 years ago by lokeaichirou
12
end-to-end video file captioning process
#36 opened 3 years ago by mhyeonsoo
3
This repo is missing important files
#33 opened 2 years ago by microsoft-github-policy-service
1
Issues about Freezing some additional layers instead of meanP in CLIP4Clip
#43 opened 2 years ago by celestialxevermore
2
Estimate of zero-shot performance
#45 opened 2 years ago by bpiyush
1
Error message (torch.distributed.elastic.multiprocessing.errors.ChildFailedError:)
#44 opened 2 years ago by tingchihc
0
Is there a code for Finetune on CMU-MOSI here?
#41 opened 2 years ago by sen0902
1
video only test for youcook
#39 opened 2 years ago by mhyeonsoo
2
How can I create my video feature pickle
#38 opened 2 years ago by tingchihc
4
feature & data shape
#37 opened 3 years ago by mhyeonsoo
6
Unable to run video captioning code
#34 opened 3 years ago by Davidyao99
3
Can you share your HowTo100M.csv file?
#30 opened 3 years ago by ShinJQ
3
caption using features extracted from my raw video
#11 opened 4 years ago by dawnlh
6
Pre-training acceleration using multi-machine distributed training
#29 opened 3 years ago by mingtan2
1
Run Without Distributed
#26 opened 3 years ago by Maddy12
3
How to run captioning task on my own video datasets?
#28 opened 3 years ago by Kevinkaiyan
1
How to fine-tune with additional layers before UniVL?
#25 opened 3 years ago by CrystalSixone
2
Questions on retrieval result and "Info: Weight doesn't exsits"
#24 opened 3 years ago by HenryHZY
4
What's mean of the 'step_size=5' in modeling.py
#23 opened 3 years ago by saicoco
2
CrossTask and COIN dataset code
#22 opened 3 years ago by TXH-mercury
1
Joint loss in pretraining
#21 opened 3 years ago by zhangliang-04
1
About auto mixed precision training
#19 opened 3 years ago by zhangliang-04
1
How does the visual token come from?
#20 opened 3 years ago by renmada
1
Hyper-parameter in pretraining
#18 opened 3 years ago by zhangliang-04
2
About msrvtt retrieval results
#17 opened 3 years ago by zhangliang-04
1
Captioning task clarification: video vs. video+text for captioning task
#16 opened 3 years ago by tchang1997
2
The bert univl using is very different from Huggingfaces' (or pytorch's) bert
#15 opened 3 years ago by butterluo
0
The program hangs when runs into parallel_apply() function in util.py
#12 opened 3 years ago by butterluo
7
What's the role of the parameter coef_lr?
#14 opened 4 years ago by forence
1
About multi-gpu loss calculation
#13 opened 4 years ago by forence
10
caption my own video with provided pretrained model
#10 opened 4 years ago by dawnlh
8
How should I set the value in youcookii_videos_features.pickle when fine-tuning with single transcript as input?
#9 opened 4 years ago by lokeaichirou
2
Why is the fine-tuning performance much lower than benchmark in paper?
#8 opened 4 years ago by lokeaichirou
8
Is the provided weights based on the pre-trained work on Howto100M dataset?
#7 opened 4 years ago by lokeaichirou
2
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
#6 opened 4 years ago by lokeaichirou
3
CLip
#4 opened 4 years ago by johnbager
1
When will you release the code of 'Action Step Localization' and 'Action Segmentation' tasks?
#3 opened 4 years ago by butterluo
1
When will you release the pre-trained model?
#2 opened 4 years ago by menggehe
1
Expected data format?
#1 opened 4 years ago by mckinziebrandon
3