TXH-mercury/VALOR
Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
Python · MIT
Issues
download manage and control from youtube
#26 opened by frankymaxyyy - 4
Code to perform QA task
#25 opened by Dewmi24 - 6
Inference code
#19 opened by abhimanyu891998 - 6
A question about the optimizer:
#18 opened by HrealcodeH - 1
Providing all versions of pretrained weights
#21 opened by YingtianDt - 1
Questions about how to calculate metrics
#22 opened by aTunass - 5
Pre-training Data Release
#3 opened by vateye - 3
TypeError: __init__() missing 2 required positional arguments: 'stdout' and 'stderr'
#24 opened by cs-wangfeng - 1
Comparison between SoTA methods
#2 opened by MAGAer13 - 7
Inference Code
#7 opened by isjwdu - 0
RuntimeError: CUDA error: no kernel image is available for execution on the device
#20 opened by xibian1120 - 8
Plan to release finetuned models?
#11 opened by yt2639 - 1
Different Results on msrvtt-1kA
#17 opened by YasmineXXX - 2
Strange error, but it works normally
#9 opened by zsw111-zzz - 2
"Output file #0 does not contain any stream"
#10 opened by zsw111-zzz - 1
About prerequisite
#6 opened by isjwdu - 2