Pinned Repositories
audiocaps
🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps
UniVL-DR
[ICLR 2023] This is the code repo for our ICLR'23 paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Space for Multi-Modal Retrieval".
SPICE
Semantic Propositional Image Caption Evaluation
HTS-Audio-Transformer
The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"
AudioFile
Large-Audio-Models
Keeps track of large models in the audio domain, including speech, singing, music, etc.
VLN_Pretraining
Contrastive Language-Image Pretraining
wangqian621.github.io
WavCaps
This repository contains metadata for the WavCaps dataset and code for downstream tasks.
wangqian621's Repositories
wangqian621/AudioFile
wangqian621/Large-Audio-Models
Keeps track of large models in the audio domain, including speech, singing, music, etc.
wangqian621/VLN_Pretraining
Contrastive Language-Image Pretraining
wangqian621/wangqian621.github.io
wangqian621/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
wangqian621/AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
wangqian621/X-ACE