catherine-qian's Stars
lucidrains/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
UKPLab/sentence-transformers
State-of-the-Art Text Embeddings
aleju/imgaug
Image augmentation for machine learning experiments.
jacobgil/pytorch-grad-cam
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
bryandlee/animegan2-pytorch
PyTorch implementation of AnimeGANv2
EvelynFan/FaceFormer
[CVPR 2022] FaceFormer: Speech-Driven 3D Facial Animation with Transformers
auspicious3000/SpeechSplit
Unsupervised Speech Decomposition Via Triple Information Bottleneck
ChanganVR/awesome-embodied-vision
Reading list for research topics in embodied vision
fgnt/nara_wpe
Different implementations of "Weighted Prediction Error" for speech dereverberation
CrisHY1995/headnerf
This repository contains a PyTorch implementation of "HeadNeRF: A Real-time NeRF-based Parametric Head Model (CVPR 2022)".
okankop/vidaug
Effective Video Augmentation Techniques for Training Convolutional Neural Networks
facebookresearch/sound-spaces
A first-of-its-kind acoustic simulation platform for audio-visual embodied AI research. It supports training and evaluating multiple tasks and applications.
haoheliu/voicefixer_main
General Speech Restoration
qiuqiangkong/panns_inference
balavenkatesh3322/audio-pretrained-model
A collection of Audio and Speech pre-trained models.
rhgao/ObjectFolder
ObjectFolder Dataset
penghu-cs/DSCMR
Deep Supervised Cross-modal Retrieval (CVPR 2019, PyTorch Code)
YuanGongND/psla
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
jfsantos/SRMRpy
Python implementation of the SRMR toolbox
pedro-morgado/spatialaudiogen
Spatial Audio Generation
zehuachenImperial/SkipConvNet
Speech Dereverberation using Fully Convolutional Networks
nikhilsinghmus/image2reverb
[ICCV 2021] Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis.
DTaoo/Multimodal-Aerial-Scene-Recognition
Code for "Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition" (ECCV 2020)
ExplainableML/AVCA-GZSL
This repository contains the code for our CVPR 2022 paper on "Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and Language"
adobe-research/deep-acoustic-analysis
ariacat3366/ACVAE-VC
yyf17/SAAVN
SAAVN: code release for the paper "Sound Adversarial Audio-Visual Navigation" (ICLR 2022), in PyTorch
gopala-kr/autoencoders
Implementations of various types of autoencoders in TensorFlow (in progress)
nikhilsinghmus/speech2image
Attempting speech-driven image synthesis.