catherine-qian

catherine-qian's Stars

lucidrains/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
Language: Python 19.6k 147 2623k
UKPLab/sentence-transformers
State-of-the-Art Text Embeddings
Language: Python 14.8k 140 2.1k 2.4k
aleju/imgaug
Image augmentation for machine learning experiments.
Language: Python 14.3k 230 5152.4k
jacobgil/pytorch-grad-cam
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
Language: Python 10.2k 44 4081.5k
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
Language: Python 8.6k 132 1.1k 1.4k
bryandlee/animegan2-pytorch
PyTorch implementation of AnimeGANv2
Language: Jupyter Notebook 4.4k 60 56641
EvelynFan/FaceFormer
[CVPR 2022] FaceFormer: Speech-Driven 3D Facial Animation with Transformers
Language: Python 782 15 101133
auspicious3000/SpeechSplit
Unsupervised Speech Decomposition Via Triple Information Bottleneck
Language: Python 636 23 7192
ChanganVR/awesome-embodied-vision
Reading list for research topics in embodied vision
495 15 165
fgnt/nara_wpe
Different implementations of "Weighted Prediction Error" for speech dereverberation
Language: Python 473 18 37165
CrisHY1995/headnerf
This repository contains a pytorch implementation of "HeadNeRF: A Real-time NeRF-based Parametric Head Model (CVPR 2022)".
Language: Python 425 18 2547
okankop/vidaug
Effective Video Augmentation Techniques for Training Convolutional Neural Networks
Language: Python 383 8 1978
facebookresearch/sound-spaces
A first-of-its-kind acoustic simulation platform for audio-visual embodied AI research. It supports training and evaluating multiple tasks and applications.
Language: Python 339 16 14155
haoheliu/voicefixer_main
General Speech Restoration
Language: Python 273 11 1854
qiuqiangkong/panns_inference
Language: Python 190 4 1529
balavenkatesh3322/audio-pretrained-model
A collection of Audio and Speech pre-trained models.
179 3 024
rhgao/ObjectFolder
ObjectFolder Dataset
Language: Python 148 6 510
penghu-cs/DSCMR
Deep Supervised Cross-modal Retrieval (CVPR 2019, PyTorch Code)
Language: Python 141 5 1226
YuanGongND/psla
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
Language: Python 137 1 1216
jfsantos/SRMRpy
Python implementation of the SRMR toolbox
Language: Python 112 3 840
pedro-morgado/spatialaudiogen
Spatial Audio Generation
Language: Python 97 6 219
zehuachenImperial/SkipConvNet
Speech Dereverberation using Fully Convolutional Networks
Language: Python 65 4 044
nikhilsinghmus/image2reverb
[ICCV 2021] Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis.
Language: Python 64 4 137
DTaoo/Multimodal-Aerial-Scene-Recognition
Code for <Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition> (ECCV 2020)
Language: Python 35 3 39
ExplainableML/AVCA-GZSL
This repository contains the code for our CVPR 2022 paper on "Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and Language"
Language: Python 33 4 21
adobe-research/deep-acoustic-analysis
Language: Python 26 5 39
ariacat3366/ACVAE-VC
Language: Jupyter Notebook 22 3 45
yyf17/SAAVN
SAAVN Code release for paper "Sound Adversarial Audio-Visual Navigation, ICLR2022" (In PyTorch)
Language: Python 16 2 20
gopala-kr/autoencoders
implementations of various types of auto-encoders in tensorflow (in progress)
9 5 04
nikhilsinghmus/speech2image
Attempting speech-driven image synthesis.
Language: Jupyter Notebook 3 0 00