Pinned Repositories
asteroid
The PyTorch-based audio source separation toolkit for researchers
av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
AVE-ECCV18
Audio-Visual Event Localization in Unconstrained Videos, ECCV 2018
avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
avspeech-downloader
AVSpeech downloader
awesome-speech-enhancement
speech enhancement/speech separation/sound source localization
beamformers
Easy to use Beamformers for multi-channel speech separation/enhancement
co-separation
Co-Separating Sounds of Visual Objects (ICCV 2019)
Conv-TasNet
DCASE2016-baseline-system-python
DCASE 2016 Baseline system, python implementation
YANGX1123's Repositories
YANGX1123/asteroid
The PyTorch-based audio source separation toolkit for researchers
YANGX1123/av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
YANGX1123/AVE-ECCV18
Audio-Visual Event Localization in Unconstrained Videos, ECCV 2018
YANGX1123/avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
YANGX1123/awesome-speech-enhancement
speech enhancement/speech separation/sound source localization
YANGX1123/beamformers
Easy to use Beamformers for multi-channel speech separation/enhancement
YANGX1123/co-separation
Co-Separating Sounds of Visual Objects (ICCV 2019)
YANGX1123/Conv-TasNet
YANGX1123/deep_avsr
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
YANGX1123/deep_lip_reading
Code and models for evaluating a state-of-the-art lip reading network
YANGX1123/Discriminative-Sounding-Objects-Localization
Code for Discriminative Sounding Objects Localization (NeurIPS 2020)
YANGX1123/Dive-into-DL-PyTorch
This project converts the MXNet implementations in the original book Dive into Deep Learning (《动手学深度学习》) to PyTorch implementations.
YANGX1123/facenet
Face recognition using TensorFlow
YANGX1123/Lipreading_using_Temporal_Convolutional_Networks
ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASSP'20 Lipreading using Temporal Convolutional Networks
YANGX1123/Localizing-Visual-Sounds-the-Hard-Way
Localizing Visual Sounds the Hard Way
YANGX1123/Looking-to-Listen-at-the-Cocktail-Party
Executable code based on Google articles
YANGX1123/Multi-Source-Sound-Localization
This repo aims to perform sound localization in complex audiovisual scenes, where there are multiple objects making sounds.
YANGX1123/MuSE
YANGX1123/new-pac
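Internet censorship circumvention software and methods, one-click circumvention browsers, free shadowsocks/ss/ssr/v2ray/goflyway account and node sharing, one-click VPS setup scripts and tutorials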
YANGX1123/openMHA
The open Master Hearing Aid (openMHA)
YANGX1123/pytorch_xvectors
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
YANGX1123/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit.
YANGX1123/speech_separation
Includes some core functions and models for speech separation
YANGX1123/speechbrain
A PyTorch-based Speech Toolkit
YANGX1123/v2ray
The easiest-to-use V2Ray one-click installation and management script
YANGX1123/VGGSound
VGGSound: A Large-scale Audio-Visual Dataset
YANGX1123/VisualVoice
Audio-Visual Speech Separation with Cross-Modal Consistency
YANGX1123/voicefilter
Unofficial PyTorch implementation of Google AI's VoiceFilter system
YANGX1123/voxceleb_trainer
In defence of metric learning for speaker recognition
YANGX1123/x-vector-pytorch
Implementation of the paper "Spoken Language Recognition using X-vectors" in PyTorch