Pinned Repositories
AIR-Bench
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension
ALERT
Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming"
allosaurus
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
arctic_shift
Making Reddit data accessible to researchers, moderators and everyone else. Interact with the data through large dumps, an API or web interface.
asr2k
asr2k
audio-dataset
Audio Dataset for training CLAP and other models
audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
audio-flamingo
PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.
audio-slicer
A simple GUI application that slices audio with silence detection
GuangkeChen's Repositories
GuangkeChen/Toxicity-Detection-in-Spoken-Utterances
This repository contains the code for the paper: "DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances"
GuangkeChen/Awesome-Singing-Voice-Synthesis-and-Singing-Voice-Conversion
A paper and project list about cutting-edge Speech Synthesis, Text-to-Speech (TTS), Singing Voice Synthesis (SVS), Voice Conversion (VC), Singing Voice Conversion (SVC), and related interesting works (such as Music Synthesis, Automatic Music Transcription, Automatic MOS Prediction, SSL-based ASR, etc.).
GuangkeChen/asr2k
asr2k
GuangkeChen/syncnet_python
Out of time: automated lip sync in the wild
GuangkeChen/AuxiliaryASR
Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)
GuangkeChen/SpeakerGuard
A PyTorch library for security research on speaker recognition, released in "Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition"
GuangkeChen/Emotional-Speech-Data
This is the GitHub page for publicly available emotional speech data.
GuangkeChen/REAPER
GuangkeChen/MyBeamer
A blue beamer theme
GuangkeChen/stargan
StarGAN - Official PyTorch Implementation (CVPR 2018)
GuangkeChen/EA-SVC
An implementation of "Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training"
GuangkeChen/TCDTIMITprocessing
Processing and extraction of face and mouth image files from the TCDTIMIT database
GuangkeChen/Singing-Voice-Conversion-with-conditional-VAW-GAN
This is the implementation of the paper "VAW-GAN for Singing Voice Conversion with Non-parallel Training Data".
GuangkeChen/Singing-Voice-Conversion-JP
2019/04~2019/09 Tobigs (투빅스) Singing Voice Conversion
GuangkeChen/crubadan
Scripts and data for the Crúbadán web crawler: http://crubadan.org/
GuangkeChen/FlipBeamerTheme
Flip's Beamer Template (for presentations in LaTeX)