Pinned Repositories
AccentedSpeechRecognition
Experiments on speech recognition robustness to accents and dialects
asteroid
The PyTorch-based audio source separation toolkit for researchers
asv-subtools
Open-Source Tools for Speaker Recognition
attention_keras
Keras Layer implementation of Attention for Sequential models
AttentionIsOFFByOne
Implementation of "Attention Is Off By One" by Evan Miller
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Audiomer-PyTorch
A Convolutional Transformer for Keyword Spotting
AudioTagger
Deep Learning Neural Networks Final Project
Qifusion-net
The net module of Qifusion-Net: Layer-adapted Stream/Non-stream Model for End-to-End Multi-Accent Speech Recognition
JinmingChe's Repositories
JinmingChe/Qifusion-net
The net module of Qifusion-Net: Layer-adapted Stream/Non-stream Model for End-to-End Multi-Accent Speech Recognition
JinmingChe/attention_keras
Keras Layer implementation of Attention for Sequential models
JinmingChe/AttentionIsOFFByOne
Implementation of "Attention Is Off By One" by Evan Miller
JinmingChe/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
JinmingChe/auto_avsr
Auto-AVSR: Lip-Reading Sentences Project
JinmingChe/chinese_speech_pretrain
Chinese speech pretrained models
JinmingChe/CIF-HieraDist
[INTERSPEECH 2023] Knowledge Transfer from Pre-trained Language Models to Cif-based Recognizers via Hierarchical Distillation
JinmingChe/ColossalAI
Making big AI models cheaper, easier, and scalable
JinmingChe/Comprehensive-Transformer-TTS
A Non-Autoregressive Transformer-based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS
JinmingChe/DARCN
The implementation of "A Recursive Network with Dynamic Attention for Monaural Speech Enhancement"
JinmingChe/dparn
JinmingChe/FFmpeg
Mirror of https://git.ffmpeg.org/ffmpeg.git
JinmingChe/FunASR
A Fundamental End-to-End Speech Recognition Toolkit
JinmingChe/generative-ai-roadmap
The roadmap of generative AI: use cases and applications
JinmingChe/GenericTools
I put here tools that I use in different projects all the time, so I have them all centralized
JinmingChe/jieba
Jieba Chinese word segmentation
JinmingChe/Leveraging-Self-Supervised-Learning-for-AVSR
Official PyTorch implementation of the paper "Leveraging Unimodal Self Supervised Learning for Multimodal Audio-Visual Speech Recognition"
JinmingChe/LPCNet
Efficient neural speech synthesis
JinmingChe/MTFAA-Net
Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement
JinmingChe/PerceptualAudio
Perceptual Metrics of Audio - perceptually relevant loss function. DPAM and CDPAM
JinmingChe/pytorch-metric-learning
The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
JinmingChe/so-vits-svc
SoftVC VITS Singing Voice Conversion
JinmingChe/so-vits-svc-5.0
Core Engine of Singing Voice Conversion & Singing Voice Clone
JinmingChe/sound-separation
JinmingChe/TIM-Net_SER
[ICASSP 2023] Official TensorFlow implementation of "Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition".
JinmingChe/VITS-fast-fine-tuning
This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion
JinmingChe/vits_chinese_0829
Best-practice TTS based on BERT and VITS, with some NaturalSpeech features from Microsoft; supports streaming output!
JinmingChe/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
JinmingChe/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
JinmingChe/Whisper-Finetune
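Fine-tune the Whisper speech recognition model, with support for training on data without timestamps, data with timestamps, and data without speech. Accelerated inference, with support for Web deployment, Windows desktop deployment, and Android deployment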