Pinned Repositories
3D-convolutional-speaker-recognition
A-Convolutional-Recurrent-Neural-Network-for-Real-Time-Speech-Enhancement
A minimal unofficial implementation of "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN) using PyTorch
adaptive-softmax-pytorch
Adaptive Softmax implementation for PyTorch
AdaptiveSoftmax
This is an implementation of Adaptive Softmax with PyTorch.
AEC-Challenge
AEC Challenge
AI-Audio-Datasets-List
This is a dataset of speech, music and sound effects that can provide training data for AIGC, AI model training, intelligent audio tool development, and audio applications. The audio dataset is mainly used in speech recognition, speech synthesis, singing voice synthesis, music information retrieval, music generation, sound synthesis, etc.
articulated-animation
Code for Motion Representations for Articulated Animation paper
asteroid
The PyTorch-based audio source separation toolkit for researchers
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
lizezheng's Repositories
lizezheng/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
lizezheng/AI-Audio-Datasets-List
This is a dataset of speech, music and sound effects that can provide training data for AIGC, AI model training, intelligent audio tool development, and audio applications. The audio dataset is mainly used in speech recognition, speech synthesis, singing voice synthesis, music information retrieval, music generation, sound synthesis, etc.
lizezheng/articulated-animation
Code for Motion Representations for Articulated Animation paper
lizezheng/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
lizezheng/AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
lizezheng/AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
lizezheng/audioldm_eval
This toolbox aims to unify audio generation model evaluation for easier comparison.
lizezheng/awesome-chatgpt-prompts-zh
A Chinese-language guide to prompting ChatGPT. Usage guides for a variety of scenarios. Learn how to make it do what you say.
lizezheng/awesome-digital-human
A collection of resources on digital human including clothed people digitalization, virtual try-on, and other related directions.
lizezheng/Awesome-LLM
Awesome-LLM: a curated list of Large Language Models
lizezheng/awesome-speech-enhancement
speech enhancement / speech separation / sound source localization
lizezheng/diff-svc
Singing Voice Conversion via diffusion model
lizezheng/encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
lizezheng/first-order-model
This repository contains the source code for the paper First Order Motion Model for Image Animation
lizezheng/layerwise-analysis
Layer-wise analysis of self-supervised pre-trained speech representations
lizezheng/lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
lizezheng/llama.cpp
Port of Facebook's LLaMA model in C/C++
lizezheng/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
lizezheng/PaddleBoBo
A virtual streamer built on PaddlePaddle
lizezheng/RapidASR
A cross-platform implementation of ASR inference, based on ONNXRuntime and FunASR. It provides a set of easy-to-use APIs for calling ASR models.
lizezheng/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector, Language Classifier and Spoken Number Detector
lizezheng/SkyChat-Chinese-Chatbot-GPT3
SkyChat is a chatbot project based on a Chinese GPT-3 API. Like ChatGPT, it supports human-machine chat and question answering, and can also complete tasks such as Chinese-English and English-Chinese translation, content continuation, couplet writing, and classical Chinese poetry writing.
lizezheng/SpeechT5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
lizezheng/StarGANv2-VC
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
lizezheng/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
lizezheng/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
lizezheng/vits_chinese
Best TTS based on BERT and VITS, with some Natural Speech features from Microsoft; also supports voice cloning!
lizezheng/WavCaps
This repository contains metadata for the WavCaps dataset and code for downstream tasks.
lizezheng/whisper.cpp
Port of OpenAI's Whisper model in C/C++
lizezheng/youtube-dl
Command-line program to download videos from YouTube.com and other video sites