Pinned Repositories
3D-convolutional-speaker-recognition
🔈 Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
ae-wavenet
Wavenet Autoencoder for Unsupervised speech representation learning (after Chorowski, Jan 2019)
akshare
AkShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
ASR
athena-signal
audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
audio_to_midi_melodia
Extract the melody from an audio file and export to MIDI
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
audiowmark
Audio Watermarking
awesome-deep-learning-music
List of articles related to deep learning applied to music
lbxcfx's Repositories
lbxcfx/ae-wavenet
Wavenet Autoencoder for Unsupervised speech representation learning (after Chorowski, Jan 2019)
lbxcfx/awesome-speech-enhancement
speech enhancement / speech separation / sound source localization
lbxcfx/Chinese-Word-Vectors
100+ Chinese Word Vectors (over a hundred sets of pre-trained Chinese word vectors)
lbxcfx/deep-speaker
Deep Speaker: an End-to-End Neural Speaker Embedding System.
lbxcfx/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
lbxcfx/DNS-Challenge
This repo contains the scripts, models and required files for the Interspeech 2020 Deep Noise Suppression (DNS) Challenge. We are open sourcing clean speech and noise files as well. Participants of this challenge will use the scripts from this repo to create data to train their noise suppressors. They will compare their method with our baseline noise suppressor and report the results.
lbxcfx/espnet
End-to-End Speech Processing Toolkit
lbxcfx/Face2FaceTranslator
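Face2FaceTranslator is a streaming speech translation mini program developed by the WeChat team for face-to-face conversations. It provides speech recognition, text translation, and related features through the WeChat simultaneous interpretation plugin.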
lbxcfx/gated-graph-neural-network-samples
Sample Code for Gated Graph Neural Networks
lbxcfx/gpuRIR
Python library for Room Impulse Response (RIR) simulation with GPU acceleration
lbxcfx/lip-reading-deeplearning
🔓 Lip Reading - Cross Audio-Visual Recognition using 3D Architectures
lbxcfx/magenta
Magenta: Music and Art Generation with Machine Intelligence
lbxcfx/make-a-smart-speaker
A collection of resources to make a smart speaker
lbxcfx/MASS
MASS: Masked Sequence to Sequence Pre-training for Language Generation
lbxcfx/meep
free finite-difference time-domain (FDTD) software for electromagnetic simulations
lbxcfx/nnmnkwii
Library to build speech synthesis systems designed for easy and fast prototyping.
lbxcfx/pase
Problem Agnostic Speech Encoder
lbxcfx/porcupine
On-device wake word detection powered by deep learning.
lbxcfx/pyAudioAnalysis
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
lbxcfx/RawNet
Reproducing RawNet paper with Keras and additional experiments with PyTorch.
lbxcfx/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
lbxcfx/Resemblyzer
A Python package to analyze and compare voices with deep learning
lbxcfx/rsrgan
Robust Speech Recognition Using Generative Adversarial Networks (GAN)
lbxcfx/segan_pytorch
Speech Enhancement Generative Adversarial Network in PyTorch
lbxcfx/sha-rnn
Single Headed Attention RNN - "Stop thinking with your head"
lbxcfx/Tacotron-2-Chinese
Chinese speech synthesis, adapted from https://github.com/Rayhane-mamah/Tacotron-2 and https://github.com/begeekmyfriend/Tacotron-2
lbxcfx/TensorflowTTS
😝 TensorflowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2
lbxcfx/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.
lbxcfx/vcc20_baseline_cyclevae
Voice Conversion Challenge 2020 CycleVAE baseline system
lbxcfx/VGG-Speaker-Recognition
Utterance-level Aggregation For Speaker Recognition In The Wild