Pinned Repositories
AByteOfNLP
Some code for an NLP tour
AlignmentServer
API for aligning singing voice to lyrics, as used in www.voicemagix.com. The core machine learning algorithms are MLP neural networks and hidden Markov models. Based on Django REST Framework.
Automatic_Speech_Recognition
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
awesome-music-informatics
A curated list of awesome articles, tutorials, libraries, webpages, etc.
Codec-SUPERB
Audio Codec Speech processing Universal PERformance Benchmark
DL-AFx
Deep Learning for Black-Box Modeling of Audio Effects - website:
FastImageProcessing
Fast Image Processing with Fully-Convolutional Networks
GPUImage
An open source iOS framework for GPU-based image and video processing
marytts
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java
merlin
This is now the official location of the Merlin project.
xzm2004260's Repositories
xzm2004260/Adversarial-Many-to-Many-VC
[InterSpeech 2020] "Improving the Speaker Identity of Non-Parallel Many-to-Many Voice Conversion with Adversarial Speaker Recognition" by Shaojin Ding, Guanlong Zhao, Ricardo Gutierrez-Osuna
xzm2004260/audio-pretrained-model
A collection of Audio and Speech pre-trained models.
xzm2004260/AudioCodingTutorials
Audio Coding Notebooks and Tutorials
xzm2004260/control-synthesis
xzm2004260/Crystal
Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.
xzm2004260/CycleGAN-VC2
Voice Conversion by CycleGAN (voice cloning / voice conversion)
xzm2004260/DE-LIMIT
DeEpLearning models for MultIlingual haTespeech
xzm2004260/distiller
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
xzm2004260/DNN-HSMM
PyTorch implementation of DNN-HSMM for TTS
xzm2004260/DTLN
TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-Lite, ONNX, and real-time audio processing support.
xzm2004260/EA-SVC
An implementation of "Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training"
xzm2004260/Forward
A library for high-performance deep learning inference on NVIDIA GPUs.
xzm2004260/FullSubNet
PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
xzm2004260/haste
Haste: a fast, simple, and open RNN library
xzm2004260/Human-Video-Generation
Human Video Generation Paper List
xzm2004260/kfr
Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)
xzm2004260/mandarin-tts
Mandarin text-to-speech (TTS) synthesis, based on FastSpeech2
xzm2004260/NeRViS
Neural Re-rendering for Full-frame Video Stabilization
xzm2004260/python_source_separation
Source code for the book "Learning Sound Source Separation with Python" (Pythonで学ぶ音源分離)
xzm2004260/pytorch-cpp
C++ Implementation of PyTorch Tutorials for Everyone
xzm2004260/Speech-Separation-Paper-Tutorial
Must-read papers on speech separation based on neural networks
xzm2004260/speech-synthesis-paper
List of speech synthesis papers.
xzm2004260/StarGAN-Voice-Conversion-2
A PyTorch implementation of StarGAN-VC2
xzm2004260/SVS_system
A system for singing voice synthesis
xzm2004260/TaiwaneseTTS
xzm2004260/traditional-speech-enhancement
Traditional methods for speech enhancement
xzm2004260/TransformerTTS
🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.
xzm2004260/TTS-1
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
xzm2004260/TTS-frontend
TTS frontend with BERT and CRF/LSTM (for Tacotron)
xzm2004260/ultimateALPR-SDK
World's fastest ANPR / ALPR implementation for CPUs, GPUs, VPUs and FPGAs using deep learning (TensorFlow, TensorFlow Lite, TensorRT & OpenVINO). Multi-OS (NVIDIA Jetson, Android, Raspberry Pi, Linux, Windows) and Multi-Arch (ARM, x86).