Pinned Repositories
AByteOfNLP
some code for nlp tour
AlignmentServer
API for aligning singing voice to lyrics, as used at www.voicemagix.com. The core machine learning algorithms are MLP neural networks and hidden Markov models. Built on Django REST Framework.
Automatic_Speech_Recognition
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
awesome-music-informatics
A curated list of awesome articles, tutorials, libraries, webpages, etc.
Codec-SUPERB
Audio Codec Speech processing Universal PERformance Benchmark
DNS-Challenge
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
FastImageProcessing
Fast Image Processing with Fully-Convolutional Networks
GPUImage
An open source iOS framework for GPU-based image and video processing
marytts
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java
merlin
This is now the official location of the Merlin project.
xzm2004260's Repositories
xzm2004260/NeRViS
Neural Re-rendering for Full-frame Video Stabilization
xzm2004260/ultimateALPR-SDK
World's fastest ANPR / ALPR implementation for CPUs, GPUs, VPUs and FPGAs using deep learning (TensorFlow, TensorFlow Lite, TensorRT & OpenVINO). Multi-OS (NVIDIA Jetson, Android, Raspberry Pi, Linux, Windows) and Multi-Arch (ARM, x86).
xzm2004260/FullSubNet
PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
xzm2004260/Human-Video-Generation
Human Video Generation Paper List
xzm2004260/TransformerTTS
🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.
xzm2004260/AudioCodingTutorials
Audio Coding Notebooks and Tutorials
xzm2004260/TaiwaneseTTS
xzm2004260/Speech-Separation-Paper-Tutorial
Must-read papers on speech separation based on neural networks
xzm2004260/Adversarial-Many-to-Many-VC
[InterSpeech 2020] "Improving the Speaker Identity of Non-Parallel Many-to-Many Voice Conversion with Adversarial Speaker Recognition" by Shaojin Ding, Guanlong Zhao, Ricardo Gutierrez-Osuna
xzm2004260/DTLN
TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-lite, ONNX and real-time audio processing support.
xzm2004260/EA-SVC
An implementation of "Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training"
xzm2004260/distiller
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
xzm2004260/StarGAN-Voice-Conversion-2
A PyTorch implementation of StarGAN-VC2
xzm2004260/speech-synthesis-paper
List of speech synthesis papers.
xzm2004260/pytorch-cpp
C++ Implementation of PyTorch Tutorials for Everyone
xzm2004260/CycleGAN-VC2
Voice Conversion by CycleGAN (voice cloning / voice conversion)
xzm2004260/control-synthesis
xzm2004260/python_source_separation
Source code for the book 「Pythonで学ぶ音源分離」 (Learning Sound Source Separation with Python)
xzm2004260/Crystal
Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.
xzm2004260/SVS_system
A system for singing voice synthesis
xzm2004260/DE-LIMIT
DeEpLearning models for MultIlingual haTespeech
xzm2004260/audio-pretrained-model
A collection of Audio and Speech pre-trained models.
xzm2004260/haste
Haste: a fast, simple, and open RNN library
xzm2004260/kfr
Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)
xzm2004260/TTS-frontend
TTS-frontend with BERT and CRF/LSTM (for Tacotron)
xzm2004260/crank
Non-parallel Voice Conversion
xzm2004260/score_lyrics_free_svg
Score- and Lyrics-Free Singing Voice Generation
xzm2004260/midiMe
Personalized MusicVAE
xzm2004260/Dancing2Music
xzm2004260/aimet
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.