xzm2004260

speech synthesis , TTS

Xiamen

Pinned Repositories

AByteOfNLP
Some code for an NLP tour
Language:Python0 2 00
AlignmentServer
API for alignment of singing voice to lyrics as used in www.voicemagix.com. Core machine learning algorithms are MLP neural networks and hidden Markov models. Based on Django REST Framework.
Language:Python1 2 00
Automatic_Speech_Recognition
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
Language:Python0 2 00
awesome-music-informatics
A curated list of awesome articles, tutorials, libraries, webpages, etc.
1 1 00
Codec-SUPERB
Audio Codec Speech processing Universal PERformance Benchmark
Language:Python1 0 00
DL-AFx
Deep Learning for Black-Box Modeling of Audio Effects - website:
Language:Python12
FastImageProcessing
Fast Image Processing with Fully-Convolutional Networks
Language:Python1 2 00
GPUImage
An open source iOS framework for GPU-based image and video processing
Language:Objective-C0 2 00
marytts
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java
Language:Java0 2 00
merlin
This is now the official location of the Merlin project.
Language:Python0 2 00

xzm2004260's Repositories

xzm2004260/Adversarial-Many-to-Many-VC
[InterSpeech 2020] "Improving the Speaker Identity of Non-Parallel Many-to-Many Voice Conversion with Adversarial Speaker Recognition" by Shaojin Ding, Guanlong Zhao, Ricardo Gutierrez-Osuna
Language:Python1 0
xzm2004260/audio-pretrained-model
A collection of Audio and Speech pre-trained models.
1 0
xzm2004260/AudioCodingTutorials
Audio Coding Notebooks and Tutorials
Language:Jupyter Notebook1 0
xzm2004260/control-synthesis
Language:Python1 0
xzm2004260/Crystal
Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.
Language:C++1 0
xzm2004260/CycleGAN-VC2
Voice Conversion by CycleGAN (voice cloning / voice conversion)
Language:Python1 0
xzm2004260/DE-LIMIT
DeEpLearning models for MultIlingual haTespeech
Language:Jupyter Notebook1 0
xzm2004260/distiller
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
Language:Jupyter Notebook1 0
xzm2004260/DNN-HSMM
PyTorch implementation of DNN-HSMM for TTS
xzm2004260/DTLN
TensorFlow 2.x implementation of the DTLN real-time speech denoising model. With TF-lite, ONNX and real-time audio processing support.
Language:Python0 0
xzm2004260/EA-SVC
An implementation of "Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training"
Language:Python1 0
xzm2004260/Forward
A library for high-performance deep learning inference on NVIDIA GPUs.
xzm2004260/FullSubNet
PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Language:Python1 0
xzm2004260/haste
Haste: a fast, simple, and open RNN library
Language:C++1 0
xzm2004260/Human-Video-Generation
Human Video Generation Paper List
1 0
xzm2004260/kfr
Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)
xzm2004260/mandarin-tts
Mandarin text-to-speech (TTS) synthesis, based on FastSpeech2
xzm2004260/NeRViS
Neural Re-rendering for Full-frame Video Stabilization
Language:Python1 0
xzm2004260/python_source_separation
Source code for the book 「Pythonで学ぶ音源分離」 ("Learning Sound Source Separation with Python")
Language:Python1 0
xzm2004260/pytorch-cpp
C++ Implementation of PyTorch Tutorials for Everyone
Language:C++1 0
xzm2004260/Speech-Separation-Paper-Tutorial
Must-read papers for speech separation based on neural networks
1 0
xzm2004260/speech-synthesis-paper
List of speech synthesis papers.
xzm2004260/StarGAN-Voice-Conversion-2
A PyTorch implementation of StarGAN-VC2
Language:Python1 0
xzm2004260/SVS_system
A system for singing voice synthesis
Language:Python1 0
xzm2004260/TaiwaneseTTS
Language:Python1 0
xzm2004260/traditional-speech-enhancement
Traditional speech enhancement methods
xzm2004260/TransformerTTS
🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.
Language:Python1 0
xzm2004260/TTS-1
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
xzm2004260/TTS-frontend
TTS-frontend with BERT and CRF/LSTM (For Tacotron)
Language:Python1 0
xzm2004260/ultimateALPR-SDK
World's fastest ANPR / ALPR implementation for CPUs, GPUs, VPUs and FPGAs using deep learning (TensorFlow, TensorFlow Lite, TensorRT & OpenVINO). Multi-OS (NVIDIA Jetson, Android, Raspberry Pi, Linux, Windows) and Multi-Arch (ARM, x86).
Language:C++1 0