Pinned Repositories
AByteOfNLP
some code for nlp tour
AlignmentServer
API for aligning singing voice to lyrics, as used on www.voicemagix.com. The core machine learning algorithms are MLP neural networks and hidden Markov models. Based on Django REST Framework.
Automatic_Speech_Recognition
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
awesome-music-informatics
A curated list of awesome articles, tutorials, libraries, webpages, etc.
Codec-SUPERB
Audio Codec Speech processing Universal PERformance Benchmark
DNS-Challenge
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
FastImageProcessing
Fast Image Processing with Fully-Convolutional Networks
GPUImage
An open source iOS framework for GPU-based image and video processing
marytts
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java
merlin
This is now the official location of the Merlin project.
xzm2004260's Repositories
xzm2004260/LINNE
(Beta) LInear-predictive Neural Net Encoder -- A lossless audio codec
xzm2004260/course
High-Performance Parallel Programming and Optimization - Course Slides
xzm2004260/survey
A Survey on Neural Speech Synthesis https://arxiv.org/pdf/2106.15561.pdf
xzm2004260/ai-research-code
xzm2004260/tensorRT_Pro
C++ library based on TensorRT integration
xzm2004260/hifi_gan_c
A purely header-only C version of HiFi-GAN
xzm2004260/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
xzm2004260/pytorch-vector-quantization
PyTorch implementations of various vector quantization methods
xzm2004260/madmom
Python audio and music signal processing library
xzm2004260/annotated_deep_learning_paper_implementations
🧑‍🏫 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit), optimizers (adam, radam, adabelief), gans (dcgan, cyclegan, stylegan2), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, etc. 🧠
xzm2004260/Speech-Backbones
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
xzm2004260/larynx
End-to-end text-to-speech system using gruut and ONNX
xzm2004260/MLPSinger
xzm2004260/pnf-sampling
xzm2004260/SpeechAlgorithms
Speech Algorithms Collections
xzm2004260/multilingual_VQVAE
xzm2004260/av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
xzm2004260/assem-vc
Official Repository for Assem-VC @ INTERSPEECH 2021 SUBMITTED
xzm2004260/ReSampler
High quality command-line audio sample rate converter
xzm2004260/mp3net
A convolutional generative audio synthesis model
xzm2004260/OpenCC
Conversion between Traditional and Simplified Chinese
xzm2004260/Deep-Learning-for-Audio-Super-Resolution
This is my master's degree thesis project in Data Science.
xzm2004260/malaya-speech
Speech Toolkit for bahasa Malaysia, https://malaya-speech.readthedocs.io/
xzm2004260/Forward
A library for high-performance deep learning inference on NVIDIA GPUs.
xzm2004260/TTS-1
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
xzm2004260/DNN-HSMM
PyTorch implementation of DNN-HSMM for TTS
xzm2004260/traditional-speech-enhancement
Traditional methods for speech enhancement
xzm2004260/mandarin-tts
Mandarin text-to-speech (TTS), based on FastSpeech2
xzm2004260/NeRViS
Neural Re-rendering for Full-frame Video Stabilization
xzm2004260/ultimateALPR-SDK
World's fastest ANPR / ALPR implementation for CPUs, GPUs, VPUs and FPGAs using deep learning (TensorFlow, TensorFlow Lite, TensorRT & OpenVINO). Multi-OS (NVIDIA Jetson, Android, Raspberry Pi, Linux, Windows) and Multi-Arch (ARM, x86).