Pinned Repositories
AD-NeRF
This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".
api-ai-english-asr-model
Api.ai English Speech Recognition (ASR) Model for Kaldi
asr-server
FastCGI support for Kaldi ASR
Automatic-Prosody-Annotation
Awesome-Chatbot
Awesome Chatbot Projects, Corpus, Papers, Tutorials.
azure-docs
Open source documentation of Microsoft Azure
bark
🚀 BARK INFINITY 🎶 Power Up The Bark Text-prompted Generative Audio Model
Boost-for-Android
Android port of Boost C++ Libraries
CNTK
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
Cognitive-Speech-SRSample
Community samples to use Cognitive SR services
szhaomsft's Repositories
szhaomsft/Cognitive-Speech-SRSample
Community samples to use Cognitive SR services
szhaomsft/AD-NeRF
This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".
szhaomsft/Automatic-Prosody-Annotation
szhaomsft/azure-docs
Open source documentation of Microsoft Azure
szhaomsft/bark
🚀 BARK INFINITY 🎶 Power Up The Bark Text-prompted Generative Audio Model
szhaomsft/Boost-for-Android
Android port of Boost C++ Libraries
szhaomsft/concentus.oggfile
Implementing support for reading/writing .opus audio files using Concentus
szhaomsft/contextualLoss
The Contextual Loss
szhaomsft/corert
This repo contains CoreRT, a .NET Core runtime optimized for AOT (ahead of time compilation) scenarios, with the accompanying compiler toolchain.
szhaomsft/CortanaSkillsKit
Create with the Cortana Skills Kit.
szhaomsft/crepe
CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)
szhaomsft/dotnet-docs-samples
.NET code samples used on https://cloud.google.com
szhaomsft/LruCacheNet
A fast, generic, thread-safe Least Recently Used (LRU) cache for .NET Standard.
szhaomsft/NAudio
Audio and MIDI library for .NET
szhaomsft/noi
szhaomsft/obfuscar
Open source obfuscation tool for .NET assemblies
szhaomsft/OpenTN
Open source text normalizer
szhaomsft/pinyin4net
pinyin4net is a .NET library supporting conversion between Chinese characters and Pinyin systems.
szhaomsft/protobuf-android
Port of protobuf to Android
szhaomsft/SadTalker
(CVPR 2023) SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
szhaomsft/stable-diffusion-webui
Stable Diffusion web UI
szhaomsft/tensorflow
Computation using data flow graphs for scalable machine learning
szhaomsft/testdata
szhaomsft/torchcrepe
PyTorch implementation of the CREPE pitch tracker
szhaomsft/tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
szhaomsft/tts-scores
Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models
szhaomsft/ultimatevocalremovergui
GUI for a Vocal Remover that uses Deep Neural Networks.
szhaomsft/video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
szhaomsft/voicesmith
[WIP] VoiceSmith makes training text to speech models easy.
szhaomsft/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)