Pinned Repositories
amazon-sagemaker-finetune-deploy-whisper-huggingface
This is a demo project showing how to fine-tune and deploy the Whisper model on SageMaker.
android
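Companion source code for the book "Android Studio开发实战:从零基础到App上线" (Android Studio Development in Practice: From Zero Basics to App Launch), fully commented edition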
asr-server
FastCGI support for Kaldi ASR
autojs-dingtalk
A script that uses autojs to clock in automatically on DingTalk
Beamforming-for-speech-enhancement
Simple delay-and-sum, MVDR, and CGMM-MVDR beamforming
CSOL_TPAMI2021
glide-text2im
GLIDE: a diffusion-based text-conditional image synthesis model
NKF_train
NKF training
speex_aec_kf
Speex AEC Kalman filter
webrtc_tde
webrtc3 aec tde
daihuangyu's Repositories
daihuangyu/speex_aec_kf
Speex AEC Kalman filter
daihuangyu/glide-text2im
GLIDE: a diffusion-based text-conditional image synthesis model
daihuangyu/NKF_train
NKF training
daihuangyu/webrtc_tde
webrtc3 aec tde
daihuangyu/DALL-E
PyTorch package for the discrete VAE used for DALL·E.
daihuangyu/DALLE2-pytorch
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
daihuangyu/dsi
Short for "Do Something Interesting": doing some interesting things
daihuangyu/encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
daihuangyu/faster-whisper
Faster Whisper transcription with CTranslate2
daihuangyu/FasterTransformer
Transformer related optimization, including BERT, GPT
daihuangyu/FunASR
A Fundamental End-to-End Speech Recognition Toolkit
daihuangyu/GPT2-chitchat
GPT2 for Chinese chitchat: a GPT2 model for Chinese casual conversation (implements DialoGPT's MMI)
daihuangyu/InferLLM
a lightweight LLM model inference framework
daihuangyu/JARVIS
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
daihuangyu/kenlm
KenLM: Faster and Smaller Language Model Queries
daihuangyu/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence BLip
daihuangyu/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
daihuangyu/so-vits-svc
SoftVC VITS Singing Voice Conversion
daihuangyu/stable-diffusion-webui
Stable Diffusion web UI
daihuangyu/street-fighter-ai
This is an AI agent for Street Fighter II Champion Edition.
daihuangyu/swift
Use PEFT or full-parameter training to fine-tune LLMs or MLLMs
daihuangyu/SwissArmyTransformer
SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.
daihuangyu/TTS-1
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
daihuangyu/vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech); can be trained on a single GPU!
daihuangyu/VITS-fast-fine-tuning
This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion
daihuangyu/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
daihuangyu/Whisper-1
High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model
daihuangyu/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
daihuangyu/ZJU-nCov-Hitcarder-1
daihuangyu/ZJU-nCov-Hitcarder-Sample
Sample of https://github.com/Long0x0/ZJU-nCov-Hitcarder.