IMYBo's Stars
LC044/WeChatMsg
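Extract WeChat chat history and export it to HTML, Word, or Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant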
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
01-ai/Yi
A series of large language models trained from scratch by developers @01-ai
state-spaces/s4
Structured state space sequence models
wq2012/awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
vivo-ai-lab/BlueLM
BlueLM (蓝心大模型): Open large language models developed by vivo AI Lab
microsoft/MS-SNSD
The Microsoft Scalable Noisy Speech Dataset (MS-SNSD) is a noisy speech dataset that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) levels desired.
subhadarship/kmeans_pytorch
kmeans using PyTorch
metame-ai/awesome-audio-plaza
Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, and audio generation
BUTSpeechFIT/VBx
Variational Bayes HMM over x-vectors diarization
rishikksh20/hifigan-denoiser
HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
DongKeon/Awesome-Speaker-Diarization
Some comprehensive papers about speaker diarization
slp-rl/aero
This repo contains the official PyTorch implementation of "Audio Super Resolution in the Spectral Domain" (ICASSP 2023)
pythad/nider
Python package to add text to images, textures and different backgrounds
marianne-m/brouhaha-vad
Predicts the level of noise and reverberation in your audio files
facebookresearch/ears_dataset
Expressive Anechoic Recordings of Speech (EARS)
haoheliu/SemantiCodec-inference
Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.
f-dangel/unfoldNd
(N=1,2,3)-dimensional unfold (im2col) and fold (col2im) in PyTorch
desh2608/dover-lap
Python package for combining diarization system outputs.
liyunlongaaa/NSD-MS2S
CHiME-7/8 diarization champion system: neural speaker diarization using memory-aware multi-speaker embedding with sequence-to-sequence architecture
yuguochencuc/BAE-Net
BAE-Net: A Low Complexity and High Fidelity Bandwidth-Adaptive Neural Network for Speech Super-Resolution
AkojimaSLP/Frame-by-frame-closed-form-update-for-mask-based-adaptive-MVDR-beamforming
speech-enhancement
BUTSpeechFIT/EEND_dataprep
dmlguq456/NeXt_TDNN_ASV
Official repository of NeXt-TDNN for speaker verification
Kuray107/S4ND-U-Net_speech_enhancement
JusperLee/S4M
Official implementation of Efficient Speech Separation Framework Based on Neural State-Space Models
wq2012/VB_diarization
VB Diarization with Eigenvoice and HMM Priors, refactored