Audio-WestlakeU

Audio Signal and Information Processing Lab at Westlake University

Hangzhou

Pinned Repositories

ATST-SED
This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".
Language: Jupyter Notebook 109 3 2513
audiossl
A library for easier audio self-supervised training and downstream task evaluation
Language: Python 110 7 1310
FN-SSL
The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization [INTERSPEECH2023 & TASLP2024]
Language: Python 99 5 910
FS-EEND
The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors" [ICASSP 2024] and "LS-EEND: long-form streaming end-to-end neural diarization with online attractor extraction"
Language: Python 102 4 144
FullSubNet
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Language: Python 558 10 62158
McNet
The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023
Language: Python 110 5 813
NBSS
The official repo of NBC & SpatialNet for multichannel speech separation, denoising, and dereverberation
Language: Python 243 7 3629
pytorch_lightning_template_for_beginners
A PyTorch template for beginners based on pytorch_lightning
Language: Python 37 3 05
RealMAN
A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NeurIPS 2024]
Language: Python 108 3 311
RVAE-EM
Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutive transfer function" [ICASSP2024]
Language: Python 42 3 44

Audio-WestlakeU's Repositories

Audio-WestlakeU/FullSubNet
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Language: Python 558 10 62158
Audio-WestlakeU/NBSS
The official repo of NBC & SpatialNet for multichannel speech separation, denoising, and dereverberation
Language: Python 243 7 3629
Audio-WestlakeU/audiossl
A library for easier audio self-supervised training and downstream task evaluation
Language: Python 110 7 1310
Audio-WestlakeU/McNet
The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023
Language: Python 110 5 813
Audio-WestlakeU/ATST-SED
This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".
Language: Jupyter Notebook 109 3 2513
Audio-WestlakeU/RealMAN
A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NeurIPS 2024]
Language: Python 108 3 311
Audio-WestlakeU/FS-EEND
The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors" [ICASSP 2024] and "LS-EEND: long-form streaming end-to-end neural diarization with online attractor extraction"
Language: Python 102 4 144
Audio-WestlakeU/FN-SSL
The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization [INTERSPEECH2023 & TASLP2024]
Language: Python 99 5 910
Audio-WestlakeU/RVAE-EM
Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutive transfer function" [ICASSP2024]
Language: Python 42 3 44
Audio-WestlakeU/pytorch_lightning_template_for_beginners
A PyTorch template for beginners based on pytorch_lightning
Language: Python 37 3 05
Audio-WestlakeU/SAR-SSL
A Python implementation of “Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Multi-Channel Conformer” [TASLP 2024]
Language: Python 32 3 31
Audio-WestlakeU/UMA-ASR
This repository is the official implementation of unimodal aggregation (UMA) for automatic speech recognition (ASR).
Language: Shell 21 1 15
Audio-WestlakeU/Narrowband_DeepFiltering
Language: Python 19 3 06
Audio-WestlakeU/RCT
This repo contains the official implementation of RCT.
Language: Python 13 3 01
Audio-WestlakeU/OnlineSSL_DPRTF_EG
Language: MATLAB 9 2 05
Audio-WestlakeU/LSTM-noisePSD
Language: Python 8 3 02
Audio-WestlakeU/Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement
Language: Python 7 0 01
Audio-WestlakeU/bss_ctf_lasso
Language: MATLAB 5 2 03
Audio-WestlakeU/Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement-
Language: Python 4 1 00
Audio-WestlakeU/Audio-WestlakeU.github.io
The Audio and Signal Information Processing Lab at Westlake University concentrates on speech processing algorithms
3 2 01
Audio-WestlakeU/DP_RTF_SSL
Language: MATLAB 3 2 13
Audio-WestlakeU/SMIF_online_dereverb
Language: MATLAB 3 2 03
Audio-WestlakeU/ATST-RCT
ATST-RCT model for DCASE 2022 Task 4.
Language: Python 2 1 10
Audio-WestlakeU/dereverb_ctf_nonneg
Language: MATLAB 1 2 02
Audio-WestlakeU/RS_noisePSD
Language: MATLAB 1 2 00
Audio-WestlakeU/RTF_InterFrameSpecSub
Language: MATLAB 1 2 03
Audio-WestlakeU/BSS_CTF_EM
Language: MATLAB 0 2 01
Audio-WestlakeU/ctf_mint
Language: MATLAB 0 2 01