folalafish
My name is Zdninger. I am a student at the School of Control Science and Engineering, Shandong University, majoring in software engineering, and I enjoy coding.
SDU
Shandong University Qianfoshan Campus, No. 17923, Jingshi Road, Lixia District, Jinan City, Shandong Province, China
folalafish's Stars
david-gimeno/tailored-avsr
Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers"
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
mrjunjieli/LRS3_for_AVSS
smeetrs/deep_avsr
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
krahets/hello-algo
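"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese versions are updated in sync; the English version is ongoing.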
songquanpeng/pytorch-template
To be the world's best PyTorch project template.
TaoRuijie/TalkNet-ASD
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
LetterLiGo/SafeEar
[ACM CCS'24] SafeEar: Content Privacy-Preserving Audio Deepfake Detection
ziyangwang007/VMambaMorph
VMambaMorph: a Multi-Modality Deformable Image Registration Framework based on Visual State Space Model with Cross-Scan Module
myscience/x-lstm
Pytorch implementation of the xLSTM model by Beck et al. (2024)
SarthakYadav/axlstm-official
Official repository for the paper "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs"
NX-AI/xlstm
Official repository of the xLSTM.
XiudingCai/Awesome-Mamba-Collection
A curated collection of papers, tutorials, videos, and other valuable resources related to Mamba.
vuthede/speech_separation_PIT
A simple project to separate mixed voices, using "Permutation Invariant Training Loss" and "PairWise Neg SisDr Loss"
aidanmomo/Speech-Enhancement-Metrics-SNR-SDRi-SISDRi
aliutkus/speechmetrics
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
qjcg/awesome-typst
Awesome Typst Links
Beilong-Tang/TSELM
Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models
ebrunet28/MultiDecoder-DPRNN
JusperLee/CTCNet
An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits
JusperLee/SPMamba
IsaacRodgz/GMU-Baseline
Replication of models and results obtained in "Gated multimodal networks" paper
JusperLee/LRS3-For-Speech-Separation
Multi-modal speech separation task data generation script on LRS3 data set.
AudioLLMs/AudioBench
AudioBench: A Universal Benchmark for Audio Large Language Models
null-yer/HuggingFace-Download-Accelerator
Accelerated downloads within mainland China, with visual batch downloading of Hugging Face files
personqianduixue/Math_Model
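Mathematical modeling, MCM/ICM, the American Mathematical Contest in Modeling, the China Undergraduate Mathematical Contest in Modeling, the Huawei Cup graduate mathematical modeling contest, national contest LaTeX template, MCM LaTeX template, mathorcup, Electrician Cup, Central China contest, APMCM, Shenzhen Cup, Zhongqing Cup, East China Cup, Shuwei Cup, Northeast Three Provinces mathematical modeling, Certification Cup, mathematical modeling books, common MATLAB algorithms, national contest grading key points, summary of software, models, and algorithms, intelligent algorithms, optimization algorithms, modern algorithms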
Andong-Li-speech/Neural-Vocoders-as-Speech-Enhancers
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
w1018979952/DSANet
Looking and Hearing into Details: Dual-enhanced Siamese Adversarial Network for Audio-Visual Matching