ruclion's Stars
tensorflow/tensorflow
An Open Source Machine Learning Framework for Everyone
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
wiseman/py-webrtcvad
Python interface to the WebRTC Voice Activity Detector
rogersce/cnpy
library to read/write .npy and .npz files in C/C++
YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
joyycom/VNN
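VNN is a high-performance, lightweight neural network deployment framework from Joyy Inc. It currently supports more than 20 AI capabilities in apps such as Hago, VOO, VFly, and 马克相机, covering entertainment scenarios (live streaming, short video, video editing) as well as engineering scenarios.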
jtkim-kaist/VAD
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
serizba/cppflow
Run TensorFlow models in C++ without installation and without Bazel
yatengLG/Focal-Loss-Pytorch
The RetinaNet loss function (focal loss) implemented in PyTorch, with all code comments in Chinese. Use it in one-stage object detection to improve detection performance, or in classification tasks to mitigate data imbalance.
jim-schwoebel/voicebook
🗣️ A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).
stefanrmmr/streamlit-audio-recorder
Record Audio from the User's Microphone in Apps that are Deployed to the Web. (via Browser Media-API, REACT-based, Streamlit Custom Component)
nttcslab-sp/kaldiio
A pure-Python module for reading and writing Kaldi ark files
jtkim-kaist/Speech-enhancement
Deep neural network based speech enhancement toolkit
nryant/dscore
Diarization scoring tools.
RicherMans/GPV
Repository for our Interspeech 2020 general-purpose voice activity detection (GPVAD) paper
YuanGongND/psla
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
Daisyqk/Automatic-Prosody-Annotation
qiuqiangkong/panns_transfer_to_gtzan
RicherMans/Datadriven-GPVAD
The codebase for Data-driven general-purpose voice activity detection.
nladuo/AI_beatmap_generator
An attempt to generate beatmaps for the rhythm game Malody using neural networks.
jtkim-kaist/ram_modified
"Recurrent Models of Visual Attention" in TensorFlow
iiscleap/DIHARD_2019_baseline_alltracks
jim-schwoebel/sound_event_detection
🎵 A repository for manually annotating files to create labeled acoustic datasets for machine learning.
lrfasd/lrfasd.github.io
srvk/DiViMe
ACLEW Diarization Virtual Machine
usc-sail/mica-speech-activity-detection
Robust Speech Activity Detection (SAD) in movie audio
hbredin/DomainAdversarialVoiceActivityDetection
Code for reproducing experiments in "Domain-Adversarial Voice Activity Detection"
zhaoyi2/audio_augment
A tool/script for batch speech data enhancement with speed/volume/RIRS/MUSAN
shamim-hussain/musan_investigation_cnn_rnn
Evaluation of the classification performance (Speech, Music, and Noise) of 1D (WaveNet) and 2D (MobileNet) CNN and RNN (GRU) on the MUSAN corpus.
tjdgns0928/MultiTarget_VAD
Implementation of the paper "On training targets for noise-robust voice activity detection".