ruclion

I like Westbrook and Mitchell =.=

Tsinghua University, Shenzhen

ruclion's Stars

tensorflow/tensorflow
An Open Source Machine Learning Framework for Everyone
Language: C++ 183k 7.6k 39.3k 74k
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Language: Python 3k 40 185328
wiseman/py-webrtcvad
Python interface to the WebRTC Voice Activity Detector
Language: C 1.9k 48 81398
rogersce/cnpy
library to read/write .npy and .npz files in C/C++
Language: C++ 1.3k 29 64293
YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
Language: Jupyter Notebook 1k 19 129200
joyycom/VNN
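VNN is a high-performance, lightweight neural network deployment framework released by Joyy Inc. (欢聚集团). It currently powers more than 20 AI capabilities in apps such as Hago, VOO, VFly, and 马克相机, covering entertainment scenarios (live streaming, short video, video editing) as well as engineering scenarios.
Language: C 966 30 33197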
jtkim-kaist/VAD
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Language: MATLAB 826 45 40229
serizba/cppflow
Run TensorFlow models in C++ without installation and without Bazel
Language: C++ 767 26 189176
yatengLG/Focal-Loss-Pytorch
Fully commented in Chinese. The RetinaNet loss function (focal loss), implemented in PyTorch. Use it in one-stage object detection to improve detection performance, or in classification tasks to address data imbalance.
Language: Jupyter Notebook 426 5 19114
jim-schwoebel/voicebook
🗣️ A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).
Language: Python 370 25 2582
stefanrmmr/streamlit-audio-recorder
Record Audio from the User's Microphone in Apps that are Deployed to the Web. (via Browser Media-API, REACT-based, Streamlit Custom Component)
Language: TypeScript 357 1 1969
nttcslab-sp/kaldiio
A pure python module for reading and writing kaldi ark files
Language: Python 244 12 1635
jtkim-kaist/Speech-enhancement
Deep neural network based speech enhancement toolkit
Language: MATLAB 210 8 2863
nryant/dscore
Diarization scoring tools.
Language: Python 198 8 440
RicherMans/GPV
Repository for our Interspeech2020 general-purpose voice activity detection (GPVAD) paper
Language: Python 140 5 929
YuanGongND/psla
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
Language: Python 129 1 1216
Daisyqk/Automatic-Prosody-Annotation
Language: Python 109 3 547
qiuqiangkong/panns_transfer_to_gtzan
Language: Python 95 2 1138
RicherMans/Datadriven-GPVAD
The codebase for Data-driven general-purpose voice activity detection.
Language: Python 90 8 1623
nladuo/AI_beatmap_generator
An attempt to use neural networks to generate beatmaps for the rhythm game Malody.
Language: Jupyter Notebook 43 2 110
jtkim-kaist/ram_modified
"Recurrent Models of Visual Attention" in TensorFlow
Language: Python 42 6 29
iiscleap/DIHARD_2019_baseline_alltracks
Language: Perl 37 1 112
jim-schwoebel/sound_event_detection
🎵 A repository for manually annotating files to create labeled acoustic datasets for machine learning.
Language: Python 36 1 13
lrfasd/lrfasd.github.io
Language: HTML 36 6 1015
srvk/DiViMe
ACLEW Diarization Virtual Machine
Language: Shell 30 13 1529
usc-sail/mica-speech-activity-detection
Robust Speech Activity Detection (SAD) in movie audio
Language: Python 25 21 610
hbredin/DomainAdversarialVoiceActivityDetection
Code for reproducing experiments in "Domain-Adversarial Voice Activity Detection"
Language: Jupyter Notebook 23 4 24
zhaoyi2/audio_augment
A tool/script for batch speech data enhancement with speed/volume/RIRS/MUSAN
Language: Shell 20 1 14
shamim-hussain/musan_investigation_cnn_rnn
Evaluation of the classification performance (Speech, Music, and Noise) of 1D (WaveNet) and 2D (MobileNet) CNN and RNN (GRU) on the MUSAN corpus.
Language: Python 14 3 210
tjdgns0928/MultiTarget_VAD
Representation of Paper: On training targets for noise-robust voice activity detection.
Language: Jupyter Notebook 4 1 12
