MandarGogate

MandarGogate's Stars

JuanFMontesinos/VoViT
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
Language:Python338
binwiederhier/ntfy
Send push notifications to your phone or desktop using PUT/POST
Language: Go, 17.3k stars, 665 forks
TeaPoly/PLCPA-ASYM-Loss
The power-law compressed phase-aware asymmetric (PLCPA-ASYM) loss
Language:Python91
ml-explore/mlx-swift-examples
Examples using MLX Swift
Language:Swift46272
JusperLee/TDANet
An efficient speech separation method
Language:Python21727
keunwoochoi/kapre
kapre: Keras Audio Preprocessors
Language:Python918148
KinWaiCheuk/nnAudio
Audio processing by using pytorch 1D convolution network
Language:Python98790
neuralmagic/deepsparse
Sparsity-aware deep learning inference runtime for CPUs
Language: Python, 2.9k stars, 169 forks
mosaicml/composer
Supercharge Your Model Training
Language: Python, 5.1k stars, 408 forks
jonashaag/pydct
Short-Time Discrete Cosine Transform (DCT) for Python. SciPy, TensorFlow and PyTorch implementations.
Language:Jupyter Notebook275
sungwon23/BSRNN
Language:Python7112
facebookresearch/facestar
Facestar dataset. High quality audio-visual recordings of human conversational speech.
Language:Python996
facebook/docusaurus
Easy to maintain open source documentation websites.
Language: TypeScript, 54.4k stars, 8.2k forks
microsoft/P.808
This is an open-source implementation of the ITU P.808 standard for "Subjective evaluation of speech quality with a crowdsourcing approach" (see https://www.itu.int/rec/T-REC-P.808/en). It uses Amazon Mechanical Turk as the crowdsourcing platform. It includes implementations for Absolute Category Rating (ACR), Degradation Category Rating (DCR), and Comparison Category Rating (CCR).
Language:HTML20058
reflex-dev/reflex
🕸️ Web apps in pure Python 🐍
Language: Python, 18.1k stars, 1k forks
mamba-org/mamba
The Fast Cross-Platform Package Manager
Language: C++, 6.6k stars, 345 forks
microsoft/DNS-Challenge
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
Language: Python, 1k stars, 404 forks
Jungjee/RawNet
Official repository for RawNet, RawNet2, and RawNet3
Language:Python34956
churichard/notabase
A second brain for your knowledge, thoughts, and ideas.
Language:TypeScript74170
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
Language: Python, 64.7k stars, 7.6k forks
cogmhear/avse_challenge
COG-MHEAR Audio-Visual Speech Enhancement Challenge
Language:Python299
wichmann-lab/python-psignifit
Python clone of psignifit providing basic functionality
Language:Python5423
huyanxin/DeepComplexCRN
Language:HTML38798
Rudrabha/Lip2Wav
This is the repository containing the code for our CVPR 2020 paper titled "Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis"
Language:Python691152
cogmhear/Intelligibility-Oriented-Audio-Visual-Speech-Enhancement
Towards Intelligibility-Oriented Audio-Visual Speech Enhancement
Language:Python132
microsoft/Deep3DFaceReconstruction
Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019)
Language: Python, 2.1k stars, 438 forks
facebookresearch/denoiser
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, both stationary and non-stationary, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.
Language: Python, 1.6k stars, 299 forks
apache/mxnet
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
Language: C++, 20.7k stars, 6.8k forks
mlflow/mlflow
Open source platform for the machine learning lifecycle
Language: Python, 18k stars, 4.1k forks
lutzroeder/netron
Visualizer for neural network, deep learning and machine learning models
Language: JavaScript, 26.9k stars, 2.7k forks