banalasaritha
Speaker recognition and identification, meta-learning, few-shot learning and speech processing, speech activity detection, time-frequency (T-F) representations.
India, National Institute of Technology
Pinned Repositories
3DCNN
3D convolutional neural network for video classification
AFRNN
ClusteringDirectionCentrality
A novel clustering algorithm that measures Direction Centrality (CDC) locally. It adopts a density-independent metric based on the distribution of K-nearest neighbors (KNNs) to distinguish between internal and boundary points. The boundary points generate enclosed cages to bind the connections of internal points.
Hands-On-Meta-Learning-With-Python
Learning to Learn using One-Shot Learning, MAML, Reptile, Meta-SGD and more with TensorFlow
MAML-and-FOMAML-implimentaion-and-comparison
Comparison between MAML & FOMAML
MetaAudio-A-Few-Shot-Audio-Classification-Benchmark
A new comprehensive and diverse few-shot acoustic classification benchmark.
prototypical-networks-tensorflow
TensorFlow implementation of the NIPS 2017 paper "Prototypical Networks for Few-shot Learning"
reptile-pytorch
A PyTorch implementation of OpenAI's REPTILE algorithm
Self-Supervised-Audio-Spectrogram-Transformer
Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".
workshops
Materials for workshops on the Hugging Face ecosystem
banalasaritha's Repositories
banalasaritha/ClusteringDirectionCentrality
A novel clustering algorithm that measures Direction Centrality (CDC) locally. It adopts a density-independent metric based on the distribution of K-nearest neighbors (KNNs) to distinguish between internal and boundary points. The boundary points generate enclosed cages to bind the connections of internal points.
banalasaritha/MAML-and-FOMAML-implimentaion-and-comparison
Comparison between MAML & FOMAML
banalasaritha/workshops
Materials for workshops on the Hugging Face ecosystem
banalasaritha/AFRNN
banalasaritha/AISHELL-2
kaldi-asr/kaldi is the official location of the Kaldi project.
banalasaritha/AS-pVAD
AS-pVAD: A Real-time Personalized Voice Activity Detection Network With Attentive Score Loss
banalasaritha/Audio-Spectrogram-Transformer
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
banalasaritha/AudioSet-For-Meta-Learning
Meta-learning for few-shot learning
banalasaritha/CACRN-Net
Channel Attention Convolutional Recurrent Neural Network for Few-Shot Speaker Identification
banalasaritha/Chinese-Speaker-Identification
End-to-End Chinese Speaker Identification
banalasaritha/DO-Conv
Depthwise Over-parameterized Convolutional Layer
banalasaritha/FastVADCode
Code for FastVad
banalasaritha/FunASR-Transformer-VAD
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models.
banalasaritha/image_caption_with_selfAttention
banalasaritha/Meta-TTS
Official repository of https://doi.org/10.1109/TASLP.2022.3167258. More up-to-date code is in "refactor" branch.
banalasaritha/MultiresolutionNeuralNetworks
Multi-Resolution Neural Networks
banalasaritha/prototypical-networks-jupyter
Prototypical networks for few-shot learning
banalasaritha/prototypical_networks_pytorch
banalasaritha/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
banalasaritha/python-wigner-distribution
A Python-based Wigner distribution implementation, including a method for interference reduction
banalasaritha/pytorch-image-models
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
banalasaritha/ResNeSt
ResNeSt: Split-Attention Networks
banalasaritha/Speaker-emotion-speech-and-diarazation-recognition
banalasaritha/speaker-identification-1
Speaker Identification using Neural Net.
banalasaritha/speakerbox
Speakerbox: Fine-tune Audio Transformers for speaker identification.
banalasaritha/Time-Frequency-Representations-of-RSR2015-Database
Time frequency representations of RSR 2015 database.
banalasaritha/transformers-wav2vec
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
banalasaritha/vggvox_identification
Training and evaluation of VGGVox neural network for speaker identification
banalasaritha/voice_activity_detection
Voice Activity Detection based on Deep Learning & TensorFlow
banalasaritha/Voice_Activity_Detection_Frame
Frame-VAD: More Effective and Efficient VAD for More Fine-grained Timestamps
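
Illustration for ClusteringDirectionCentrality: the description above says the algorithm uses a density-independent metric over each point's K-nearest neighbors to separate internal points from boundary points. The sketch below shows one way such a metric could look for 2-D data, assuming the score is the variance of the angular gaps between a point's neighbors (low when neighbors surround the point evenly, high when they lie to one side). The function name `direction_centrality` and the brute-force neighbor search are hypothetical choices for this example, not code from the repository.

```python
import numpy as np

def direction_centrality(points, k):
    """Per-point variance of the angular gaps between its k nearest neighbours.

    Assumes 2-D points. Low scores suggest internal points, high scores
    suggest boundary points. Illustrative only, not the repository's code.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)

    # Brute-force pairwise squared distances (fine for small n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour

    knn = np.argsort(dist, axis=1)[:, :k]
    scores = np.empty(n)
    for i in range(n):
        vec = points[knn[i]] - points[i]
        # Direction of each neighbour, sorted around the circle.
        ang = np.sort(np.arctan2(vec[:, 1], vec[:, 0]))
        # Gaps between consecutive directions, wrapping around to close the circle.
        gaps = np.diff(np.append(ang, ang[0] + 2 * np.pi))
        scores[i] = np.var(gaps)
    return scores
```

Because the score depends only on neighbor directions and not on neighbor distances, it does not change when the local point density changes, which matches the "density-independent" wording in the description. Thresholding the scores to label boundary points, and the later step of connecting internal points, are left out of this sketch.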