rohithkodali

Pinned Repositories

3D-convolutional-speaker-recognition
:speaker: Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
Language:Python0 2 00
aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Language:Python0 2 00
btp-tts
Android app for converting Telugu text to speech (TTS).
Language:C1 2 00
correction-text
A program that opens a text file and corrects some punctuation mistakes.
Language:Python1 2 00
Hindi-Spell-Check-Using-Language-Modelling
This project provides spell-check support for Urdu-to-Hindi transliteration. The spelling errors in our case mostly consist of errors in matras.
Language:Python3 4 01
kaldi-active-grammar
Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time
Language:Python1 1 00
kaldi-nlp
Kaldi Speech Recognition Toolkit for NLP tasks
Language:C++1 2 00
pos_blstm
Chinese POS tagger
Language:Python1 2 01
transformer-cnn-emotion-recognition
Speech Emotion Classification with a novel parallel CNN-Transformer model built with PyTorch, plus thorough explanations of CNNs, Transformers, and everything in between
Language:Jupyter Notebook1 1 00

rohithkodali's Repositories

rohithkodali/transformer-cnn-emotion-recognition
Speech Emotion Classification with a novel parallel CNN-Transformer model built with PyTorch, plus thorough explanations of CNNs, Transformers, and everything in between
Language:Jupyter Notebook1 1 00
rohithkodali/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language:Python0 0
rohithkodali/ConsistencyVC-voive-conversion
Using a jointly trained speaker encoder with consistency loss to achieve cross-lingual and expressive voice conversion
Language:Python0 0
rohithkodali/conv-emotion
This repo contains implementations of different architectures for emotion recognition in conversations
rohithkodali/ddsp
DDSP: Differentiable Digital Signal Processing
Language:Python1 0
rohithkodali/Deep-Learning-in-Production
In this repository, I will share some useful notes and references about deploying deep learning-based models in production.
1 0
rohithkodali/FastSAM
Fast Segment Anything
Language:Python0 0
rohithkodali/langchain
⚡ Building applications with LLMs through composability ⚡
Language:Python1 0
rohithkodali/langdetect
Language detection algorithm that can be extended to support any number of languages
Language:Python2 0
rohithkodali/LookOnceToHear
A novel human-interaction method for real-time speech extraction on headphones.
rohithkodali/melgan-neurips
GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis
Language:Python1 0
rohithkodali/MLnotebook
Understanding Deep Learning - Simon J.D. Prince
Language:Jupyter Notebook0 0
rohithkodali/Nepali-Ai-Anchor
Nepali AI Anchor Using LSTM & Pix2Pix. [Itonics Hackathon 2019]
Language:Python1 0
rohithkodali/PhonoQ
PhonoQ is a deep learning model used to compute phonetic-based features related to duration, rate, rhythm*, and goodness of pronunciation* of 18 phonological classes
rohithkodali/pifuhd
High-Resolution 3D Human Digitization from A Single Image.
Language:Python1 0
rohithkodali/proneval
Koel Labs innovates real-time pronunciation feedback for language learners! This repo contains the ML training, evaluation, and data processing code
rohithkodali/Real-time-wake-word-detection
Spoken wake-word detection for conversational avatar
Language:Jupyter Notebook1 0
rohithkodali/recurrent-interface-network-pytorch
Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in PyTorch
Language:Python1 0
rohithkodali/Resemblyzer
A Python package to analyze and compare voices with deep learning
Language:Python1 0
rohithkodali/self-supervised-phone-segmentation
Phoneme segmentation using pre-trained speech models
Language:Python0 0
rohithkodali/StyleTTS
Official Implementation of StyleTTS
Language:Python1 0
rohithkodali/supervoice-dataset
60k hours of phoneme-aligned audio from audio books
Language:Python0 0
rohithkodali/TOI
Toi news
Language:Python2 0
rohithkodali/ULCA-asr-dataset-corpus
1 0
rohithkodali/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E, WIP
Language:Python1 0
rohithkodali/voice-activity-detection
Voice Activity Detection (VAD) using deep learning.
rohithkodali/VoskIdentification
A test example of using a model for voice identification with the "Vosk" (Воск) speech recognition library: https://alphacephei.com/vosk/
Language:Java0 0
rohithkodali/Whisper-Hindi-ASR-model-IIT-Bombay-Intership
The Whisper Hindi ASR (Automatic Speech Recognition) model utilizes the KathBath dataset, a comprehensive collection of speech samples in Hindi. Trained on this dataset, Whisper employs advanced deep learning techniques to accurately transcribe spoken Hindi into text.
rohithkodali/whisper-to-normal-speech-conversion
Whisper-to-Normal Speech Conversion Using Generative Adversarial Networks
Language:Python1 0
rohithkodali/you-only-hear-once