Sangramsingkayte
My ultimate goal is to do work that I enjoy, and my organisation gives me that opportunity.
A. P. Moller - Maersk, Copenhagen, Denmark
Pinned Repositories
Audio-Feature-Extraction
In sound processing, the mel-frequency cepstrum (MFC) is a representation of the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear mel scale of frequency. Mel-frequency cepstral coefficients (MFCCs) are coefficients that collectively make up an MFC.
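The definition above can be sketched for a single frame as follows. This is a minimal illustration, not the repository's code: the function names, the 26-filter bank, the 13 coefficients, the Hamming window, and the common mel formula 2595 * log10(1 + f / 700) are all assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(f):
    # One widely used mel-scale formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)  # frequency of each FFT bin
    fbank = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):
        left, center, right = hz_points[i:i + 3]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        fbank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fbank

def dct_ii(x, n_coeffs):
    """Unnormalized DCT-II: the 'linear cosine transform' in the definition."""
    n = x.shape[-1]
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (np.arange(n) + 0.5) / n)
    return x @ basis.T

def mfcc_frame(frame, sample_rate, n_filters=26, n_coeffs=13):
    """MFCCs of one short frame: power spectrum -> mel energies -> log -> DCT."""
    n_fft = len(frame)
    windowed = frame * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(windowed)) ** 2 / n_fft
    mel_energies = mel_filterbank(n_filters, n_fft, sample_rate) @ power
    log_energies = np.log(mel_energies + 1e-10)  # small floor avoids log(0)
    return dct_ii(log_energies, n_coeffs)

# Example: a 25 ms frame of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
frame = np.sin(2.0 * np.pi * 440.0 * np.arange(400) / sample_rate)
coeffs = mfcc_frame(frame, sample_rate)
```

A full feature extractor would slide this over overlapping frames and usually add pre-emphasis and delta features, which are omitted here.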
End-to-End-Neural-Diarization
Matlab-Voice-Record-and-plot-FFT-Real-Time
Gammatone-like-spectrograms
Gammatone filters are a popular linear approximation to the filtering performed by the ear. This routine provides a simple wrapper for generating time-frequency surfaces based on a gammatone analysis, which can be used as a replacement for a conventional spectrogram. It also provides a fast approximation to this surface based on weighting the output of a conventional FFT.
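As a rough sketch of the idea, the code below builds a gammatone-style time-frequency surface by direct time-domain filtering. It does not reproduce the repository's routine or its fast FFT-weighting approximation. The function names, the 4th-order filter, the 1.019 bandwidth factor, the Glasberg and Moore ERB formulas, and the frame sizes are assumptions made for illustration.

```python
import numpy as np

def hz_to_erb_rate(f):
    # ERB-rate scale (Glasberg and Moore); assumed here for spacing the bands.
    return 21.4 * np.log10(4.37e-3 * f + 1.0)

def erb_rate_to_hz(e):
    return (10.0 ** (e / 21.4) - 1.0) / 4.37e-3

def erb_bandwidth(fc):
    # Equivalent rectangular bandwidth in Hz at center frequency fc.
    return 24.7 * (4.37e-3 * fc + 1.0)

def gammatone_impulse_response(fc, sample_rate, duration=0.025, order=4):
    """g(t) = t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t), scaled to unit energy."""
    t = np.arange(int(duration * sample_rate)) / sample_rate
    b = 1.019 * erb_bandwidth(fc)
    ir = t ** (order - 1) * np.exp(-2.0 * np.pi * b * t) * np.cos(2.0 * np.pi * fc * t)
    return ir / np.sqrt(np.sum(ir ** 2))

def gammatone_spectrogram(signal, sample_rate, n_bands=16, f_low=100.0,
                          f_high=None, frame_len=400, hop=160):
    """Mean energy per band and frame after gammatone filtering."""
    if f_high is None:
        f_high = 0.45 * sample_rate
    centers = erb_rate_to_hz(
        np.linspace(hz_to_erb_rate(f_low), hz_to_erb_rate(f_high), n_bands))
    n_frames = 1 + (len(signal) - frame_len) // hop
    surface = np.zeros((n_bands, n_frames))
    for band, fc in enumerate(centers):
        ir = gammatone_impulse_response(fc, sample_rate)
        filtered = np.convolve(signal, ir, mode="same")
        for frame in range(n_frames):
            start = frame * hop
            surface[band, frame] = np.mean(filtered[start:start + frame_len] ** 2)
    return centers, surface

# Example: one second of a 1 kHz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2.0 * np.pi * 1000.0 * t)
centers, surface = gammatone_spectrogram(tone, sample_rate)
```

The rows of `surface` play the role of the frequency axis in a conventional spectrogram, except that the bands are spaced on an auditory (ERB) scale rather than linearly.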
GAN_based_TTS
A generative adversarial network (GAN) is a powerful technique built on two networks, a generator G and a discriminator D, that are trained against each other in a game-theoretic setup. The generator maps input features to generated samples. The discriminator compares each generated sample with the original sample and identifies the dissimilarities between the two. In other words, the generator is trained to fool the discriminator. The generator generates the linguistic features of the given text, and the discriminator is optimized over the original feature vector and the generated feature vector.
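The adversarial game can be made concrete with the two loss functions from the standard GAN formulation. This is only a sketch: the repository's actual losses are not stated here, and the function names and the non-saturating generator loss are assumptions. The inputs are discriminator outputs in (0, 1), interpreted as the probability that a sample is real.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """D wants d_real -> 1 and d_fake -> 0 (binary cross-entropy)."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))

def generator_loss(d_fake, eps=1e-12):
    """Non-saturating variant: G wants D to score its samples as real."""
    d_fake = np.asarray(d_fake, dtype=float)
    return -np.mean(np.log(d_fake + eps))

# A discriminator that separates real from generated samples well ...
confident_d = discriminator_loss(d_real=[0.99, 0.98], d_fake=[0.01, 0.02])
# ... versus one that has been fooled and outputs 0.5 everywhere.
fooled_d = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

# The generator's loss drops as the discriminator rates its samples as real.
weak_g = generator_loss([0.01, 0.02])
strong_g = generator_loss([0.9, 0.95])
```

Training alternates between lowering the discriminator loss with the generator fixed and lowering the generator loss with the discriminator fixed.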
HTK-features-in-Python
HTK features in Python. This project contains a Python implementation of the MFCC features as computed by HTK.
Image-Caption-using-CNNs-and-RNNs-
Image Caption Generator using CNNs and RNNs
Speech
Speech-Synthesis-System
Language is the structured form humans use to share thoughts and emotions. This research is motivated by human-computer interaction. The overall aim of my PhD research program is to design concatenation-based and Hidden Markov Model (HMM) based speech synthesis for the Marathi language. This will make it possible to communicate with the system and to extend the technology to assistive devices based on the Marathi language. An attractive feature of the HMM system is that the voice can be altered without large databases. To study synthesis techniques in detail, I have also implemented a system using the unit selection method. The Marathi Talking Calculator, published on the Play Store, uses the concatenation technique. The calculator performs the basic arithmetic operations and speaks each numeral in Marathi as its key is pressed. The result box synthesizes speech and reads out the result in Marathi with the correct place value of digits. The weakness of unit selection synthesis (USS) is that it requires a large database and that quality degrades at the joins. To overcome these issues, the study builds a system with a phonetic-based approach for the Marathi language using concatenation and HMM.
Stroke-Prediction
Machine learning is among the fastest-growing techniques in many fields, and the healthcare industry is no exception. Machine learning algorithms play an essential role in predicting the presence or absence of heart disease, tumors, and more. Such information, if predicted well in advance, can provide important insights to doctors, who can then adapt their diagnosis and treat the patient accordingly. The World Health Organization has estimated that 12 million deaths occur worldwide every year due to heart disease. Half the deaths in the United States and other developed countries are due to cardiovascular diseases. Early prognosis of stroke can aid decisions on lifestyle changes in high-risk patients and in turn reduce complications. If the relationships and contributing factors are identified, the condition can be treated in advance. This research intends to pinpoint the most relevant risk factors of heart disease and to predict the overall risk using logistic regression. In this report, I'll discuss the prediction of stroke using machine learning algorithms. The algorithm I have implemented is logistic regression on the Health
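To illustrate how logistic regression turns risk factors into a predicted probability, here is a small sketch trained with plain gradient descent. It uses synthetic data, not the dataset from the report: the two standardized "risk factors", the true weights, the learning rate, and the function names are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression(X, y, learning_rate=0.1, n_iter=2000):
    """Minimize the mean log-loss by batch gradient descent."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend an intercept column
    weights = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        predicted = sigmoid(Xb @ weights)
        gradient = Xb.T @ (predicted - y) / len(y)
        weights -= learning_rate * gradient
    return weights

def predict_proba(X, weights):
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return sigmoid(Xb @ weights)

# Synthetic stand-ins for two standardized risk factors (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
true_weights = np.array([-1.0, 2.0, 1.0])  # intercept, factor 1, factor 2
y = (rng.random(500) < sigmoid(true_weights[0] + X @ true_weights[1:])).astype(float)

weights = fit_logistic_regression(X, y)
risk = predict_proba(X, weights)           # predicted probability per patient
accuracy = np.mean((risk >= 0.5) == (y == 1.0))
```

The sign and size of each fitted weight indicate how a factor shifts the log-odds of the outcome, which is what makes the model useful for pinpointing relevant risk factors.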
VBx
Variational Bayes HMM over x-vectors diarization
Sangramsingkayte's Repositories
Sangramsingkayte/Speech
Sangramsingkayte/GAN_based_TTS
Sangramsingkayte/Audio-Feature-Extraction
Sangramsingkayte/Speech-Synthesis-System
Sangramsingkayte/Stroke-Prediction
Sangramsingkayte/VBx
Variational Bayes HMM over x-vectors diarization
Sangramsingkayte/Gammatone-like-spectrograms
Sangramsingkayte/End-to-End-Neural-Diarization
Matlab-Voice-Record-and-plot-FFT-Real-Time
Sangramsingkayte/HTK-features-in-Python
HTK features in Python. This project contains a Python implementation of the MFCC features as computed by HTK.
Sangramsingkayte/Image-Caption-using-CNNs-and-RNNs-
Image Caption Generator using CNNs and RNNs
Sangramsingkayte/Speech-Processing-Basic-Concepts
Basic concepts: articulatory phonetics (the development and classification of speech sounds); acoustic phonetics (the acoustics of speech production); review of digital signal processing concepts; short-time Fourier transform, filter-bank, and LPC methods. Techniques for speech analysis: features, feature extraction, and pattern comparison; log spectral distance, cepstral distances, weighted cepstral distances and filtering, likelihood distortions, and spectral distortion using a warped frequency scale; LPC, PLP, and MFCC coefficients. Together these cover both statistical and perceptual speech distortion measures. Time alignment and normalization: dynamic time warping, multiple time-alignment paths, and closing remarks.
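Dynamic time warping, one of the time-alignment topics listed above, can be sketched in a few lines. This is a textbook-style illustration rather than code from the repository: the function name, the Euclidean local distance, and the unconstrained step pattern (match, insertion, deletion) are assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Cumulative cost of the best monotonic alignment between two sequences.

    Each sequence is a list of frames; a frame may be a scalar or a feature
    vector (for example, one row of cepstral coefficients).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.linalg.norm(a[i - 1] - b[j - 1])  # frame-to-frame distance
            cost[i, j] = local + min(cost[i - 1, j],      # a advances alone
                                     cost[i, j - 1],      # b advances alone
                                     cost[i - 1, j - 1])  # both advance
    return cost[n, m]

# The same pattern spoken more slowly still aligns with zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
# A constant offset of 1 over two frames accumulates a cost of 2.
offset = dtw_distance([0, 0], [1, 1])
```

Practical recognizers usually add path constraints and length normalization so that long utterances are not penalized simply for having more frames.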
Sangramsingkayte/TextPrediction
This project focuses on the behind-the-scenes mechanisms of text prediction, as used recently by Google and Facebook. In addition to Recurrent Neural Networks and Long Short-Term Memory networks, which provide the motivation, two word2vec models for generating word embeddings are also discussed.
Sangramsingkayte/AwesomeDiarization
Sangramsingkayte/Books
Sangramsingkayte/Data-Scientist-case
You work for a large food retailer and have received sales data on clients and the products they purchase from the marketing department. They want your help to analyse the data and provide key recommendations that will guide their marketing strategy.
Sangramsingkayte/Deep-Learning-for-NLP
Artificial neural networks and deep learning, along with some neural network structures for exploiting sequential data such as text or audio.
Sangramsingkayte/Emotion_In_Text
Finding emotions in text: an emotion annotation task of identifying the emotion category, the emotion intensity, and the words/phrases that indicate emotion in text. Preliminary results of emotion classification experiments show an accuracy of 73.89%, significantly above the baseline.
Sangramsingkayte/git-commands
Sangramsingkayte/Git_Commands
Git Commands
Sangramsingkayte/Hands-on-Machine-Learning
Sangramsingkayte/Image_Captioning-
Image Captioning with Keras and TensorFlow
Sangramsingkayte/Learning_Python_Exercise
Learning_Python_Exercise
Sangramsingkayte/mp3_to_wav
Convert multiple MP3 audio files in a folder to mono WAV format using Python.
Sangramsingkayte/Natural-Language-Processing
Natural Language Processing, or NLP for short, is broadly defined as the automatic manipulation of natural language, like speech and text, by software. It is, in essence, a subset of machine learning that lets us extract insights from text data.
Sangramsingkayte/NLP-with-Python-for-Machine-Learning-Essential-Training
NLP/ML Code
Sangramsingkayte/ResearchRocks
Sangramsingkayte/sph2wav
sph2wav
Sangramsingkayte/sph2wav_Speech
Sangramsingkayte/TextGeneration
Sangramsingkayte/voice_activity_detection