Audio Data Analysis Using Deep Learning

Audio data analysis is about analyzing and understanding audio signals captured by digital devices, with numerous applications in the enterprise, healthcare, productivity, and smart cities.

In this repository we have done audio data analysis and extracted necessary features from a sound/audio file. Also build an Artificial Neural Network(ANN) and Convolutional Neural Network(CNN) for classifying music genre.

NN

Pre requisites:

  1. Librosa: to analyze audio signals in general but geared more towards music. It includes the nuts and bolts to build a MIR(Music information retrieval) system.

pip install librosa

  1. IPython.display.Audio: Lets you play audio directly in a jupyter notebook.

Important termanologies:

Spectrogram

A spectrogram is a visual way of representing the signal strength, or “loudness”, of a signal over time at various frequencies present in a particular waveform.

Feature extraction from Audio signal

Every audio signal consists of many features. However, we must extract the characteristics that are relevant to the problem we are trying to solve. The process of extracting features to use them for analysis is called feature extraction. Let us study a few of the features in detail.

The spectral features (frequency-based features), which are obtained by converting the time-based signal into the frequency domain using the Fourier Transform, like:

  1. Spectral centroid
  2. Spectral Rolloff
  3. Spectral Bandwidth
  4. Zero-Crossing Rate
  5. Mel-Frequency Cepstral Coefficients(MFCCs)
  6. Chroma feature

Please consider reading these articles to understand Audio data analysis and its step by step implementation here and here.