- Please go to the data link, fill out the Google form below the download section, and download the dataset.
- The dataset is UrbanSound8K, which contains 8732 labeled audio excerpts of up to 4 seconds each, spread across 10 classes (at most 1000 samples per class).
- Audio is a term used to describe any sound or noise within the range the human ear can hear, roughly 20 Hz to 20 kHz. Frequency is measured in hertz (Hz) or kilohertz (kHz).
- An audio signal is a representation of sound, typically using either a changing level of electrical voltage for analog signals, or a series of binary numbers for digital signals.
- DSP (Digital Signal Processing) is the use of digital processing, by computers or more specialized digital signal processors, to perform a wide variety of signal processing operations.
- DSP applies mathematical transforms, such as the Short-Time Fourier Transform (STFT), to the audio signal to bring out structure that a person cannot easily perceive from the raw waveform. The results are commonly visualized as Mel spectrogram or MFCC plots.
- The term "Mel spectrogram" combines two words: Mel and spectrogram.
- Mel: the Mel scale is, mathematically speaking, a non-linear transformation of the frequency scale. It is constructed so that sounds that are equally far apart on the Mel scale also sound equally far apart in pitch to human listeners.
- Spectrogram: a visual representation of how the frequency spectrum of a signal changes over time. In a Mel spectrogram, the frequency axis is mapped onto the Mel scale.
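
The Mel scale and the spectrogram idea described above can be illustrated with a minimal pure-Python sketch. It assumes one commonly used Mel formula, `mel = 2595 * log10(1 + f / 700)` (other variants exist), and uses a deliberately naive STFT on a synthetic 1 kHz tone. The function names `hz_to_mel`, `mel_to_hz`, and `stft_magnitudes`, as well as the frame size, hop size, and sample rate, are illustrative choices and not part of this project.

```python
import cmath
import math


def hz_to_mel(freq_hz):
    """Convert Hz to mel (assumed variant: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel for the same formula variant."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def stft_magnitudes(samples, frame_size, hop_size):
    """Naive STFT: Hann-windowed frames, DFT magnitudes for bins 0..frame_size // 2.

    Returns a list of frames; each frame is a list of magnitudes.
    This is O(frame_size^2) per frame and only meant for illustration.
    """
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame_size)
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop_size):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        spectrum = []
        for k in range(frame_size // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Synthetic test signal: a 1 kHz sine tone sampled at 8 kHz (illustrative values).
sample_rate = 8000
frame_size = 256
tone_hz = 1000.0
signal = [math.sin(2.0 * math.pi * tone_hz * n / sample_rate) for n in range(1024)]

spec = stft_magnitudes(signal, frame_size=frame_size, hop_size=128)

# Locate the strongest frequency bin in the first frame and convert it to Hz and mel.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / frame_size
peak_mel = hz_to_mel(peak_hz)
print(f"frames={len(spec)}, peak at {peak_hz:.1f} Hz, about {peak_mel:.1f} mel")
```

Each inner list in `spec` is one column of a spectrogram; re-spacing its frequency bins along `hz_to_mel` gives the Mel-scaled axis. For real audio such as UrbanSound8K clips, an FFT-based library routine would be the practical choice; the loop above only makes the steps explicit.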