COPD Detection

Detection of COPDs from the ICBHI 2017 Challenge dataset.

Note: Detailed information on dataset generation, preprocessing, augmentation, and the prediction pipeline can be found in The Report.

Motivation

  • Medical professionals typically demonstrate low accuracy when detecting the presence of COPDs from auscultation sounds (Yoonjoo Kim et al., 2021).
  • There is a need for a lightweight, accurate detection mechanism.

Aim

  • Develop an ML model that utilises demographic and audio data to predict the presence and class of COPDs.
  • Exceed the peak human baseline of 81% accuracy (demonstrated by fellows, as noted in the paper cited above).

Methodology

  • Generate complete demographic data from within the ICBHI dataset by filling in missing values (see the first sketch after this list).
  • Add crackling/wheezing information to the dataset by parsing the audio file descriptions.
  • Establish how well classification can be performed given oracle crackling/wheezing labels for the audio.
  • Generate audio datasets via data augmentation: noisy, pitch-shifted, time-shifted and time-stretched copies of each audio file (see the augmentation sketch below), expanding the dataset to 4,600 samples.
  • Create Mel Frequency and Chroma based features from the data (see the feature-extraction sketch below):
    • Mel Frequency Spectrograms: Extract non-musical, granular information about the power in each frequency band at a given timestep of an audio file, on the Mel scale. Useful for detecting short bursts of audio (like crackling).
    • Mel Frequency Cepstral Coefficients (MFCCs): A transformed version of the spectrogram. Useful for detecting phonemes.
    • Chromagrams: Capture the energy distribution of musical pitches across time. Useful for detecting musical sounds (like wheezing).
    • Chroma Energy Normalised Statistics (CENS): A transformed version of the chromagram, more robust to variations in loudness and instrumentation.
  • Plot the audio features as images to serve as inputs to a CNN architecture, and simultaneously flatten each feature matrix into a 1D array that the numerical models can use as features.
  • Perform PCA, retaining components up to a cumulative explained variance of 95%, to reduce the number of features significantly (see the PCA sketch below).
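
A minimal sketch of the missing-value step, assuming the demographic file is loaded into a pandas DataFrame. The column names follow the ICBHI demographic file layout, and median/mode imputation is an illustrative choice, not necessarily the strategy used in The Report:

```python
import pandas as pd

# ICBHI demographic file: patient ID, age, sex, adult BMI,
# child weight, child height (whitespace-separated, "NA" for gaps).
demo = pd.read_csv("demographic_info.txt", sep=r"\s+", header=None,
                   names=["patient_id", "age", "sex", "bmi",
                          "child_weight", "child_height"])

# Illustrative imputation: column median for numeric fields,
# most frequent value for the categorical field.
for col in ["age", "bmi", "child_weight", "child_height"]:
    demo[col] = demo[col].fillna(demo[col].median())
demo["sex"] = demo["sex"].fillna(demo["sex"].mode()[0])
```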
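A sketch of the four augmentations using librosa; the noise level, pitch step, shift amount, and stretch rate below are illustrative assumptions:

```python
import numpy as np
import librosa

def augment(y, sr):
    """Return the four augmented variants used to expand the dataset:
    noisy, pitch-shifted, time-shifted and time-stretched copies."""
    noisy = y + 0.005 * np.random.randn(len(y))                  # additive Gaussian noise
    pitched = librosa.effects.pitch_shift(y, sr=sr, n_steps=2)   # shift up 2 semitones
    shifted = np.roll(y, int(0.1 * sr))                          # shift by 100 ms
    stretched = librosa.effects.time_stretch(y, rate=1.2)        # speed up by 20%
    return noisy, pitched, shifted, stretched

# Example with one ICBHI recording (filename follows the dataset's naming scheme).
y, sr = librosa.load("101_1b1_Al_sc_Meditron.wav", sr=None)
variants = augment(y, sr)
```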
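A sketch of extracting all four audio features with librosa; `n_mfcc` is an assumption, and flattening to a fixed-length 1D array presumes equal-length clips (e.g. after padding or truncation):

```python
import numpy as np
import librosa

y, sr = librosa.load("101_1b1_Al_sc_Meditron.wav", sr=None)

# 2D time-frequency representations (plotted as images for the CNN).
mel = librosa.power_to_db(librosa.feature.melspectrogram(y=y, sr=sr))
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
chroma = librosa.feature.chroma_stft(y=y, sr=sr)
cens = librosa.feature.chroma_cens(y=y, sr=sr)

# 1D representation for the numerical models: flatten each feature
# matrix and concatenate (assumes every clip yields the same shape).
flat = np.concatenate([f.flatten() for f in (mel, mfcc, chroma, cens)])
```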
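A sketch of the dimensionality-reduction step with scikit-learn; the standardisation step and the placeholder feature matrix are assumptions made for illustration:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Placeholder: 4,600 augmented samples x flattened audio features.
X = np.random.rand(4600, 500)

X_scaled = StandardScaler().fit_transform(X)

# n_components=0.95 keeps the smallest number of components whose
# cumulative explained variance reaches 95%.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X_scaled)
print(pca.n_components_, X_reduced.shape)  # far fewer columns than X
```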

Results

  • CNNs performed well on the base dataset, achieving 88% testing accuracy using MFCCs; results could not be generated for the augmented dataset due to computational constraints.
  • SVMs performed almost equally well on the augmented dataset.
  • MFCCs appear to be the best discriminator among the audio features.