
A Multi-Level Tree-based Ensemble Model for Automatic Sleep Stage Classification

Primary language: Python. License: MIT.

SleepBoost: A Multi-level Tree-based Ensemble Model for Automatic Sleep Stage Classification

Abstract



Neurodegenerative diseases often exhibit a strong link with sleep disruption, highlighting the importance of effective sleep stage monitoring. In this light, Automatic Sleep Stage Classification (ASSC) plays a pivotal role, and it is now more streamlined than ever thanks to advances in deep learning (DL). However, the opaque nature of DL models can be a barrier to their clinical adoption because of trust concerns among medical practitioners. To bridge this gap, we introduce SleepBoost, a transparent Multi-level Tree-based Ensemble Model specifically designed for ASSC. Our approach includes a crafted Feature Engineering Block (FEB) that extracts 42 time- and frequency-domain features, of which 17 are selected based on their high mutual information score (>0.23). Uniquely, SleepBoost integrates three tree-based models into a cohesive multi-level tree structure, further enhanced by a novel reward-based adaptive weight allocation mechanism. Tested on the Sleep-EDF-20 dataset, SleepBoost achieves an accuracy of 86.3%, an F1-score of 80.9%, and a Cohen's kappa of 0.807, outperforming leading DL models in ASSC. An ablation study underscores the critical role of our selective feature extraction in enhancing model accuracy and interpretability, both crucial for clinical settings. This approach not only offers a more transparent alternative to traditional DL models but also has potential implications for monitoring and understanding sleep patterns in the context of neurodegenerative disorders.
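The feature selection step described in the abstract (keep features whose mutual information with the sleep stage exceeds 0.23) can be illustrated with a small, dependency-free sketch. This is not the code from src/: the histogram estimator, the bin count, the function names, and the synthetic data are all assumptions made for illustration, and the estimator used in the paper may differ, so scores will not be directly comparable. Only the 0.23 threshold comes from the abstract.

```python
import math
import random
from collections import Counter


def mutual_information(values, labels, n_bins=10):
    """Histogram (plug-in) estimate of I(feature; label) in nats.

    Assumption: equal-width binning; the paper's estimator is not stated here.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    n = len(values)
    joint = Counter(zip(bins, labels))
    bin_counts = Counter(bins)
    label_counts = Counter(labels)
    mi = 0.0
    for (b, y), c in joint.items():
        # p(b, y) * log( p(b, y) / (p(b) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (bin_counts[b] * label_counts[y]))
    return mi


def select_features(columns, labels, threshold=0.23):
    """Return the names of features scoring above the threshold, plus all scores."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    selected = [name for name, score in scores.items() if score > threshold]
    return selected, scores


# Synthetic stand-in data: 5 sleep stages encoded 0..4 (hypothetical, not Sleep-EDF).
rng = random.Random(0)
labels = [rng.randrange(5) for _ in range(2000)]
columns = {
    "informative": [y + rng.gauss(0, 0.3) for y in labels],  # tracks the stage
    "noise": [rng.gauss(0, 1) for _ in labels],              # unrelated to the stage
}
selected, scores = select_features(columns, labels)
print(selected, scores)
```

On this toy data only the feature that tracks the label clears the threshold; with real epochs, each column would hold one of the 42 extracted features.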

Overview of the Method


General architecture of SleepBoost. Random Forest (RF), Light Gradient Boosting (LGBoost), and Categorical Boosting (CatBoost) are trained on the training dataset as the unit block models of SleepBoost. Adaptive weights are then calculated from the predictions of the unit block models. Finally, a weighted score is computed to predict the sleep stage.
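The flow in the figure (unit block predictions, then adaptive weights, then a weighted score) can be sketched as follows. The exact reward rule used by SleepBoost is not given in this README, so the rule below (one reward per correctly classified validation epoch, then normalization) is a hypothetical stand-in, as are the function names and the toy probabilities. The three probability lists stand for the outputs of RF, LGBoost, and CatBoost, which are not trained here.

```python
STAGES = ["W", "N1", "N2", "N3", "REM"]


def argmax(scores):
    """Index of the largest score (first one wins on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)


def reward_weights(val_probs, val_labels, reward=1.0):
    """Hypothetical reward rule: each model earns `reward` per correct validation epoch.

    val_probs[m][i] is model m's class-probability list for validation epoch i.
    """
    raw = []
    for probs in val_probs:
        hits = sum(argmax(p) == y for p, y in zip(probs, val_labels))
        raw.append(1.0 + reward * hits)  # start at 1 so no model gets weight 0
    total = sum(raw)
    return [r / total for r in raw]


def weighted_predict(model_probs, weights):
    """Weighted sum of the models' class probabilities for one epoch."""
    n_classes = len(model_probs[0])
    score = [
        sum(w * p[c] for w, p in zip(weights, model_probs))
        for c in range(n_classes)
    ]
    return argmax(score), score


def one_hot(c):
    return [1.0 if k == c else 0.0 for k in range(len(STAGES))]


# Toy validation set: four epochs with true stages W, N1, N2, N3.
val_labels = [0, 1, 2, 3]
val_probs = [
    [one_hot(c) for c in [0, 1, 2, 3]],  # stand-in for RF: 4 correct
    [one_hot(c) for c in [0, 1, 4, 4]],  # stand-in for LGBoost: 2 correct
    [one_hot(c) for c in [0, 2, 3, 4]],  # stand-in for CatBoost: 1 correct
]
weights = reward_weights(val_probs, val_labels)

# One new epoch: the best-rewarded model favours N2, the other two favour N1.
epoch_probs = [
    [0.0, 0.10, 0.90, 0.0, 0.0],
    [0.0, 0.75, 0.25, 0.0, 0.0],
    [0.0, 0.75, 0.25, 0.0, 0.0],
]
pred, score = weighted_predict(epoch_probs, weights)
print(weights, STAGES[pred])
```

In this toy case an unweighted average would pick N1, while the reward-based weights let the model with the best validation record carry the decision to N2. That is the intended effect of adaptive weighting, whatever the precise reward rule.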

Description of the GitHub Repository

This repository contains all the necessary code, data, and instructions to replicate the findings of our paper. The repository is structured as follows:

  • src/: Source code used in the research.
  • figures/: Figures and graphs used in the paper.
  • supplementary/: Supplementary material for this work.

Prepare dataset

We evaluated SleepBoost on the Sleep-EDF dataset.

To obtain the Sleep-EDF dataset, run the following script to download the Sleep Cassette (SC) subjects.

cd data
chmod +x download_physionet.sh
./download_physionet.sh

Then run the following script once per channel to extract the specified EEG channel and its corresponding sleep stages.

python prepare_physionet.py --data_dir data --output_dir data/eeg_fpz_cz --select_ch 'EEG Fpz-Cz'
python prepare_physionet.py --data_dir data --output_dir data/eeg_pz_oz --select_ch 'EEG Pz-Oz'

Create a virtual environment with venv/conda

python3.11 -m venv sleepboost
source ./sleepboost/bin/activate
python3 -m pip install -r requirements.txt

Citation

If you use our code or methodology in your work, please cite our paper as follows: