Blood Glucose Forecasting Approaches

This repository contains simple approaches to training blood glucose forecasting models from a patient's past measurements. Models are trained and validated on the OhioT1DM dataset (2018 and 2020). Each approach is documented in its own Jupyter notebook.

Requirements

The "Ohio Data/" folder must be in the repository root directory with the following structure:

project/
|
|...
|
|--Ohio Data/
   |
   |--Ohio2018/
   |  |
   |  |--test/
   |  |  |
   |  |  |{patient_id}-ws-testing_processed.csv
   |  |
   |  |--train/
   |     |
   |     |{patient_id}-ws-training_processed.csv
   |
   |--Ohio2020/
   |  |
   |  |--test/
   |  |  |
   |  |  |{patient_id}-ws-testing_processed.csv
   |  |
   |  |--train/
   |     |
   |     |{patient_id}-ws-training_processed.csv    

# Python 3.8 or higher

pip3 install -r requirements.txt

jupyter-lab
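The notebooks assume the folder layout above. As a minimal sketch (the helper name `patient_csv_path` is my own, not part of the repository), resolving the CSV for one patient could look like this:

```python
from pathlib import Path

def patient_csv_path(root: str, year: int, split: str, patient_id: int) -> Path:
    """Build the path to a preprocessed OhioT1DM CSV for one patient.

    split must be "train" or "test"; file names follow the
    {patient_id}-ws-training_processed.csv / {patient_id}-ws-testing_processed.csv scheme.
    """
    suffix = "training" if split == "train" else "testing"
    return (Path(root) / "Ohio Data" / f"Ohio{year}" / split
            / f"{patient_id}-ws-{suffix}_processed.csv")
```

For example, `patient_csv_path(".", 2018, "train", 559)` points at `Ohio Data/Ohio2018/train/559-ws-training_processed.csv`.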

The following sections describe my approach.

Research

Prior Knowledge

Most of my experience is in the field of computer vision. When it comes to time series tasks, I only have experience with anomaly detection. I have rarely used sequential neural networks such as RNNs or LSTMs; in contrast, I have gained a lot of experience with CNNs.

Pytorch Forecasting

I found this framework by coincidence. It appears to be to PyTorch what fastai is (or what Keras is to TensorFlow). Besides forecasting models, the framework implements various features for data preparation.

For example:

  • Temporal encoding
  • Handling of missing values
  • Dataloader generation

(Blood glucose) Forecasting Papers

Unfortunately, as I am no longer a student, I cannot read some papers without being charged. I am therefore limited to the free papers I can find online. Nevertheless, here is a list of papers I read, or at least skimmed, in preparation:

  1. Temporal Fusion Transformer
    • A derivation of the classical transformer model, specifically designed for time series forecasting
    • Distinguishes between categorical and continuous data
    • Can also take future known values as input
    • Achieves state-of-the-art results in forecasting tasks
  2. N-HiTS
    • An enhanced version of N-BEATS
    • Unlike N-BEATS, N-HiTS predicts interpolation coefficients that are used to interpolate values across the time series
    • Also utilizes pooling layers per block for multi-rate input sampling
  3. Using N-BEATS to forecast blood glucose values
    • Uses a customized N-BEATS model to predict blood glucose values
    • The major difference is to include an LSTM inside the blocks
    • Also uses a customized loss function
  4. Using GANs
    • The generator generates the future blood glucose values up to a defined prediction horizon
    • The discriminator discriminates between ground truth and generated blood glucose values
    • My opinion:
      • Even though the authors' results look promising, I cannot imagine that this approach beats other models (LSTM, Transformer, ...)
      • In the past, I experienced how hard it can be to train GANs
      • Furthermore, they require a lot of computational resources
  5. Comparison of different methods for blood glucose prediction
    • A comparison between many approaches for blood glucose forecasting
    • An LSTM Ensemble model achieved the best results

Reinforcement Learning

As the task states, deep reinforcement learning (DRL) can be used to solve this problem. Intuitively this does not make much sense, as DRL is usually used to maximize a future outcome rather than to predict future values. Many papers on stock trading describe using DRL techniques on past and current time series of stock prices. The major difference between trading strategies and general time series forecasting is that trading strategies aim to maximize the future portfolio value instead of just predicting stock prices. For that kind of task, using DRL therefore makes sense.

For blood glucose, on the other hand, DRL would only make sense if one could, for example, measure what happens after taking a treatment.

Furthermore, I could not find a single paper on blood glucose forecasting using DRL methods, and I could rarely find prior time series forecasting work using reinforcement learning in general. Therefore I will only use "traditional" methods to solve this task.

Prior Knowledge

  • DRL:
    • Unfortunately I have close to zero prior knowledge about DRL
    • I know some basic terms (MDP, reward, Bellman Equation, return, ...)
    • I tried out Deep Q Learning for simple tasks
    • I read the AlphaZero paper out of curiosity. I understood the basic ideas but never reimplemented it on my own
    • I also read the ReBeL paper in the past

Approach

Data preparation

First of all, I took a deeper look into the data set. My data analysis is viewable in the "data.ipynb" notebook.

Summary

  • The data contains only continuous features sampled at 5-minute intervals, and no(!) categorical data
  • Many values (from target and non-target columns) are missing
    • There are extremely sparse columns like "carbInput" (98% missing)
    • But I believe that they can still contribute to the forecasting (e.g. after carbInput -> blood glucose should increase)
    • Except for the target column (cbg), missing values are interpolated by cubic splines
      • (One can argue that it doesn't make sense to interpolate, e.g. carbInput, because how would you interpolate if someone just ate?)
    • Because models work better with values between 0 and 1, all values are scaled accordingly (divided by the maximum values of the train set)
  • Correlations:
    • I scatter plotted the relations between every column and the respective cbg values
    • I noticed that the finger value is not as accurate as I expected (I expected an almost straight line)
    • Outliers:
      • There are only a few clear outliers (one carbInput, some hr)
      • Unfortunately, after looking at the points in time where they appear, I could not determine why this is the case
      • One can argue that removing them would be reasonable, but I decided to include them
  • Added data:
    • Some models perform better when taking temporal information as input
    • Therefore I added positional temporal information using sin/cos embeddings
  • Removed data:
    • If a data pair (input, label) contains at least one missing cbg value, it is removed from the dataset
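The sin/cos time embedding and the removal of windows with missing cbg values can be sketched as follows (a minimal numpy version; function names, the 24-step input length, and the NaN encoding of missing values are my assumptions, not the exact notebook code):

```python
import numpy as np

def time_of_day_embedding(minutes_since_midnight: np.ndarray) -> np.ndarray:
    """Encode time of day as sin/cos so that 23:55 and 00:00 end up close together."""
    angle = 2 * np.pi * minutes_since_midnight / (24 * 60)
    return np.stack([np.sin(angle), np.cos(angle)], axis=-1)

def make_windows(cbg: np.ndarray, n_past: int = 24, horizon: int = 6):
    """Slide over the series and keep only (input, label) pairs without missing cbg."""
    inputs, labels = [], []
    for start in range(len(cbg) - n_past - horizon + 1):
        window = cbg[start : start + n_past + horizon]
        if np.isnan(window).any():  # drop pairs containing any missing cbg value
            continue
        inputs.append(window[:n_past])
        labels.append(window[n_past:])
    return np.array(inputs), np.array(labels)
```

Any window overlapping a missing cbg value is skipped entirely, which matches the removal rule above but shrinks the dataset around gaps.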

Metrics

I focused on the metrics other researchers use to evaluate their models (rMSE and MAE). The models take the past 24 time steps (2 hours) as input and use a prediction horizon of 6/12 time steps (30 min/60 min).
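For reference, the two metrics computed over a forecast window (a straightforward numpy sketch, not the notebook's exact code):

```python
import numpy as np

def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root mean squared error over all forecast steps."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute error over all forecast steps."""
    return float(np.mean(np.abs(y_true - y_pred)))
```

rMSE penalizes large errors more strongly than MAE, which is why rMSE is always at least as large as MAE on the same predictions.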

Models

For each model, there is a separate notebook where I explain my approaches ({model_name}_approach.ipynb).

Models:

  1. N-BEATS (plain)
    • 12 blocks
    • loss function as described in this paper
  2. N-BEATS (paper)
    • 12 blocks
    • loss function as described in this paper
  3. LSTM
    • teacher forcing enabled during training
    • unidirectional
    • single layer
  4. LSTM
    • teacher forcing enabled during training
    • bidirectional
    • two layers
  5. Ensemble (LSTM, N-BEATS)
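The ensemble combines the LSTM and N-BEATS forecasts. As a minimal sketch under the assumption of simple (optionally weighted) averaging of per-model predictions (the exact combination scheme is in the notebook):

```python
import numpy as np

def ensemble_forecast(predictions, weights=None) -> np.ndarray:
    """Combine per-model forecasts, each of shape (horizon,), by (weighted) averaging."""
    stacked = np.stack(predictions)  # shape: (n_models, horizon)
    if weights is None:
        return stacked.mean(axis=0)
    w = np.asarray(weights, dtype=float)
    return (w[:, None] * stacked).sum(axis=0) / w.sum()
```

Averaging independently trained models tends to cancel out uncorrelated errors, which is consistent with the ensemble results below being the strongest.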

Results

N-Beats (plain)

Participant ID   rMSE (30 min)   MAE (30 min)   rMSE (60 min)   MAE (60 min)
559              28.83           20.84          39.             29.27
563              24.96           18.64          33.40           25.22
570              24.20           18.52          34.15           26.53
575              26.90           19.57          35.62           26.76
588              26.03           19.03          34.52           25.39
591              25.76           19.76          34.26           26.80
540              34.74           26.04          43.78           32.84
544              24.36           18.58          34.62           27.24
552              25.24           18.59          32.35           24.79
567              31.78           23.35          41.11           31.40
584              30.27           22.77          39.93           30.68
596              24.59           18.34          34.46           26.16
mean             27.50           20.34          36.67           27.76

N-Beats advanced

Participant ID   rMSE (30 min)   MAE (30 min)   rMSE (60 min)   MAE (60 min)
559              43.75           32.76          54.67           41.03
563              32.04           24.45          38.03           29.83
570              38.59           31.39          54.13           45.11
575              37.10           29.01          44.85           36.32
588              34.23           25.48          40.60           30.91
591              34.44           27.93          40.16           32.94
540              45.19           34.26          51.76           39.70
544              37.29           31.17          43.83           36.83
552              34.60           28.25          40.61           33.35
567              41.34           32.99          46.82           38.01
584              40.29           31.77          48.77           39.64
596              35.45           27.35          40.97           32.42
mean             39.11           31.05          45.77           36.34

Plain LSTM

Participant ID   rMSE (30 min)   MAE (30 min)   rMSE (60 min)   MAE (60 min)
559              14.88           9.81           35.17           26.02
563              14.71           9.96           29.45           22.01
570              12.18           8.31           28.45           21.73
575              16.94           10.51          31.57           23.57
588              14.60           9.92           30.13           22.36
591              16.37           11.10          31.08           23.72
540              18.36           12.42          38.44           28.69
544              13.97           9.60           31.11           24.78
552              13.62           9.11           28.45           21.85
567              17.86           11.65          36.71           27.37
584              16.62           11.31          32.98           25.02
596              13.98           9.33           29.42           21.95
mean             15.44           10.25          32.07           24.01

Multistacked Bidirectional LSTM

Participant ID   rMSE (30 min)   MAE (30 min)   rMSE (60 min)   MAE (60 min)
559              25.98           19.40          46.13           34.09
563              23.52           18.60          35.48           27.68
570              24.72           20.73          49.01           39.66
575              24.89           18.60          37.92           29.56
588              23.35           17.93          36.92           29.35
591              22.99           17.77          45.64           34.30
540              26.27           19.62          36.28           28.08
544              21.29           15.45          35.97           28.31
552              20.17           14.93          40.57           31.21
567              25.13           18.13          40.57           31.21
584              25.31           18.75          42.21           31.90
596              19.98           14.92          35.98           27.42
mean             23.72           17.90          40.40           31.00

Ensemble Model (Plain LSTM and Plain N-BEATS)

Participant ID   rMSE (30 min)   MAE (30 min)   rMSE (60 min)   MAE (60 min)
559              14.23           9.22           26.52           17.27
563              13.91           9.00           20.64           13.92
570              12.27           8.27           21.87           15.29
575              16.19           9.88           25.53           17.53
588              13.79           9.15           21.08           14.45
591              15.49           10.22          23.98           16.81
540              16.13           10.83          28.51           19.34
544              13.16           8.90           22.89           16.43
552              12.16           8.40           21.09           14.50
567              15.66           10.30          27.21           18.05
584              15.52           10.33          24.83           16.71
596              12.84           8.45           20.46           13.85
mean             14.35           9.41           23.87           16.18

Discussion

I am skeptical that these results are accurate. I could not find a single paper reporting better results than the approach described here, and since I only spent a relatively short amount of time on this task, it is hard to believe that such simple models achieve the best results. I suspect the main reason is that I removed all samples containing missing cbg values. I went through the repository looking for mistakes but could not find any.

Pytorch Forecasting

I also created a notebook that uses the library to build a Temporal Fusion Transformer. I did not focus on the results of this approach because I only adapted the tutorial from the docs, which would not have demonstrated my own deep learning skills. Even so, the resulting model outperformed all previously mentioned models. If I had had more time for this task, I would have studied the methods of this library further to improve my results.

Future Work

Due to the limited time I had to solve this task, there is much more to try out. I only implemented simple methods and ideas. Here is a list of things I would do to improve the given results.

  1. Data Processing:
    • Use an ARIMA model to interpolate the missing feature values instead of splines
    • Since ARIMA is quite powerful, I believe it can provide more accurate values
    • Data Augmentation: I do not yet know a suitable method for this kind of data, but I could research it
  2. Hyperparameter Tuning:
    • I only "guessed" good parameters instead of using well-known techniques like grid search or Bayesian search
    • The library optuna offers such functionality
    • Alternatively, I could use sklearn
  3. Regularization:
    • As mentioned in the Data preparation section, there are a few outliers
    • To avoid overfitting them, one can use regularization methods (dropout, batch norm, layer norm, etc.)
  4. Temporal Fusion Transformer:
    • As mentioned, the transformer can outperform the other models
  5. N-fold cross-validation:
    • I did not include this because of limited computational resources
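As an illustration of the hyperparameter tuning idea, a minimal grid search can be written without any extra library (the parameter names `lr` and `hidden_size` and the `train_and_validate` callback are hypothetical placeholders for whatever the notebooks would tune):

```python
import itertools

def grid_search(train_and_validate, param_grid: dict):
    """Try every parameter combination and return the one with the lowest
    validation loss. train_and_validate(**params) must return a float loss."""
    best_params, best_loss = None, float("inf")
    keys = list(param_grid)
    for values in itertools.product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        loss = train_and_validate(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss
```

Libraries like optuna replace this exhaustive loop with smarter (e.g. Bayesian) sampling of the same search space.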