This notebook provides all code needed to process the data from the Kaggle EEG-grasp-and-lift competition, build a model, and train it. There are additional functions for visualising the data too. Play around and see if you can beat my best scores on the validation set (scroll down to see them).
Launch a FloydHub Workspace to start training this model with 1 click. Workspace is a GPU-enabled cloud IDE for machine learning. It provides a fully configured environment so you can start hacking right away, without worrying about dependencies, data sets, etc.
Once you've started the workspace, run the 'start_here' notebook or type 'floyd run' into the workspace terminal. This will begin to train the model on the sample dataset.
import pandas as pd
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader
import torch.nn as nn
from torch.autograd import Variable
import torch.nn.functional as F
from torch.nn.utils import weight_norm
import random
import matplotlib.pyplot as plt
from sklearn.metrics import roc_auc_score as auc
Let's load some CSVs to see how they're formatted. There are two CSVs for each series: one with the data and one with the events. A quick look at the CSVs shows there are 32 readings for each time frame, while the events CSV has 6 label columns that are 0 or 1 depending on whether the action is occurring.
data = pd.read_csv('./train/subj1_series1_data.csv')
ev = pd.read_csv('./train/subj1_series1_events.csv')
data.head()
| | id | Fp1 | Fp2 | F7 | F3 | Fz | F4 | F8 | FC5 | FC1 | ... | P7 | P3 | Pz | P4 | P8 | PO9 | O1 | Oz | O2 | PO10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | subj1_series1_0 | -31 | 363 | 211 | 121 | 211 | 15 | 717 | 279 | 35 | ... | 536 | 348 | 383 | 105 | 607 | 289 | 459 | 173 | 120 | 704 |
| 1 | subj1_series1_1 | -29 | 342 | 216 | 123 | 222 | 200 | 595 | 329 | 43 | ... | 529 | 327 | 369 | 78 | 613 | 248 | 409 | 141 | 83 | 737 |
| 2 | subj1_series1_2 | -172 | 278 | 105 | 93 | 222 | 511 | 471 | 280 | 12 | ... | 511 | 319 | 355 | 66 | 606 | 320 | 440 | 141 | 62 | 677 |
| 3 | subj1_series1_3 | -272 | 263 | -52 | 99 | 208 | 511 | 428 | 261 | 27 | ... | 521 | 336 | 356 | 71 | 568 | 339 | 437 | 139 | 58 | 592 |
| 4 | subj1_series1_4 | -265 | 213 | -67 | 99 | 155 | 380 | 476 | 353 | 32 | ... | 550 | 324 | 346 | 76 | 547 | 343 | 446 | 171 | 67 | 581 |

5 rows × 33 columns
ev.head()
| | id | HandStart | FirstDigitTouch | BothStartLoadPhase | LiftOff | Replace | BothReleased |
|---|---|---|---|---|---|---|---|
| 0 | subj1_series1_0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | subj1_series1_1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | subj1_series1_2 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | subj1_series1_3 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | subj1_series1_4 | 0 | 0 | 0 | 0 | 0 | 0 |
The class below reads each CSV in the train directory and stacks them into two numpy matrices: one with the readings from the EEG (self.data) and one with the corresponding class labels (self.target).
class eegdata(Dataset):
    def __init__(self, path, validation=False, subjects=range(1, 13)):
        super().__init__()
        data, self.target = self.readfiles(path, validation, subjects)
        # normalise each channel by its mean and standard deviation
        self.mean = data.mean(axis=0)
        self.std = data.std(axis=0)
        self.data = self.norm(data)

    def norm(self, x):
        return (x - self.mean) / self.std

    def to_np(self, values):
        # get total count of data rows
        count = 0
        for i in range(len(values)):
            count += len(values[i])
        # create an np array the size of all the data
        ret = np.zeros((count, len(values[0][0])))
        # copy each subject's data into the np array
        ix = 0
        for i in range(len(values)):
            ret[ix:ix + len(values[i]), :] = values[i]
            ix += len(values[i])
        return ret

    def readfiles(self, path, validation, subjects):
        allx = []
        ally = []
        # hold out series 3 for validation, train on the rest
        series = [1, 2, 4, 5, 6, 7, 8] if validation == False else [3]
        for i in subjects:
            print('log: reading subject {}...'.format(i))
            xs = None
            ys = None
            for j in series:
                data = 'subj{}_series{}_data.csv'.format(i, j)
                events = 'subj{}_series{}_events.csv'.format(i, j)
                x = pd.read_csv(path + data).values[:, 1:]
                xs = x if xs is None else np.vstack((xs, x))
                y = pd.read_csv(path + events).values[:, 1:]
                ys = y if ys is None else np.vstack((ys, y))
            allx.append(xs)
            ally.append(ys)
        xs = self.to_np(allx)
        ys = self.to_np(ally)
        return xs, ys

    def __getitem__(self, index):
        return self.data[index], self.target[index]

    def __len__(self):
        return len(self.data)
# load all subjects' data into one big dataset using the class created above
traindata = eegdata('./train/')
valdata = eegdata('./train/', validation=True)

# also create individual datasets for each subject
def get_individuals():
    subjects = []
    vals = []
    for i in range(1, 13):
        s = [i]
        subject = eegdata('./train/', subjects=s)
        v = eegdata('./train/', validation=True, subjects=s)
        subjects.append(subject)
        vals.append(v)
    return subjects, vals

subjects, vals = get_individuals()
The function below lets you pick an event and a subject and see how the 32 voltage channel readings change over a time window from 200ms before the event until 300ms after.
To begin looking at the data I picked the first event 'HandStart'. This refers to when a light flashes, telling the subject to begin the grasp and lift. Looking at the first subject, this figure shows how the 32 values changed during the time window.
Already we can see a pattern, and if we average across all 32 channels it becomes much clearer.
It seems HandStart is accompanied by a general lowering of the overall voltage readings followed by a spike. The model should be able to recognise these patterns when making predictions. Importantly, the data has a similar shape when we look at another subject:
With such clear patterns, you might wonder who even needs a machine learning algorithm? Yet unfortunately things become less obvious as we look at the rest of the subjects. Scroll down to see averages from all subjects.
# function to see how readings change on average over the course of an event
def get_spike(subject, event, average=True):
    spike = np.zeros([19, 32])
    s = subject
    # frames where the event label is on (each event lasts 150 frames)
    indexes = np.where(subjects[s].target[:, event] == 1)[0]
    i = 0
    div = 0
    while i + 149 < len(indexes):
        # take 100 frames before the event and 50 after, then keep every 16th reading
        ix = subjects[s].data[indexes[i] - 100:indexes[i + 149] + 50]
        spike += ix[::16]
        div += 1
        i += 150
    spike /= div
    if average == True:
        spike = spike.mean(axis=1)
    return spike
# plot figures showing how readings change across all 12 subjects for the 'HandStart' action
fig = plt.figure(figsize=[10, 10])
for s in range(12):
    ax = fig.add_subplot(3, 4, s + 1)
    spike = get_spike(s, 0)
    ax.plot(range(len(spike)), spike)
    ax.set_xticklabels([])
    ax.set_yticklabels([])
Though the downward and upward spike is seen relatively consistently across the board, its shape varies considerably, and apart from an overall increase in activity, a discernible pattern is hard to make out by eye.
# some parameters for the model
num_features = 32
window_size = 1024
batch_size = 2000
# Batch creator. When training, it picks a random start point and takes batch_size samples
# spaced 16 frames apart. The data is a time series, so with each index we also feed in the
# previous window_size readings. Rather than feed in all window_size previous readings, we
# subsample and take every 4th set of readings.
def get_batch(dataset, batch_size=2000, val=False, index=None):
    if val == False:
        index = random.randint(window_size, len(dataset) - 16 * batch_size)
        indexes = np.arange(index, index + 16 * batch_size, 16)
    else:
        indexes = np.arange(index, index + batch_size)
    batch = np.zeros((batch_size, num_features, window_size // 4))
    b = 0
    for i in indexes:
        start = i - window_size if i - window_size > 0 else 0
        tmp = dataset.data[start:i]
        batch[b, :, :] = tmp[::4].transpose()
        b += 1
    targets = dataset.target[indexes]
    return torch.DoubleTensor(batch), torch.DoubleTensor(targets)
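As a quick sanity check (a sketch assuming the datasets above have loaded), each batch should come out with shape (batch_size, num_features, window_size // 4), since we keep every 4th of the window_size previous readings:

# each batch holds batch_size windows of 32 channels, subsampled to 256 time steps
x, y = get_batch(traindata, batch_size=2000)
print(x.shape)  # torch.Size([2000, 32, 256])
print(y.shape)  # torch.Size([2000, 6])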
The key to understanding which action is taking place lies in seeing the current reading in its proper context. So for every reading we send into our machine learning model, we will also send in readings from across a previous time window. This way the model can 'see' the shape of the wave leading up to the current event.
Convolutional neural networks (CNNs) work well with time series data, so this will be our model of choice. Though reading minds may seem like an incredibly complex art, high-level deep learning libraries like PyTorch mean we can actually code a model in 13 lines:
# convolutional network model we will train to detect patterns in readings. For more information see my tutorial here.
class convmodel(nn.Module):
    def __init__(self, out_classes, drop=0.5, d_linear=124):
        super().__init__()
        # 1d convolution over the time axis: 32 input channels -> 64 output channels
        self.conv2 = nn.Conv1d(32, 64, kernel_size=3, padding=0, stride=1)
        self.bn = nn.BatchNorm1d(64)
        self.pool = nn.MaxPool1d(2, stride=2)
        # 64 channels * 127 time steps = 8128 inputs to the first linear layer
        self.linear1 = nn.Linear(8128, d_linear)
        self.linear3 = nn.Linear(d_linear, out_classes)
        self.dropout1 = nn.Dropout(drop)
        self.dropout2 = nn.Dropout(drop)
        self.dropout3 = nn.Dropout(drop)
        self.conv = nn.Sequential(self.conv2, nn.ReLU(inplace=True), self.bn,
                                  self.pool, self.dropout1)
        self.dense = nn.Sequential(self.linear1, nn.ReLU(inplace=True), self.dropout2,
                                   self.dropout3, self.linear3)

    def forward(self, x):
        bs = x.size(0)
        x = self.conv(x)
        x = x.view(bs, -1)  # flatten before the dense layers
        output = self.dense(x)
        return torch.sigmoid(output)
The above gives the code for a simple (but effective…) CNN perfectly capable of tackling the job. It consists of a 1d convolutional layer and two linear layers. While a full description of how these processes work is beyond the scope of this post, let's at least get a quick overview.
A convolutional layer simply consists of a kernel that slides along our data and applies its weights to the data values. This animation I made explains better than words alone:
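If the animation doesn't render, here is a minimal numpy sketch of the same idea, with made-up kernel weights: the kernel slides along the input one step at a time, taking a weighted sum at each position.

# a kernel of 3 weights sliding along a toy 1d signal
signal = np.array([1., 3., 2., 5., 4., 1.])
kernel = np.array([0.5, 1.0, 0.5])
output = np.array([np.dot(signal[i:i + 3], kernel)
                   for i in range(len(signal) - 3 + 1)])
print(output)  # [4.5 6.  8.  7. ]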
In reality deep learning networks will have multiple kernels, producing multiple output arrays. In the network coded above, we use 64 kernels each with their own weights, and hence produce 64 different output vectors. Each of these kernels can react to a different kind of pattern in the input.
There will also be multiple input channels (32 in fact), each one containing the previous readings for each EEG position.
The outputs from the convolutional layer are then all lined up in a single vector and put through a neural network with two layers. Here is an animation of a simple two layer neural network with three inputs, a first layer with two nodes, and a final layer with one:
In our neural network the process is the same, only much bigger (8128 input nodes, 124 nodes in the first layer, and 6 nodes in the final layer). There are 6 final nodes because there are 6 possible actions, and we determine an action is happening if its corresponding node scores above a certain threshold.
Two linear layers in a neural network is a pretty standard setup: it allows for XOR calculations (see this post) and means we don't reduce the input immediately from 8128 nodes to 6 nodes. Deciding the exact number of kernels and linear nodes is more of an art than a science, and requires some fiddling around.
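To see where the 8128 figure comes from, we can trace a dummy window through the layers (a quick sketch using the model class defined above): the convolution shortens each 256-step sequence to 254, pooling halves it to 127, and flattening 64 channels gives 64 × 127 = 8128.

# trace shapes through the network with one dummy window
x = torch.randn(1, num_features, window_size // 4).double()  # (1, 32, 256)
m = convmodel(6).double().eval()
print(m.conv(x).shape)  # torch.Size([1, 64, 127]) -> flattens to 8128
print(m(x).shape)       # torch.Size([1, 6]), one sigmoid score per action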
The weights in the convolutional kernel and the linear layers are initially random, but a process called back-propagation corrects them as we train the network on the data. For an in-depth tutorial on this process, I'd recommend Andrew Ng's free machine learning Coursera course and my tutorials.
When trained, the interactions between all these weights allow the network to detect patterns in data and make accurate predictions.
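As a toy illustration of one back-propagation step (not the actual training loop used below), here is a single weight being corrected with PyTorch's autograd:

# predict with a single weight, measure the error, and nudge the weight downhill
w = torch.tensor([0.0], requires_grad=True)
x_in, target = torch.tensor([2.0]), torch.tensor([1.0])
loss = (w * x_in - target) ** 2   # squared error = 1.0 when w = 0
loss.backward()                   # autograd gives d(loss)/dw = -4.0
with torch.no_grad():
    w -= 0.1 * w.grad             # w moves from 0.0 to 0.4, reducing the loss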
model = convmodel(6).double().cuda()  # move the model to the GPU to match the batches below
optim = torch.optim.Adadelta(model.parameters(), lr=1, eps=1e-10)
bs = batch_size

def train(traindata, epochs, printevery=1, shuffle=True):
    model.train()
    for epoch in range(epochs):
        total_loss = 0
        for i in range(len(traindata) // bs):
            optim.zero_grad()
            x, y = get_batch(traindata)
            x = Variable(x).cuda()
            y = Variable(y).cuda()
            preds = model(x)
            loss = F.binary_cross_entropy(preds.view(-1), y.view(-1))
            loss.backward()
            total_loss += loss.item()
            optim.step()
            if (i + 1) % printevery == 0:
                print("epoch: %d, iter %d/%d, loss %.4f" % (epoch + 1, i + 1,
                      len(traindata) // bs, total_loss / printevery), end='\r')
                total_loss = 0
# train model for one epoch
train(traindata, 1)
# save general model
!mkdir models
torch.save(model.state_dict(), "./models/Convnet_skip4_epoch1")
The functions below calculate the AUC score on the validation set.
def getPredictions(data):
    model.eval()
    p = []
    res = []
    i = window_size
    bs = 2000
    while i < len(data):
        # shrink the final batch so we don't run past the end of the data
        if i + bs > len(data):
            bs = len(data) - i
        x, y = get_batch(data, bs, index=i, val=True)
        x = Variable(x).cuda()
        preds = model(x)
        p.append(preds.data.cpu().numpy())
        res.append(y.numpy())
        i += bs
    # stack the batch predictions and targets into single arrays
    preds = np.vstack(p)
    targs = np.vstack(res)
    return preds, targs

def valscore(data):
    preds, targs = getPredictions(data)
    # average the AUC across all six event classes
    aucs = [auc(targs[:, j], preds[:, j]) for j in range(6)]
    total_loss = np.mean(aucs)
    print(total_loss)
    return preds, targs
# see how we scored on validation set
valscore(valdata)
So far we have trained a general model on all the data. To make predictions for each individual, we will now fine-tune the model on only that individual's data for one epoch. The function below loads the weights from the general model, fine-tunes on an individual, checks how it scores on the validation set, and then saves the weights for that individual. We will use these weights later to assess our results.
def train_individuals(subjects, vals):
    for i in range(12):
        # start from the general model's weights each time
        model.load_state_dict(torch.load("./models/Convnet_skip4_epoch1"))
        train(subjects[i], 1)
        print('')
        valscore(vals[i])
        torch.save(model.state_dict(), "./models/subj_{}_epoch2".format(i + 1))

# train the general model on each subject, then get the AUC score on that subject's validation set
train_individuals(subjects, vals)
Best results on the validation set. Trained on all data for 1 epoch, then on each individual subject for 1 epoch, and validated on series 3.
Subject 1 AUC score on val set: 0.959168670213603
Subject 2 AUC score on val set: 0.9505101195220913
Subject 3 AUC score on val set: 0.9219539186384723
Subject 4 AUC score on val set: 0.917931166630155
Subject 5 AUC score on val set: 0.9352121018446913
Subject 6 AUC score on val set: 0.9419648786256208
Subject 7 AUC score on val set: 0.9567390746916029
Subject 8 AUC score on val set: 0.8484060070332817
Subject 9 AUC score on val set: 0.9373451906678737
Subject 10 AUC score on val set: 0.9494592292374859
Subject 11 AUC score on val set: 0.9120022844956828
Subject 12 AUC score on val set: 0.9483051008950621
Training this model on all the data for a single epoch (taking less than an hour) was enough for the network to discern patterns in the EEG readings and reach a validation score of around 94% AUC.
However, as I was to learn, 94% AUC is not the same thing as 94% accuracy. AUC is used in academia to objectively compare results, as it gives a score independent of class imbalances and thresholds. In its simplest terms, it gives the probability that the classifier will rank a randomly chosen positive example higher than a randomly chosen negative example (read more here).
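We can sanity-check that interpretation on toy labels and scores: counting how often a positive example outranks a negative one (ties counting as half) gives the same number as sklearn's roc_auc_score, imported above as auc.

# AUC as a ranking probability: the fraction of (positive, negative) pairs
# where the positive example gets the higher score
y_true = np.array([0, 0, 1, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])
pos, neg = y_score[y_true == 1], y_score[y_true == 0]
pairs = [(p > n) + 0.5 * (p == n) for p in pos for n in neg]
print(np.mean(pairs))        # 0.888...
print(auc(y_true, y_score))  # 0.888..., identical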
Yet how does this translate into actual results? The best way to get a true gauge of how the model fared is to dig into some visuals.
Let's look at some data previously unseen by our model. In this series the subject performs the grasp and lift 28 times and the below figure shows when the subject carried out the actions (True) and when the model predicted the actions were occurring (Prediction).
# let's examine the results
model.load_state_dict(torch.load("./models/subj_1_epoch2"))
preds, targs = valscore(vals[0])

# f is the feature (event) and th is the threshold at which we consider a value a positive
# classification. In this case we are seeing how the model did at 'FirstDigitTouch'. The figure
# shows that the model predicted the action 23 of the 28 times it occurred, and once predicted
# an event when none was occurring.
def show_whole_series(f=1, th=0.3):
    ps = np.where(preds[:, f] > th)
    true = np.where(targs[:, f] == 1)
    fig = plt.figure(figsize=[15, 1])
    ax1 = fig.add_subplot(1, 1, 1)
    # divide the frame index by 500 (the sampling rate in Hz) to get seconds
    ax1.scatter(ps[0] / 500, ['Prediction'] * len(ps[0]))
    ax1.scatter(true[0] / 500, ['True'] * len(true[0]))

show_whole_series()
The model correctly predicts 23 of the 28 lifts, and once mistakenly predicts a lift when the subject isn't performing one.
If we closely examine a single instance of the subject performing the grasp and lift, we can see how accurately the model predicted each action as it occurred. In the figure below we look at the third grasp and lift in the series; blue represents the prediction and orange when the action actually occurred.
# printing some figures showing how successfully the model predicted each event during a full
# grasp and lift sequence. This example looks at the third time subject 1 performed the task.
def get_indexes(data, event):
    # find the start and end positions of the nth predicted event by looking for
    # gaps of more than 300 frames between consecutive positive predictions
    start = 0
    count = 1
    for i in range(len(data) - 1):
        if data[i + 1] > data[i] + 300:
            if event == count:
                return start, i
            start = i + 1
            count += 1

def show_results(fs=[0], trues=1, ps=[0], ths=[0.3], left=0, right=0):
    for i in range(len(fs)):
        f = fs[i]
        th = ths[i]
        p = np.where(preds[:, f] > th)
        true = np.where(targs[:, f] == 1)
        a_start, a_end = get_indexes(p[0], ps[i])
        t_start = trues * 150 + 150
        fig2 = plt.figure(figsize=[15, 1])
        ax1 = fig2.add_subplot(1, 1, 1)
        ax1.scatter(p[0][a_start:a_end] / 500, ['Prediction'] * (a_end - a_start))
        ax1.scatter(true[0][t_start:t_start + 150] / 500, ['True'] * 150)
        ax1.set_xlim(left=left, right=right)
        if left != 0 and right != 0:
            ax1.set_title(ev.columns[f + 1])

show_results(fs=[0, 1, 2, 3, 4, 5], trues=1, ps=[1, 1, 1, 1, 2, 2],
             ths=[0.1, 0.3, 0.3, 0.3, 0.3, 0.3], left=30.0, right=34.55)
I think these results are particularly impressive. Although the predictions don't line up exactly with the data labels, this is not important in a functional sense.
The orange bars reflect a time window of ±150ms around the action, and although the model doesn't predict across the entire window, it does output a correct prediction when each event actually occurs. The model could hence be used to successfully predict when an action is happening.
Notably we see the model is better at predicting the time window before the action than after it. The brain activity for a motion occurs before the movement itself, as signals start in the brain and must make their way down to the hand, so perhaps this is to be expected.
The only exception to this pattern is seen in HandStart. This event refers to when a light flashes and the subject reacts to it. In this case the nervous activity would actually occur after the light flashes and so it makes sense that our model predicts the event after it happens.