In this project, we performed Human Activity Recognition on numerical time-series data generated by a Wireless Sensor Network (WSN). The dataset covers 7 different activities.
We applied several techniques that are known to work well for time-series classification. First, we extracted statistics that are known to be informative for Human Activity Recognition. We then assessed the importance of these features using Recursive Feature Elimination (RFE) and L1 regularization, the latter applied while training our Machine Learning models. We observed that L1 regularization makes the experiments significantly faster than RFE while still performing satisfactorily as a feature-elimination method.
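As a rough illustration of the feature-extraction step, the sketch below reduces one time series to a handful of summary statistics. The particular statistics (minimum, maximum, mean, median, standard deviation, and quartiles) and the function name `summary_features` are assumptions made for this example; they are not necessarily the exact set used in the project.

```python
import statistics


def summary_features(series):
    """Reduce one 1-D time series (a list of floats) to summary statistics.

    The chosen statistics are an assumption for illustration only.
    """
    # Quartiles computed with the inclusive method (data treated as a full population)
    q1, median, q3 = statistics.quantiles(series, n=4, method="inclusive")
    return {
        "min": min(series),
        "max": max(series),
        "mean": statistics.fmean(series),
        "median": median,
        "std": statistics.pstdev(series),  # population standard deviation
        "q1": q1,
        "q3": q3,
    }


# Hypothetical sensor readings, not taken from the project's dataset
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
features = summary_features(signal)
```

Each recording then becomes one fixed-length feature vector, which is the input that a feature-selection method such as RFE or an L1-regularized model would rank or prune.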
Additionally, since we had data recordings from 88 persons, we split each time series into segments in order to artificially create more instances. The optimal number of splits was chosen using k-fold cross-validation (CV). Somewhat surprisingly, we found that for some models, creating artificial instances in this way significantly improved performance.
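The splitting idea can be sketched as follows. This is a minimal example under stated assumptions: segments are contiguous and of equal length, and trailing samples that do not fill a segment are dropped. The helper names `split_series` and `best_n_splits`, and the `cv_score` callable, are hypothetical and stand in for whatever the project actually used.

```python
def split_series(series, n_splits):
    """Split a series into n_splits contiguous, equal-length segments.

    Trailing samples that do not fill a whole segment are dropped
    (an assumption made for this sketch).
    """
    if n_splits < 1 or n_splits > len(series):
        raise ValueError("n_splits must be between 1 and len(series)")
    seg_len = len(series) // n_splits
    return [series[i * seg_len:(i + 1) * seg_len] for i in range(n_splits)]


def best_n_splits(candidates, cv_score):
    """Return the candidate split count with the highest CV score.

    cv_score is a hypothetical callable: it takes a split count and
    returns the mean k-fold CV accuracy obtained with that many splits.
    """
    return max(candidates, key=cv_score)


# Hypothetical recording of 10 samples, split into 3 segments
segments = split_series(list(range(10)), 3)
```

With `n_splits = 1` the series is returned unchanged as a single segment, which corresponds to using the original recordings without artificial instances.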
The following table reports the Human Activity Recognition results along with the optimal number of time-series splits for each model.
Table: Final Results
| Model | Best Number of Time-Series Splits | CV Accuracy | Test Accuracy |
|---|---|---|---|
| Logistic Regression with L1 Regularization | 2 | 84.75 % | 84.21 % |
| Naive Bayes with Gaussian Likelihood | 2 | 87.03 % | 89.47 % |
| Multinomial Naive Bayes | 1 | 85.55 % | 89.47 % |