This repo contains the PyTorch implementation of the paper: Contrastive Predictive Coding for Human Activity Recognition, accepted at IMWUT 2021.
Some .py
files have been modified to support a new experimental scenario for the paper: Impact of Pre-training Datasets on Human Activity Recognition with Contrastive Predictive Coding.
Contrastive Predictive Coding (CPC) is a self-supervised learning (SSL) technique that has demonstrated promising results in several tasks, including human activity recognition (HAR). In this work, we explore the impact of data variety on backbone pre-training when designing CPC models for HAR and the benefits of pre-training on the final model.
The Activity Recognition Chain (ARC) comprises five distinct steps: data collection, pre-processing, windowing, feature extraction, and classification.
In this work, we focus on the fourth step—feature extraction—and accomplish it via self-supervision.
We first split the dataset by participants to form the train-val-test
splits. Of the available participants, 20% are randomly chosen for testing, while the remaining 80% are further divided randomly into the train and validation splits at an 80:20 ratio. The training data is normalized to have zero mean and unit variance, which is subsequently applied to the validation and test splits as well.
The sliding window process is applied to segment one second of data with 50% overlap. On these unlabeled windows, we first pre-train the CPC model, and transfer the learned encoder weights to a classifier for activity recognition (or fine-tuning). During classification, the learned encoder weights are frozen, and only the MLP classifier is updated using cross-entropy loss with the ground truth labels.
Additionally, we evaluated the impact of data variety on model pre-training using 15 combinations of four distinct HAR datasets, finding significant performance variability based on the pre-training datasets.
In this repo, pre-training can be performed using the following command:
python main.py --dataset UCI_Raw
The trained model weights are saved to the models/<DATE>
folder, where <DATE>
corresponds to the date the training was completed. By specifying the dataset as UCI_Raw
, the processed data is utilized based on the location specified in arguments.py
.
The pre-training logs are saved in the saved_logs/<DATE>
folder and contain the losses as well as the accuracy of the CPC pre-training.
Once the model training is complete, the saved weights can be used for classification by running:
python evaluate_with_classifier.py --dataset UCI_Raw --saved_model <FULL_PATH_TO_MODEL.pkl>
The classification logs are saved in the saved_logs/<DATE>
folder and contain the losses, accuracy, and F1-scores for the classification.
Another option is to use:
python run_experiments.py --saved_model <FULL_PATH_TO_MODEL.pkl>
This command runs python evaluate_with_classifier.py
for all four target datasets (KuHar, MotionSense, UCI, and RealWorld_waist) with different percentages of target dataset data, outputs the results to a .csv
file, and plots the curve.
File name | Function |
---|---|
arguments.py |
Contains the arguments used for pre-training and classification. |
dataset.py |
Dataset and data loaders to load the pre-processed data and apply the sliding window procedure. |
evaluate_with_classifier.py |
To train the classifier on the learned encoder weights. |
main.py |
Over-arching file for the pre-training. |
meter.py |
For logging metrics and losses. |
model.py |
Defines the model and the CPC loss. |
run_experiments.py |
Runs the classification experiments on multiple datasets with different percentages of target data and logs the results. |
sliding_window.py |
Efficiently performs the sliding window process on numpy (adapted from this blog). |
trainer.py |
Contains methods for training the CPC. |
utils.py |
Utility functions. |
This software is released under the GNU General Public License v3.0.
This project is based on CPC implementations from the following repositories:
- Contrastive Predictive Coding for Human Activity Recognition
- Contrastive-Predictive-Coding-PyTorch
- emotion_detection_cpc
These repositories were instrumental in the development of this project.