dapowan/LIMU-BERT-Public

Dataset

Closed this issue · 6 comments

Can you please send the link to download the datasets?
I tried to use the UCI dataset from https://archive.ics.uci.edu/ml/machine-learning-databases/00240/ and ran uci.py to preprocess the data, but it seems your data is different from what that link provides.

Thanks for the comment. Links to the four datasets are already provided in the instructions (e.g., check the Dataset section and click UCI). uci.py processes the raw IMU data in the UCI dataset, which is available only in its extended version (http://archive.ics.uci.edu/ml/datasets/Smartphone-Based+Recognition+of+Human+Activities+and+Postural+Transitions). Thanks again for your interest.

No, they are different. The labels come from 'labels.txt', which you can find at the end of the RawData folder.
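For reference, a minimal sketch of parsing that labels file. The column layout below (experiment ID, user ID, activity ID, start sample, end sample) is an assumption based on the extended UCI (HAPT) dataset's documentation; the two inline rows are made-up stand-ins for real lines from RawData/labels.txt:

```python
import numpy as np

# In a real checkout you would load the file directly, e.g.:
#   labels = np.loadtxt("RawData/labels.txt", dtype=int)
# Here we parse two hypothetical rows so the sketch is self-contained.
rows = [
    "1 1 5 250 1232",
    "1 1 7 1233 1392",
]
labels = np.array([[int(v) for v in line.split()] for line in rows])

# Assumed columns: experiment ID, user ID, activity ID, start, end (samples)
exp_id, user_id, activity, start, end = labels[0]
```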

Many thanks for your help. Can you please provide some info on how to use your code for transfer learning on another small dataset? I have my own recordings for 10 users, including acc and gyr data.

You can do it this way:

  1. Preprocess your dataset into the same format (including the file names and save folder) as LIMU-BERT; you can find the details in the instructions. E.g., save "data_20_120.npy" and "label_20_120.npy" in a folder named "rahi".
  2. Add the configs of your dataset to the file dataset/data_config.json. E.g., the dataset name should be "rahi_20_120" in dataset/data_config.json.
  3. Add the dataset and config names at lines 393 and 394 in util.py. E.g., add the choice "rahi" at line 393.
  4. Run the scripts with your dataset. For example: "pretrain.py v1 rahi 20_120 -s limu_v1".

Another way is to extract the core code and replace the data and label arrays with your own dataset, e.g., at line 27 in pretrain.py.

Hope the two methods can help :-).
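Step 1 above could be sketched roughly as follows. This is only an illustration, not the repo's actual preprocessing: the array layouts (windows of 120 samples at 20 Hz, 6 channels of acc + gyr, one label per timestep) are assumptions taken from this thread, and the random arrays stand in for real recordings:

```python
import os
import numpy as np

# Assumed format: N windows of 120 samples at 20 Hz, 6 channels (acc xyz + gyr xyz).
num_windows, window_len = 100, 120

rng = np.random.default_rng(0)
acc = rng.standard_normal((num_windows, window_len, 3))  # stand-in for real acc
gyr = rng.standard_normal((num_windows, window_len, 3))  # stand-in for real gyr
activity = rng.integers(0, 5, size=num_windows)          # one activity per window

data = np.concatenate([acc, gyr], axis=2).astype(np.float32)  # (N, 120, 6)
# Assumed label layout: the window's label repeated per timestep, shape (N, 120, 1).
label = np.repeat(activity[:, None], window_len, axis=1)[..., None].astype(np.float32)

# Save under the dataset folder name used in the thread.
os.makedirs("rahi", exist_ok=True)
np.save(os.path.join("rahi", "data_20_120.npy"), data)
np.save(os.path.join("rahi", "label_20_120.npy"), label)
```

Check the repo instructions for the authoritative shapes before relying on this layout.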


Thanks for your help. But this way, LIMU-BERT will be trained on my small dataset from scratch, right? Is there a way to use transfer learning, e.g., starting from UCI? For example, after running pretrain.py on UCI, load the limu_v1.pt checkpoint and continue pretraining on my dataset.

Of course, you can. A simple way is to create a save folder for your dataset, saved/pretrain_base_rahi_20_120, and copy the pre-trained model uci.pt into it. Then run embedding.py with "embedding.py v1 rahi 20_120 -f uci", which generates representations for your dataset with uci.pt. Finally, run classifier.py as usual.
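The folder setup described above can be sketched like this. The paths are assumptions based on this thread (saved/pretrain_base_uci_20_120/uci.pt for the pre-trained checkpoint); the sketch creates a placeholder file under a temp directory so it runs anywhere, but in a real checkout you would work from the repo root with the real uci.pt:

```python
import os
import shutil
import tempfile

# Stand-in for the repo root so the sketch is self-contained.
root = tempfile.mkdtemp()

src_dir = os.path.join(root, "saved", "pretrain_base_uci_20_120")
dst_dir = os.path.join(root, "saved", "pretrain_base_rahi_20_120")

# Placeholder for the real pre-trained checkpoint.
os.makedirs(src_dir, exist_ok=True)
open(os.path.join(src_dir, "uci.pt"), "wb").close()

# Create the save folder for the new dataset and copy the checkpoint into it.
os.makedirs(dst_dir, exist_ok=True)
shutil.copy(os.path.join(src_dir, "uci.pt"), os.path.join(dst_dir, "uci.pt"))

# Then, from the repo root:
#   python embedding.py v1 rahi 20_120 -f uci   # encode your data with uci.pt
#   python classifier.py ...                    # downstream training as usual
```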