dapowan/LIMU-BERT-Public

Questions of Embeddings from LIMU-BERT Transformer

Closed this issue · 3 comments

Dear @dapowan,

I am trying to understand how the general representations from a pretrained LIMU-BERT transformer are generated, and I have a few clarification questions. It would be great if you could answer them.


In the script embedding.py, by running the default commands provided in the README.md file, I see that in

mask_seqs, seqs = batch

mask_seqs and seqs have shapes (128, 120, 6), i.e. (batch_size, seq_len, feature_num), and (128, 120, 2), respectively.
Q1: What exactly is seqs here? It's not clear to me what the last dimension (, , 2) represents.


Q2: Is the output in

output = trainer.run(func_forward, None, data_loader, args.pretrain_model)
exactly the general representation from a pretrained LIMU-BERT transformer mentioned in the paper, whose dimension is ? Specifically
return h
?

Thanks in advance!

Hi Bryanbo,
Q1: Sorry for the misleading variable names; the line should be seqs, label = batch. seqs holds the IMU sequences, whereas label holds the labels for each sequence. I have updated the code.
Q2: Yes, it has shape (batch_size, seq_len, h_dim), e.g., (128, 120, 72).
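For readers following along, here is a minimal shape-only sketch of the answer above, assuming NumPy. It does not load LIMU-BERT: `stand_in_encoder` is a hypothetical placeholder (a random per-time-step linear map) used only to show how a (128, 120, 6) batch relates to a (128, 120, 72) representation, and the random data is made up for illustration.

```python
import numpy as np

# Shapes quoted in this thread: batch of 128 windows, 120 time steps,
# 6 IMU features, and a 72-dimensional representation per time step.
batch_size, seq_len, feature_num, h_dim = 128, 120, 6, 72

rng = np.random.default_rng(0)

# Made-up data standing in for one batch from the data loader.
seqs = rng.standard_normal((batch_size, seq_len, feature_num)).astype(np.float32)
label = np.zeros((batch_size, seq_len, 2), dtype=np.int64)
batch = (seqs, label)


def stand_in_encoder(x, out_dim=h_dim):
    """Hypothetical placeholder for the pretrained model.

    It applies one random linear map to every time step, so only the
    output shape is meaningful, not the values.
    """
    w = rng.standard_normal((x.shape[-1], out_dim)).astype(np.float32)
    return x @ w


# The corrected unpacking from the reply: sequences first, labels second.
seqs, label = batch
h = stand_in_encoder(seqs)

print(seqs.shape, label.shape, h.shape)
```

The point is only that the time axis is preserved: each of the 120 steps gets its own 72-dimensional vector.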

Hi @dapowan,
No worries! Thanks for the reply; it makes sense now. So we feed the normalized sequences, without masking, into the LIMU-BERT transformer to generate the 72-dimensional general representations.

I noticed that the last dimension of label is 2, but it's not clear to me why. It would be great if you could clarify that. During testing, classifier.py and classifier_bert.py use a one-dimensional vector label_test for the action labels, which makes sense. I am trying to understand the 2 in the last dimension of label here.

It depends on the dataset you are using. You can check the details of the four datasets in data_config.json. If you select the UCI dataset, the labels have two dimensions, which correspond to activity and user. For other datasets, such as Shoaib, the last dimension of the labels is 3 (activity, position, user).
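As a rough illustration of the label layout described above, the sketch below splits the last dimension of a label array into named columns. It assumes NumPy and assumes the columns are stored in the order listed in this reply (activity, user for UCI; activity, position, user for Shoaib); `split_labels` is a hypothetical helper, not a function from the repository, and the label values are made up.

```python
import numpy as np


def split_labels(label, names):
    """Split the last axis of `label` into one array per label name.

    `names` is the assumed column order, e.g. ["activity", "user"].
    """
    if label.shape[-1] != len(names):
        raise ValueError(
            f"expected {len(names)} label columns, got {label.shape[-1]}"
        )
    return {name: label[..., i] for i, name in enumerate(names)}


# Made-up UCI-style labels: (batch_size, seq_len, 2).
uci_label = np.zeros((128, 120, 2), dtype=np.int64)
uci_label[..., 0] = 3  # assumed activity column
uci_label[..., 1] = 7  # assumed user column
uci_parts = split_labels(uci_label, ["activity", "user"])

# Made-up Shoaib-style labels: (batch_size, seq_len, 3).
shoaib_label = np.zeros((128, 120, 3), dtype=np.int64)
shoaib_parts = split_labels(shoaib_label, ["activity", "position", "user"])

print(uci_parts["activity"].shape, sorted(shoaib_parts))
```

Checking the column order against data_config.json before relying on an index would be the safer route, since the ordering here is an assumption.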