Xinyu-Yi/TransPose

Finetuning, problem reproducing results for TotalCapture

PuckelTrick opened this issue · 5 comments

Hi,

I am able to reproduce your results for the DIP-IMU dataset after finetuning, but am far off on the TotalCapture dataset. What did you do to achieve those results? For finetuning I tried different learning rates (1e-3, 1e-4, 1e-5) and early stopping with patience values from 0 to 3. The data is processed with your scripts, and the TotalCapture data is the version from the DIP authors.
To put it in numbers: for the SIP error / angular error I even beat the numbers from your paper on DIP-IMU, but on TotalCapture I always get something around 25 (SIP error) / 15 (angular error).

So how did you finetune your model to achieve your results?

I remember that I just used a lower learning rate and trained the network for several epochs.
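For concreteness, a minimal sketch of such a finetuning loop (a lowered learning rate plus the patience-based early stopping tried above) might look like the following. The tiny LSTM, the random tensors, and the loss are stand-ins for the actual TransPose networks and data pipeline, not the authors' code:

```python
import torch
import torch.nn as nn

# Stand-in network: 72 inputs (6 IMUs x [3 acc + 9 rot]), 90 outputs
# (15 joints x 6D rotation), mirroring the shapes used in the paper.
model = nn.LSTM(input_size=72, hidden_size=64, batch_first=True)
head = nn.Linear(64, 90)
params = list(model.parameters()) + list(head.parameters())
optimizer = torch.optim.Adam(params, lr=1e-5)  # lower LR than pre-training
criterion = nn.MSELoss()

def run(batch_imu, batch_gt, train=True):
    out, _ = model(batch_imu)
    loss = criterion(head(out), batch_gt)
    if train:
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return loss.item()

# Random placeholder batches standing in for the finetuning dataset.
train_data = [(torch.randn(8, 100, 72), torch.randn(8, 100, 90)) for _ in range(4)]
val_data = [(torch.randn(8, 100, 72), torch.randn(8, 100, 90)) for _ in range(2)]

best_val, patience, bad_epochs = float('inf'), 3, 0
for epoch in range(50):
    for imu, gt in train_data:
        run(imu, gt, train=True)
    with torch.no_grad():
        val = sum(run(imu, gt, train=False) for imu, gt in val_data) / len(val_data)
    if val < best_val:
        best_val, bad_epochs = val, 0
    else:
        bad_epochs += 1
        if bad_epochs > patience:
            break  # early stopping once validation loss stops improving
```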

As for finetuning, there's one point I want to confirm.

While the three stages are pre-trained separately, do you finetune them together with only the 6D rotation loss (i.e., Eq. 3 in the paper)? I ask since the DIP-IMU ground truth doesn't include joint positions.
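For reference, the loss in question is an L2 loss on the 6D rotation representation of Zhou et al. (the first two columns of the rotation matrix). A minimal sketch, with hypothetical shapes assuming 15 joints and 6 values per joint per frame:

```python
import torch

def rotation_6d_loss(pred_6d: torch.Tensor, gt_6d: torch.Tensor) -> torch.Tensor:
    """MSE between predicted and ground-truth 6D rotations.

    pred_6d, gt_6d: (batch, frames, joints * 6)
    """
    return ((pred_6d - gt_6d) ** 2).mean()

def matrix_to_6d(rotmat: torch.Tensor) -> torch.Tensor:
    """First two columns of a rotation matrix, flattened (Zhou et al.)."""
    # rotmat: (..., 3, 3) -> (..., 6)
    return rotmat[..., :2].transpose(-1, -2).reshape(*rotmat.shape[:-2], 6)

# Placeholder tensors: random predictions vs. identity-rotation ground truth.
pred = torch.randn(8, 100, 15 * 6)
gt = matrix_to_6d(torch.eye(3).expand(8, 100, 15, 3, 3)).reshape(8, 100, 15 * 6)
print(rotation_6d_loss(pred, gt))
```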

@Junlin-Yin Hello, may I ask whether finetuning on the DIP-IMU dataset means that DIP-IMU is divided into two parts, one for training and the other for testing? Specifically, are s_09 and s_10 used for testing?

> While the three stages are pre-trained separately, do you finetune them together with only the 6D rotation loss (i.e., Eq. 3 in the paper)? I ask since the DIP-IMU ground truth doesn't include joint positions.

I finetune them separately. All three networks use a root-centered coordinate frame, so no translation is needed here.
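One way to read "root-centered" is that joint positions are expressed relative to the root (pelvis) joint, which removes any dependence on global translation. A minimal sketch under that assumption (the `root_center` helper and the shapes are hypothetical, not the repository's API):

```python
import torch

def root_center(joint_pos: torch.Tensor, root_index: int = 0) -> torch.Tensor:
    """joint_pos: (frames, joints, 3) global positions -> root-relative."""
    # Subtract the root joint's position from every joint in each frame.
    return joint_pos - joint_pos[:, root_index:root_index + 1, :]

frames = torch.randn(100, 24, 3)  # e.g. 24 SMPL joints over 100 frames
centered = root_center(frames)
assert torch.allclose(centered[:, 0], torch.zeros(100, 3))  # root is now at origin
```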

> May I ask whether finetuning on the DIP-IMU dataset means that DIP-IMU is divided into two parts, one for training and the other for testing? Specifically, are s_09 and s_10 used for testing?

Maybe s_08 is used for validation.
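For clarity, a minimal sketch of the subject-level split being discussed, assuming the processed DIP-IMU files are named by subject; the file naming and the `split_of` helper are hypothetical, not the repository's actual layout:

```python
train_subjects = [f"s_{i:02d}" for i in range(1, 8)]  # s_01 .. s_07 for finetuning
val_subjects = ["s_08"]                               # possibly held out for validation
test_subjects = ["s_09", "s_10"]                      # held out for testing

def split_of(filename: str) -> str:
    """Assign a processed DIP-IMU file to a split based on its subject id."""
    for subject in test_subjects:
        if subject in filename:
            return "test"
    for subject in val_subjects:
        if subject in filename:
            return "val"
    return "train"

print(split_of("s_09/motion_01.pkl"))  # -> "test"
```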