How to select len_train

Hello, I noticed in the cfg file that you set len_train to 16384. How did you arrive at this number? Was it obtained by dividing the total number of training images by the maximum image count?

Hi,

It is not an important hyperparameter. len_train only controls how long an epoch is, not how much data is available. I chose 16384 because it makes each epoch take around 30 minutes on my machine. You can set it to any reasonable number.
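To illustrate what "only controls how long an epoch is" means, here is a minimal sketch. It assumes the dataset reports len_train as its length and draws a random training sequence for each index. The class and helper names (FixedLengthEpochDataset, steps_per_epoch) are hypothetical and are not taken from the repository.

```python
import math
import random


class FixedLengthEpochDataset:
    """Hypothetical wrapper: an "epoch" is len_train random draws,
    regardless of how many sequences the dataset actually holds."""

    def __init__(self, sequences, len_train=16384, seed=0):
        self.sequences = list(sequences)
        self.len_train = len_train
        self.rng = random.Random(seed)

    def __len__(self):
        # The data loader only sees this number, so it alone sets the epoch length.
        return self.len_train

    def __getitem__(self, index):
        if not 0 <= index < self.len_train:
            raise IndexError(index)
        # The index is ignored (assumption): each item is a random sequence.
        return self.rng.choice(self.sequences)


def steps_per_epoch(len_train, batch_size):
    """Number of optimizer steps in one epoch for a given batch size."""
    return math.ceil(len_train / batch_size)


dataset = FixedLengthEpochDataset(["seq_a", "seq_b", "seq_c"], len_train=16384)
```

Under this assumption, changing len_train changes how often per-epoch events such as validation and checkpointing happen, while the total amount of training is still set by the number of epochs times len_train.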

I used this method on two datasets, but the network does not produce accurate results: the accuracy is below 10%. Can you speculate on what the reasons might be? Are there any specific aspects to pay attention to during model training? Additionally, do the results of this method depend strongly on the dataset? I also noticed that the focal length and principal point are normalized to [-1, 1]. For the focal length, should it be divided by half of the image size? Thank you for your response.

Hi,

Can you first check whether you can reproduce our results on co3d with our pretrained model? This will confirm that your environment is set up correctly.
When you refer to accuracy, do you mean mAUC or some other metric?
It is hard to guess the reason without looking at your data. What kind of images are they?
The model should generalize well to normal images. Please first make sure that the pretrained model works well, and then try your own trained model.
Regarding focal length normalization, you can follow the code here:

PoseDiffusion/pose_diffusion/datasets/re10k.py, line 267 at commit 36eeb16 (the block under the comment "# PT3D FL PP")
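As a rough illustration of the normalization question above, the sketch below follows the screen-to-NDC conversion described in the PyTorch3D camera documentation, where both the focal length and the principal point offset are divided by half of the shorter image side. The function name screen_to_ndc is hypothetical. That the repository's code matches this convention exactly is an assumption, and the referenced file remains the authoritative version.

```python
def screen_to_ndc(focal_px, principal_point_px, image_size_wh):
    """Convert pixel-space intrinsics to PyTorch3D-style NDC values.

    Assumes the convention from the PyTorch3D camera docs:
    the scale is half of the shorter image side, and the principal
    point is measured from the image center with flipped sign.
    """
    fx, fy = focal_px
    px, py = principal_point_px
    width, height = image_size_wh

    half_min = min(width, height) / 2.0

    focal_ndc = (fx / half_min, fy / half_min)
    principal_point_ndc = (
        -(px - width / 2.0) / half_min,
        -(py - height / 2.0) / half_min,
    )
    return focal_ndc, principal_point_ndc


# Example: a 640x480 image, so the scale is 480 / 2 = 240 pixels.
focal_ndc, pp_ndc = screen_to_ndc((240.0, 240.0), (320.0, 120.0), (640, 480))
```

Note that under this convention the focal length is rescaled but not bounded: a focal length longer than half of the shorter side gives a value above 1.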
If you need high accuracy, I would suggest trying our follow-up work, https://github.com/facebookresearch/vggsfm