facebookresearch/PoseDiffusion

Thank you for providing such an excellent project, I would like to ask if it is possible to test it with my own dataset

wahiahia opened this issue · 6 comments

Thank you for providing such an excellent project, I would like to ask if it is possible to test it with my own dataset

That is, are there any requirements on image size, sharpness, resolution, etc.?

Hi @wahiahia ,

Sure, you can run the model on your own dataset by following the procedure in load_img_folder.py

There is no strict requirement on image size or sharpness. The current model was trained by center-cropping and resizing the input images to a shape of 224x224; the implementation can be found in load_img_folder.py. In our experiments, it works well with input image sizes varying from 64 to 3096.
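For reference, the center-crop-and-resize step described above can be sketched in a few lines of numpy. This is only an illustration of the preprocessing, not the actual code in load_img_folder.py, which may use a different library and interpolation method:

```python
import numpy as np

def center_crop_resize(img, size=224):
    """Center-crop an HxWxC image to a square, then resize to size x size.

    A minimal numpy sketch of the preprocessing described above. The real
    load_img_folder.py may use PIL/torchvision with bilinear interpolation;
    nearest-neighbor sampling is used here to stay dependency-free.
    """
    h, w = img.shape[:2]
    s = min(h, w)
    top = (h - s) // 2
    left = (w - s) // 2
    crop = img[top:top + s, left:left + s]
    # Nearest-neighbor resize: map each output pixel back to a crop index.
    idx = (np.arange(size) * s / size).astype(int)
    return crop[idx][:, idx]

# Example: a 480x640 RGB image becomes 224x224.
img = np.zeros((480, 640, 3), dtype=np.uint8)
out = center_crop_resize(img)
print(out.shape)  # (224, 224, 3)
```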

One aspect you should pay attention to is the camera distribution. The presently released Co3D model was trained using the Co3D v2 dataset, where the cameras are usually positioned in a fly-around arrangement. The model can still work in alternative scenarios like fly-through, but you might notice a performance drop.

Please feel free to let me know if you encounter any problems when using the model. We will also release the fine-tuning code so that you can train it on your own data.

Thanks for your reply. I'm trying to recover the camera poses of head images. It works great between adjacent images, but if the picture spans a little bit, it doesn't work so well. I wonder whether it's because not enough SIFT keypoints can be extracted from my head pictures.

Hi @wahiahia ,

I suspect the number of keypoints is not the main factor; you may, however, get better results with hloc instead of SIFT. Also, to understand the situation better, could you describe what "picture spans a little bit" means?
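As an aside on why wide-baseline pairs lose matches: SIFT-style matching typically keeps a correspondence only when the nearest descriptor is clearly better than the runner-up (Lowe's ratio test), and that margin shrinks as viewpoints drift apart. Below is a generic numpy sketch of the ratio test, not PoseDiffusion's or hloc's actual matching code:

```python
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.8):
    """Lowe's ratio test for descriptor matching (a generic illustration).

    For each descriptor in desc1, keep its nearest neighbour in desc2 only
    if it is clearly better than the second nearest. Wide-baseline image
    pairs typically yield fewer surviving matches.
    """
    # Pairwise Euclidean distances between the two descriptor sets.
    d = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(d):
        j1, j2 = np.argsort(row)[:2]  # nearest and second-nearest in desc2
        if row[j1] < ratio * row[j2]:
            matches.append((i, j1))
    return matches

# Toy descriptors: both rows of desc1 have one clear match in desc2.
desc1 = np.array([[0.0, 0.0], [5.0, 5.0]])
desc2 = np.array([[0.1, 0.0], [10.0, 10.0], [5.0, 5.1]])
print(ratio_test_matches(desc1, desc2))  # [(0, 0), (1, 2)]
```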

What I mean is the time span: when I acquire frames with a device, adjacent frames (say the first, second, and third) match well, but once I rotate and translate further, the registration between the first frame and later frames is not very good.

Hi @wahiahia ,

Considering that the model is trained on Co3D, which does not include extreme camera positions, its performance might degrade if the camera is shifted too far upwards or downwards.
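One hypothetical way to sanity-check your captures against this is to look at the elevation of each camera center relative to the scene origin. The helper below is my own sketch, not part of the PoseDiffusion codebase, and assumes world-to-camera extrinsics [R | t] with the world z-axis pointing up:

```python
import numpy as np

def camera_elevation_deg(R, t):
    """Elevation (degrees) of the camera center relative to the scene origin.

    Assumes [R | t] maps world to camera coordinates, so the camera center
    in world coordinates is C = -R^T t, and that the world z-axis is "up"
    (an assumption; adapt to your own coordinate convention).
    """
    C = -R.T @ t
    return float(np.degrees(np.arcsin(C[2] / np.linalg.norm(C))))

# A camera translated along -x sits level with the origin (0 degrees);
# one translated along -z sits directly above it (~90 degrees).
print(camera_elevation_deg(np.eye(3), np.array([-2.0, 0.0, 0.0])))  # 0.0
print(camera_elevation_deg(np.eye(3), np.array([0.0, 0.0, -2.0])))
```

If most of your frames have a large absolute elevation, the poses may sit outside the fly-around distribution the Co3D model was trained on.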