facebookresearch/PoseDiffusion

bad effect on RealEstate10k

yuyu0927 opened this issue · 7 comments

Hi,

I ran demo.py (without GGS) on a series of images (youtube link) from the test set of RealEstate10k. Here is the visualisation of the result, which is not very good. I am not sure whether I have done something wrong in the visualisation code. Can you help me with this? Thanks in advance.
Weixin Image_20231010173357

jytime commented

Hi,

I think it is because the released ckpt was trained on Co3D. Although in most cases the model pre-trained on Co3D can generalize to Re10K, it requires GGS to make the results stable and accurate. In the visualisation, the result will look bad if there are several "groups" of cameras, which seems to be the case you show here, i.e., the model believes some frames gather together while others gather elsewhere. I would recommend trying again with GGS, which tells the model that these cameras should be together (make sure the frames fed into GGS are high-resolution).

I plan to release the ckpt for Re10K in late November. Sorry, I am a bit busy with other work at the moment.

Hi,

Thanks a lot for your reply. I still have two questions. First, what does "the model believes some frames gather together while others gather aside" mean? Is there any prior knowledge behind this? Second, which resolution do you recommend for the frames fed into GGS? Is 720p high enough?

Looking forward to your reply.

jytime commented

Hi,

There is no prior knowledge for this; it is a hypothesis based on my "observations" of various samples. My feeling is that the model tends to think some frames are quite close, like a "group"/"cluster", probably because they look very similar. In some bad cases there are two, three or more groups. I am trying to analyse this in our future work.
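One rough way to check whether a prediction shows this "group" behaviour is to cluster the predicted camera centers by distance. The snippet below is only a sketch and not part of PoseDiffusion: `group_camera_centers` and the `max_gap` threshold are hypothetical names, and it assumes the camera centers have already been extracted as 3D points in a common world frame.

```python
import math


def group_camera_centers(centers, max_gap):
    """Group camera centers by single-linkage distance.

    Two cameras end up in the same group if they are connected by a chain
    of neighbours that are each at most `max_gap` apart.
    `centers` is a sequence of (x, y, z) tuples; the result is a list of
    groups, each a list of frame indices.
    """
    parent = list(range(len(centers)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of cameras that is closer than the threshold.
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if math.dist(centers[i], centers[j]) <= max_gap:
                parent[find(i)] = find(j)

    # Collect frame indices per representative.
    groups = {}
    for i in range(len(centers)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy example: three cameras near the origin and two far away.
toy_centers = [(0.0, 0, 0), (0.1, 0, 0), (0.2, 0, 0), (5.0, 0, 0), (5.1, 0, 0)]
print(group_camera_centers(toy_centers, max_gap=0.5))
```

If a smooth video trajectory yields more than one group for a reasonable threshold, that would be consistent with the behaviour described above. The threshold depends on the scene scale, so it has to be chosen per sequence.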

I think 720p should be good. Usually it is best to extract the frames from the video at its original resolution.
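As an illustration of extracting frames at the original resolution, here is a minimal sketch that builds an ffmpeg command without any scaling filter, so frames keep the video's native size. The helper name `ffmpeg_extract_command` and the file names are hypothetical; it assumes ffmpeg is installed and is not part of the PoseDiffusion code.

```python
from pathlib import Path


def ffmpeg_extract_command(video_path, out_dir):
    """Build an ffmpeg command that dumps every frame of a video as PNG.

    No scale filter is added, so the frames keep the original resolution.
    PNG is used to avoid extra lossy compression.
    """
    out_pattern = str(Path(out_dir) / "%06d.png")  # 000001.png, 000002.png, ...
    return ["ffmpeg", "-i", str(video_path), out_pattern]


cmd = ffmpeg_extract_command("clip.mp4", "frames")
print(" ".join(cmd))
```

To actually run it, create the output directory first and pass the list to `subprocess.run(cmd, check=True)`.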

Hi,

I really appreciate your help. I’ve learnt a lot. Your hypothesis about “groups”/“clusters” sounds very interesting. I look forward to your future work.

kind regards,
Qingyu

Hi Jianyuan,

I fed the same data at high resolution to the model with GGS, but the results with and without GGS are very similar. It seems that GGS optimization is of limited help for this type of data (RealEstate10k). I am not very sure; maybe I need to test more data. By the way, I am very much looking forward to the ckpt for Re10K. Thanks!

Below are the results without GGS and with GGS, respectively.
image
image

Kind regards,
Qingyu

jytime commented

Hi Qingyu,

Thanks for your feedback. I will keep this issue open until I release the Re10K ckpt.

Hey @yuyu0927, it may be too late (sorry for the delay), but I just uploaded a checkpoint for Re10K.