Problems encountered during training and evaluation

Question

DonlynLee opened this issue a year ago · 3 comments

Sorry to bother you again. When evaluating on BEHAVE with the provided model, I found that the GIF with correction looks the same as the GIF without correction, and the evaluation metrics sometimes become NaN. I also tried training the interaction diffusion model on BEHAVE; after 6 or 7 epochs, training fails with "NaN or Inf found in input tensor". Lowering the learning rate helped, in that more epochs completed, but the problem then came back. Is this due to insufficient GPU performance? Or could it be caused by differences in the sampled point clouds used to generate the contact labels? Thanks very much.

Answer 1 · 2023-11-14T23:01:59.000Z

Hi,

You're welcome. If any questions arise, please don't hesitate to reach out :)

I didn't find any NaN during my evaluation. I will quickly check the training procedure.
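In the meantime, a small check like the one below may help narrow down where the non-finite values first appear. It is only a sketch in plain Python, not code from this repository: `find_nonfinite` and `safe_mean` are hypothetical helper names, and the inputs are assumed to be flat lists of floats (for example, a tensor converted with `.flatten().tolist()`). In PyTorch itself, `torch.isfinite` and `torch.autograd.set_detect_anomaly(True)` serve a similar purpose.

```python
import math


def find_nonfinite(named_values):
    """Return (name, index, value) for the first NaN/Inf found, else None.

    `named_values` maps a (hypothetical) tensor name to a flat list of floats.
    """
    for name, values in named_values.items():
        for i, v in enumerate(values):
            if not math.isfinite(v):
                return name, i, v
    return None


def safe_mean(per_sequence_metric):
    """Average a metric over sequences and list the sequences that were skipped.

    `per_sequence_metric` maps a sequence name to a single float metric.
    Non-finite entries are excluded from the mean and returned by name,
    so the offending sequences can be inspected individually.
    """
    finite = {k: v for k, v in per_sequence_metric.items() if math.isfinite(v)}
    skipped = sorted(k for k in per_sequence_metric if k not in finite)
    mean = sum(finite.values()) / len(finite) if finite else float("nan")
    return mean, skipped
```

Calling `find_nonfinite` on the batch inputs, the contact labels, and the loss at each step should show whether the bad values come from the data or from the optimization; `safe_mean` does the same for evaluation by naming the sequences whose metric is NaN.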

The correction is not executed for every interaction sequence. While the corrections may seem minor in short-term predictions, they become significant in long-term generation. I will make this part of the code available after the CVPR deadline.

Also, I will upload the point clouds from my side shortly. Previously, I was just concerned about the large file size.
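If the point clouds are regenerated locally, fixing the random seed at least makes the sampling repeatable between runs, which helps when comparing contact labels. The snippet below is a rough illustration of that idea only, not the preprocessing used in this repository; `sample_points` and its parameters are hypothetical names.

```python
import random


def sample_points(points, num_samples, seed=0):
    """Pick `num_samples` points from `points` reproducibly.

    A private `random.Random(seed)` is used so the result depends only on
    the inputs and the seed, not on any global random state.
    """
    if num_samples >= len(points):
        return list(points)
    rng = random.Random(seed)
    # Sample indices without replacement, then sort to keep the original order.
    indices = sorted(rng.sample(range(len(points)), num_samples))
    return [points[i] for i in indices]
```

With the same seed and the same input, two calls return the same subset, so any remaining difference in the contact labels would have to come from somewhere else.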

Best

Answer 2 · 2023-11-15T04:58:03.000Z

Thank you very much for your reply. I will try again. I hope everything goes well!

Answer 3 · 2023-12-21T03:38:09.000Z

Hi,

You can try our pretrained weights with our processed data from this link. Sorry for the delay.

Best