brjathu/PHALP

Question about relational transformer model

xiexh20 opened this issue · 2 comments

Dear authors,

Thanks for the great work and for releasing the code! I have a few questions regarding the transformer design choices:

  1. Why is there a division by 10 in this code? It looks like a normalization factor, but I did not find it in the original relational model implementation. Can you explain how you set this value? (See the sketch after this list for the kind of update I mean.)
  2. In your previous work T3DP, you use a vanilla transformer that computes attention directly, while here you use a more advanced relational model. Was there a specific consideration behind this design change? Do you have any experiments showing that the current design is better?

Thank you very much for your time!
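
For context, here is a minimal sketch of the kind of scaled residual update I am asking about; all names below are illustrative, not the actual PHALP code:

```python
import torch
import torch.nn as nn

# Illustrative only: a damped residual pose update of the kind asked about.
hidden_dim, pose_dim = 256, 72           # hypothetical feature/pose sizes
pose_head = nn.Linear(hidden_dim, pose_dim)

def refine_pose(pose, features, scale=0.1):
    """Nudge the current pose by a down-scaled residual.

    Dividing the predicted residual by 10 (scale=0.1) keeps each
    refinement step small, so the transformer only perturbs the pose
    rather than replacing it outright.
    """
    delta = pose_head(features)          # predicted pose residual
    return pose + scale * delta          # small, damped update

pose = torch.zeros(1, pose_dim)          # e.g. SMPL pose parameters
features = torch.randn(1, hidden_dim)    # transformer output features
new_pose = refine_pose(pose, features)
```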

Best,
Xianghui

Hi Xianghui,

Thanks for your interest in our work. For the first question, we set this value empirically so that each update changes the pose by only a small amount. For the second question, this was a change due to legacy code, and we did not observe a performance boost from it.
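
For anyone comparing the two designs, here is a rough sketch of the distinction; this is illustrative only, not the exact code in either repository. A vanilla transformer scores attention from the query and key tokens alone, while one common relational variant additionally adds a pairwise relation term to the attention logits:

```python
import torch
import torch.nn.functional as F

def vanilla_attention(q, k, v):
    """Standard scaled dot-product attention over token features."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v

def relational_attention(q, k, v, rel_bias):
    """Attention biased by pairwise relation features.

    rel_bias: (N, N) scores describing each token pair, e.g. computed
    from relative positions or appearance similarity. Adding it to the
    logits lets pair-specific structure shape the attention pattern.
    """
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    scores = scores + rel_bias           # relation-aware term
    return F.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(4, 64)           # 4 tokens, 64-dim features
rel = torch.zeros(4, 4)                  # zero bias -> same as vanilla
out = relational_attention(q, k, v, rel)
```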

Thanks,
Jathushan

Thanks a lot for your help!