DirtyHarryLYL/HAKE-Action-Torch

How to produce embeddings with BERT

mayuelala opened this issue · 3 comments

Hello, while reproducing your work I ran into a few questions about how to generate embeddings with BERT:

  1. The embeddings you produce with BERT are 1536-dimensional, while the official BERT model outputs 768 dimensions by default. How did you expand the dimension? With an FC layer?
  2. When feeding in the object associated with a verb, do you use the object word directly, or do you substitute "something"?
  3. The semantic feature dimension used in your code is 1536, but the PaStaNet paper uses 2304. Which one is correct?

hwfan commented

The relation between the dimensions 768, 1536, and 2304:

  1. In the original implementation of PaStaNet, we jointly process a <human-object-interaction> triplet and extract a BERT word embedding for each element of the triplet. That is to say, for an HOI triplet the dimension is 768 x 3 = 2304.
  2. In the implementation of A2V, we simplify the model to <human-interaction> to adapt to more general action-understanding scenarios, so the word-embedding dimension is 768 x 2 = 1536 (see the sketch after this list).
  3. For the word embedding of the interacted object, please refer to the implementation of HAKE-Action for more details.
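As a rough illustration of how the 1536/2304 dimensions arise, below is a minimal sketch (not the authors' code) that extracts a 768-dim BERT embedding per word with HuggingFace `transformers` and concatenates them for a pair or triplet. The example words ("person", "bicycle", "ride") and the mean-pooling over sub-word tokens are assumptions for demonstration; the actual extraction used in HAKE-Action may differ.

```python
# Sketch: concatenating 768-dim BERT word embeddings to obtain
# 1536-dim (<human-interaction>) or 2304-dim (<human-object-interaction>)
# semantic features. Assumes the `transformers` library; words and
# pooling strategy are illustrative, not the authors' exact setup.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
bert.eval()

def word_embedding(word: str) -> torch.Tensor:
    """Return one 768-dim vector for `word` by mean-pooling the
    last hidden state over its (sub)word tokens (an assumption)."""
    inputs = tokenizer(word, return_tensors="pt", add_special_tokens=False)
    with torch.no_grad():
        hidden = bert(**inputs).last_hidden_state  # (1, n_tokens, 768)
    return hidden.mean(dim=1).squeeze(0)           # (768,)

# <human-object-interaction> triplet -> 768 x 3 = 2304 dims (PaStaNet)
triplet = torch.cat([word_embedding(w) for w in ["person", "bicycle", "ride"]])
print(triplet.shape)  # torch.Size([2304])

# <human-interaction> pair -> 768 x 2 = 1536 dims (A2V)
pair = torch.cat([word_embedding(w) for w in ["person", "ride"]])
print(pair.shape)  # torch.Size([1536])
```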