type of dist_bar in tr_msa
Closed this issue · 5 comments
Hi,

I am having difficulties launching the `build_model` function from the `tr_msa` module. Can you explain what the `dist_bar` parameter means and how it should be constructed?

Assuming it is a list of distances between atoms in `pos` from your example, I do the following:

```python
def dist(l, r):
    return ((l[0]-r[0])**2 + (l[1]-r[1])**2 + (l[2]-r[2])**2)**0.5

dst = [[dist(l, r) for r in data] for l in data]
```

where `data` is a 4x3 `List[List[float]]` of coordinates from the `pos` tensor.
Then:

```python
from model.tr_msa import build_model
model = build_model(N, n, dst).cuda()
```

The returned model accepts its third argument, `dist`, as a tensor rather than a list, so I do the following:

```python
tens_dist = torch.tensor(dst).cuda()
out = model(x, mask, tens_dist)
```

where `x` and `mask` are as in the example code. The model computation fails with this error:

```
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [8, 1, 1, 1], but got 3-dimensional input of size [4, 1, 4] instead
```
Hi, thanks for using our model. The `dist_bar` hyperparameter simply corresponds to the multiple scales in multi-scale self-attention. You can provide the model with a `dist_bar` like `[1, 3, 5]`; this creates three local scales of 1, 3, and 5 distance units. These three local features are then fused with the global feature and fed into the MLP.
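If it helps other readers, my understanding of this (a sketch only, not the repo's actual implementation — `local_masks` and all names below are hypothetical) is that each threshold in `dist_bar` turns the pairwise distance matrix into a boolean mask keeping only atom pairs within that many distance units:

```python
import torch

def local_masks(dist, dist_bar):
    """For each threshold in dist_bar, keep only atom pairs whose
    pairwise distance is within that many distance units.
    Illustrative sketch; the real model fuses these local scales
    with a global feature before the MLP."""
    # dist: (n_atoms, n_atoms) pairwise distance matrix
    return [dist <= bar for bar in dist_bar]

# Toy example: three atoms on a line at x = 0, 2, 4
pos = torch.tensor([[0., 0., 0.], [2., 0., 0.], [4., 0., 0.]])
dist = torch.cdist(pos, pos)
masks = local_masks(dist, [1, 3, 5])
# masks[0] keeps only self-pairs; masks[2] keeps every pair here
```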
Thank you for the answer.

Well, I can now build the model with such a hyperparameter:

```python
from model.tr_msa import build_model
model = build_model(N, n, [1, 3, 5]).cuda()
```

But what should the input parameter be here?

```python
out = model(x, mask, ???)
```

I have to pass the atoms' coordinate information to the model, but neither `pos` from the example nor the `tens_dist` I built above is accepted.
Hi, I updated the usage guidance on how to employ the multi-scale self-attention (my bad for updating it so late)! You just need to calculate the distance matrix of all atoms inside each molecule via `dist = torch.cdist(pos, pos).float()`, then feed it in as one of the model inputs.

I hope you find Molfromer useful.
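To make the input preparation concrete, here is a minimal sketch (the coordinates are made up; `model`, `x`, and `mask` are assumed to come from the repo's example code):

```python
import torch

# pos: (n_atoms, 3) atom coordinates, as in the example code
pos = torch.tensor([[0.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])

# Pairwise distance matrix of all atoms inside the molecule
dist = torch.cdist(pos, pos).float()   # shape: (n_atoms, n_atoms)

# Then feed it in as the third model input, e.g.:
# out = model(x, mask, dist)           # model/x/mask as in the example
```

`torch.cdist(pos, pos)` produces a symmetric matrix with a zero diagonal, which replaces the hand-rolled nested-list `dst` from the first comment.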
Thank you so much, now I am a bit more familiar with `tr_msa`!

I have one more question, this time about `tr_afps`/`tr_all`. The model seems to take the same arguments as `tr_msa`, but I get an error when I try to execute the code.
The interpreter complains that `self_attn`, which is forwarded to `MultiHeadedAttention`, returns a value that cannot be unpacked into `att_out` and `scores`. And indeed, the value returned from `MultiHeadedAttention` is not meant to be unpacked:

```python
class MultiHeadedAttention(nn.Module):
    def __init__(self, h, embed_dim, dropout=0.1):
        ...
        self.linears = clones(nn.Linear(embed_dim, embed_dim), 4)
        ...

    def forward(self, query, key, value, dist_conv, mask=None):
        ...
        return self.linears[-1](x)
```

Could this be a bug in the code, or am I using it incorrectly?
Hi, you are right. There is a bug from when I previously combined all these different models together in a neat style. Originally, `MultiHeadedAttention` returned both the new values of `x` and the attention scores, but in a later version I removed that without considering compatibility.

The problem can be solved by changing the call to the following lines:

```python
att_out = self.self_attn(self.norm(x), self.norm(x), self.norm(x), dist, mask)
scores = self.self_attn.attn
```

I have updated the AFPS model, so you can use the latest version. If anything goes wrong, please do not hesitate to open a new issue.
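For anyone hitting the same unpacking error: the pattern the fix relies on, sketched below with simplified, hypothetical names (a single-head toy module, not the repo's actual `MultiHeadedAttention`), is that the attention module returns only the projected values and stashes the score matrix on `self.attn`, where the caller reads it after the forward pass:

```python
import torch
import torch.nn as nn

class TinyAttention(nn.Module):
    """Simplified stand-in for an attention module (single head,
    no dist_conv) illustrating the `self.attn` side-channel."""
    def __init__(self, embed_dim):
        super().__init__()
        self.out = nn.Linear(embed_dim, embed_dim)
        self.attn = None  # filled in during forward

    def forward(self, query, key, value, mask=None):
        scores = query @ key.transpose(-2, -1) / query.size(-1) ** 0.5
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float('-inf'))
        self.attn = scores.softmax(dim=-1)   # kept for the caller
        return self.out(self.attn @ value)   # single return value

x = torch.randn(2, 4, 8)
attn = TinyAttention(8)
att_out = attn(x, x, x)   # no tuple unpacking
scores = attn.attn        # read the scores afterwards
```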