smiles724/Molformer

type of dist_bar in tr_msa

Closed this issue · 5 comments

Hi,

I am having difficulty launching the build_model function from the tr_msa module. Can you explain what the dist_bar parameter means and how it should be constructed?

Supposing it is a list of distances between the atoms in pos from your example, I do as follows:

def dist(l, r):
  return ((l[0]-r[0])**2 + (l[1]-r[1])**2 + (l[2]-r[2])**2)**0.5

dst = [[dist(l, r) for r in data] for l in data]

where data is a 4×3 List[List[float]] of coordinates from the pos tensor.
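(As a side note, this nested-list computation agrees with PyTorch's built-in pairwise distance. A minimal check, using hypothetical 4×3 coordinates in place of the example's actual pos values:)

```python
import torch

# hypothetical 4x3 coordinates standing in for the example's pos tensor
data = [[0.0, 0.0, 0.0],
        [1.0, 0.0, 0.0],
        [0.0, 2.0, 0.0],
        [0.0, 0.0, 3.0]]

def dist(l, r):
    return ((l[0]-r[0])**2 + (l[1]-r[1])**2 + (l[2]-r[2])**2) ** 0.5

# manual pairwise distance matrix, as in the snippet above
dst = [[dist(l, r) for r in data] for l in data]

# the same matrix via torch.cdist
pos = torch.tensor(data)
tens_dist = torch.cdist(pos, pos)
assert torch.allclose(tens_dist, torch.tensor(dst), atol=1e-6)
```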

Then:

from model.tr_msa import build_model

model = build_model(N, n, dst).cuda()

The returned model accepts its third argument, dist, as a tensor rather than a list, so I do as follows:

tens_dist = torch.tensor(dst).cuda()
out = model(x, mask, tens_dist)

where x and mask are as in the example code.
The model computation fails with the error:

RuntimeError: Expected 4-dimensional input for 4-dimensional weight [8, 1, 1, 1], but got 3-dimensional input of size [4, 1, 4] instead

Hi, thanks for using our model. The dist_bar hyperparameter corresponds to the multiple scales in multi-scale self-attention. You can simply provide the model with a dist_bar like [1, 3, 5]. This creates three local scales of 1, 3, and 5 distance units. Finally, these three local features are fused with the global feature and fed into the MLP.
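(For intuition, a minimal sketch of what these distance cutoffs amount to, assuming each dist_bar entry gates attention to atom pairs within that many distance units — the function name and shapes here are illustrative, not the repository's actual code:)

```python
import torch

def scale_masks(dist, dist_bar):
    # dist: (n_atoms, n_atoms) pairwise distance matrix
    # each cutoff in dist_bar yields a boolean mask that keeps only
    # atom pairs within that many distance units (one local scale each)
    return [dist <= cutoff for cutoff in dist_bar]

dist = torch.tensor([[0.0, 1.0, 4.0],
                     [1.0, 0.0, 2.0],
                     [4.0, 2.0, 0.0]])
masks = scale_masks(dist, [1, 3, 5])
# the scale-1 mask keeps only self-pairs and the closest pair,
# while the scale-5 mask covers every atom pair in this toy molecule
```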

Thank you for the answer.

Well, I build the model with some hyperparameters:

from model.tr_msa import build_model
model = build_model(N, n, [1, 3, 5]).cuda()

What should the input parameter here be?

out = model(x, mask, ???)

I have to pass the atoms' coordinate information to the model, but neither pos from the example nor the tens_dist I built above is accepted.

Hi, I updated the usage guidance on how to employ the multi-scale self-attention (my bad for updating it so late)! You just need to calculate the distance matrix of all atoms inside each molecule with dist = torch.cdist(pos, pos).float(), then feed it as one of the model inputs.
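(A minimal sketch of that step, using hypothetical coordinates for a 4-atom molecule since the example's actual pos values are not shown here:)

```python
import torch

# hypothetical coordinates for a 4-atom molecule, standing in for
# the example's pos tensor of shape (n_atoms, 3)
pos = torch.tensor([[0.0, 0.0, 0.0],
                    [1.5, 0.0, 0.0],
                    [0.0, 1.5, 0.0],
                    [0.0, 0.0, 1.5]])

# pairwise distance matrix of all atoms inside the molecule
dist = torch.cdist(pos, pos).float()
# dist has shape (4, 4); a batched pos of shape (B, n, 3)
# gives dist of shape (B, n, n)
```

This dist tensor is then passed as the third model input, e.g. out = model(x, mask, dist).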

I hope you find Molformer useful.

Thank you so much, now I am a bit more familiar with tr_msa!

I have one more question, this time about tr_afps/tr_all.
The model seems to receive the same arguments as tr_msa, but I get an error when I try to execute the code.
[screenshot of the traceback, 2022-01-31]

The interpreter complains that self_attn, which forwards to MultiHeadedAttention, returns a value that cannot be unpacked into att_out and scores. And indeed, the value returned from MultiHeadedAttention is not meant to be unpacked:

class MultiHeadedAttention(nn.Module):
    def __init__(self, h, embed_dim, dropout=0.1):
        ...
        self.linears = clones(nn.Linear(embed_dim, embed_dim), 4)
        ...

    def forward(self, query, key, value, dist_conv, mask=None):
        ...
        return self.linears[-1](x)

Could this be a bug in the code or am I using it incorrectly?

Hi, you are right. There is a bug from when I previously combined all these different models in a neat style. Originally, MultiHeadedAttention returned both the new values of x and the attention scores. However, in a later version, I deleted that without considering compatibility.

The problem can be solved simply by changing the call to the following lines:

att_out = self.self_attn(self.norm(x), self.norm(x), self.norm(x), dist, mask)

scores = self.self_attn.attn
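(For context, this fix relies on the attention module caching its softmaxed scores on an attn attribute so the caller can read them after the forward pass. A minimal illustrative sketch of that pattern — a toy single-head module, not the repository's real MultiHeadedAttention:)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyAttention(nn.Module):
    """Toy single-head attention that caches its scores on self.attn,
    mirroring the attribute read in the fix above."""
    def __init__(self, embed_dim):
        super().__init__()
        self.out = nn.Linear(embed_dim, embed_dim)
        self.attn = None  # populated on each forward pass

    def forward(self, query, key, value, mask=None):
        scores = query @ key.transpose(-2, -1) / key.size(-1) ** 0.5
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float('-inf'))
        self.attn = F.softmax(scores, dim=-1)  # cached for the caller
        return self.out(self.attn @ value)     # single return value

x = torch.randn(2, 4, 8)
attn = TinyAttention(8)
att_out = attn(x, x, x)   # one return value, as in the fixed code
scores = attn.attn        # attention weights read after the call
```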

I have updated the AFPS model, so you can use the latest version. If anything goes wrong, please do not hesitate to open a new issue.