type of dist_bar in tr_msa
Closed this issue · 5 comments
Hi,

I am having difficulties launching the `build_model` function from the `tr_msa` module. Can you explain what the `dist_bar` parameter means and how it should be constructed?

Assuming it is a list of distances between atoms in `pos` from your example, I do the following:

```python
def dist(l, r):
    return ((l[0]-r[0])**2 + (l[1]-r[1])**2 + (l[2]-r[2])**2)**0.5

dst = [[dist(l, r) for r in data] for l in data]
```

where `data` is a 4x3 `List[List[float]]` of coordinates from the `pos` tensor.
Then:

```python
from model.tr_msa import build_model
model = build_model(N, n, dst).cuda()
```

The returned model accepts its third argument, `dist`, as a tensor rather than a list, so I do the following:

```python
tens_dist = torch.tensor(dst).cuda()
out = model(x, mask, tens_dist)
```

where `x` and `mask` are as in the example code. The model computation fails with this error:

```
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [8, 1, 1, 1], but got 3-dimensional input of size [4, 1, 4] instead
```
Hi, thanks for using our model. The `dist_bar` hyperparameter simply corresponds to the multiple scales in multi-scale self-attention. You can provide the model with a `dist_bar` like `[1, 3, 5]`; this creates three local scales of 1, 3, and 5 distance units. These three local features are then fused with the global feature and fed into the MLP.
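If it helps other readers, my understanding of this (a sketch only, not the repo's actual implementation — `local_masks` and all names below are hypothetical) is that each threshold in `dist_bar` turns the pairwise distance matrix into a boolean mask keeping only atom pairs within that many distance units:

```python
import torch

def local_masks(dist, dist_bar):
    """For each threshold in dist_bar, keep only atom pairs whose
    pairwise distance is within that many distance units.
    Illustrative sketch; the real model fuses these local scales
    with a global feature before the MLP."""
    # dist: (n_atoms, n_atoms) pairwise distance matrix
    return [dist <= bar for bar in dist_bar]

# Toy example: three atoms on a line at x = 0, 2, 4
pos = torch.tensor([[0., 0., 0.], [2., 0., 0.], [4., 0., 0.]])
dist = torch.cdist(pos, pos)
masks = local_masks(dist, [1, 3, 5])
# masks[0] keeps only self-pairs; masks[2] keeps every pair here
```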
Thank you for the answer.

Well, I can now build the model with such a hyperparameter:

```python
from model.tr_msa import build_model
model = build_model(N, n, [1, 3, 5]).cuda()
```

But what should the input parameter be here?

```python
out = model(x, mask, ???)
```

I have to pass the atoms' coordinate information to the model, but neither `pos` from the example nor the `tens_dist` I built above is accepted.
Hi, I updated the usage guidance on how to employ the multi-scale self-attention (my bad for updating it so late)! You just need to calculate the distance matrix of all atoms inside each molecule via `dist = torch.cdist(pos, pos).float()`, then feed it in as one of the model inputs.

I hope you find Molfromer useful.
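To make the input preparation concrete, here is a minimal sketch (the coordinates are made up; `model`, `x`, and `mask` are assumed to come from the repo's example code):

```python
import torch

# pos: (n_atoms, 3) atom coordinates, as in the example code
pos = torch.tensor([[0.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])

# Pairwise distance matrix of all atoms inside the molecule
dist = torch.cdist(pos, pos).float()   # shape: (n_atoms, n_atoms)

# Then feed it in as the third model input, e.g.:
# out = model(x, mask, dist)           # model/x/mask as in the example
```

`torch.cdist(pos, pos)` produces a symmetric matrix with a zero diagonal, which replaces the hand-rolled nested-list `dst` from the first comment.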
Thank you so much, now I am a bit more familiar with `tr_msa`!

I have one more question, this time about `tr_afps`/`tr_all`. The model seems to take the same arguments as `tr_msa`, but I get an error when I try to execute the code.
The interpreter complains that `self_attn`, which is forwarded to `MultiHeadedAttention`, returns a value that cannot be unpacked into `att_out` and `scores`. And indeed, the value returned from `MultiHeadedAttention` is not meant to be unpacked:

```python
class MultiHeadedAttention(nn.Module):
    def __init__(self, h, embed_dim, dropout=0.1):
        ...
        self.linears = clones(nn.Linear(embed_dim, embed_dim), 4)
        ...

    def forward(self, query, key, value, dist_conv, mask=None):
        ...
        return self.linears[-1](x)
```

Could this be a bug in the code, or am I using it incorrectly?
Hi, you are right. There is a bug from when I previously combined all these different models together in a neat style. Originally, `MultiHeadedAttention` returned both the new values of `x` and the attention scores, but in a later version I removed that without considering compatibility.

The problem can be solved by changing the call to the following lines:

```python
att_out = self.self_attn(self.norm(x), self.norm(x), self.norm(x), dist, mask)
scores = self.self_attn.attn
```

I have updated the AFPS model, so you can use the latest version. If anything goes wrong, please do not hesitate to open a new issue.
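For anyone hitting the same unpacking error: the pattern the fix relies on, sketched below with simplified, hypothetical names (a single-head toy module, not the repo's actual `MultiHeadedAttention`), is that the attention module returns only the projected values and stashes the score matrix on `self.attn`, where the caller reads it after the forward pass:

```python
import torch
import torch.nn as nn

class TinyAttention(nn.Module):
    """Simplified stand-in for an attention module (single head,
    no dist_conv) illustrating the `self.attn` side-channel."""
    def __init__(self, embed_dim):
        super().__init__()
        self.out = nn.Linear(embed_dim, embed_dim)
        self.attn = None  # filled in during forward

    def forward(self, query, key, value, mask=None):
        scores = query @ key.transpose(-2, -1) / query.size(-1) ** 0.5
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float('-inf'))
        self.attn = scores.softmax(dim=-1)   # kept for the caller
        return self.out(self.attn @ value)   # single return value

x = torch.randn(2, 4, 8)
attn = TinyAttention(8)
att_out = attn(x, x, x)   # no tuple unpacking
scores = attn.attn        # read the scores afterwards
```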