THUNLP-MT/dyMEAN

Revised codes for no toxin chains, Manual epitope design

Opened this issue · 3 comments

Excellent work! I am trying to generate antibodies against the COVID NP protein.

So I revised the code as below:

if __name__ == '__main__':
    ckpt = './checkpoints/cdrh3_design.ckpt'
    root_dir = './demos/data'
    # the same PDB file is used once per receptor chain
    pdbs = [os.path.join(root_dir, '6wzo.pdb') for _ in range(4)]
    toxin_chains = []
    remove_chains = None
    receptor_chains = ["A", "B", "C", "D"]
    # one epitope definition and one output identifier per receptor chain
    epitope_defs = [os.path.join(root_dir, c + '_epitope.json') for c in receptor_chains]
    identifiers = [f'{c}_antibody' for c in receptor_chains]

I wrote the epitope JSON files manually, for example:

A_epitope.json
[
["A", [299, ""]],
["A", [300, ""]],
["A", [302, ""]],
["A", [303, ""]],
["A", [305, ""]],
["A", [306, ""]],
["A", [345, ""]],
["A", [347, ""]],
["A", [348, ""]],
["A", [349, ""]]
]
However, I get the following messages:

/home/jhs9301/anaconda3/envs/dyMEAN/lib/python3.8/site-packages/numpy/core/fromnumeric.py:3464: RuntimeWarning: Mean of empty slice.
return _methods._mean(a, axis=axis, dtype=dtype,
/home/jhs9301/anaconda3/envs/dyMEAN/lib/python3.8/site-packages/numpy/core/_methods.py:184: RuntimeWarning: invalid value encountered in divide
ret = um.true_divide(
/mnt/c/Users/jhs93/Downloads/dyMEAN-main/dyMEAN-main/data/dataset.py:63: RuntimeWarning: invalid value encountered in cast
X[0] = center # set center

I guess the invalid values come from the epitope data.

Epitope data: {'X': array([[[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808]]]), 'S': [22], 'residue_pos': [0], 'xloss_mask': [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]}

How can I fix this problem? Please let me know.

kxz18 commented

Hi, thanks for your interest in our work. Could you try replacing the insertion code with a space instead of an empty string? The code base uses Biopython to parse the PDB file, and Biopython uses a space to denote an empty insertion code. This mismatch might be the cause of the problem.
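To illustrate the suggestion above, here is a minimal sketch of why an empty-string insertion code can end up matching no residues, and how the epitope file could be normalized. The helper names `normalize_epitope` and `count_matches`, and the set `parsed_residues`, are hypothetical and only stand in for whatever lookup the code base actually performs; the one assumption taken from the discussion is that parsed residues carry a single space as their blank insertion code.

```python
import json


def normalize_epitope(epitope):
    """Return a copy in which empty insertion codes are replaced by one space."""
    return [[chain, [resseq, icode if icode else " "]]
            for chain, (resseq, icode) in epitope]


# First three entries of A_epitope.json as originally written (empty string).
original = json.loads('[["A", [299, ""]], ["A", [300, ""]], ["A", [302, ""]]]')
fixed = normalize_epitope(original)

# Hypothetical stand-in for residues parsed from the PDB file, keyed as
# (chain, (residue number, insertion code)) with a space for "no code".
parsed_residues = {("A", (n, " ")) for n in range(295, 310)}


def count_matches(epitope):
    """Count epitope entries that are found among the parsed residues."""
    return sum((chain, (resseq, icode)) in parsed_residues
               for chain, (resseq, icode) in epitope)


print(count_matches(original))  # 0: nothing matches, so the epitope is empty
print(count_matches(fixed))     # 3: every entry matches
print(json.dumps(fixed))        # content to write back to A_epitope.json
```

With no matching residues, the epitope coordinate array is empty, which would be consistent with the "Mean of empty slice" warning reported above.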

Thanks! That fixed it. I have one more question: is there any way to obtain the complex structure of an antigen and an antibody whose sequence I already know? I want to optimize an already developed antibody, but there is no complex structure for it.

kxz18 commented

Hi, I've updated the API for structure prediction here. The corresponding explanations have also been added to the README. Feel free to give it a try!