ZeroDivisionError in evaluation

Question

Closed this issue a year ago · 20 comments

Hi,

Thank you for sharing this fantastic work. I am trying to reproduce some of the experimental results. I followed the instructions to download all the data and used the provided pretrained weights. Here is an image showing the error raised during evaluation.

Could you take a look at this problem and give me some hints on how to fix it?

By the way, the JS bond distances for complete mols also show None, which I don't think is correct.

Many thanks!

Answer 1 · 2023-04-30T02:30:56.000Z

Hi,

Thank you for your interest in this work! It seems the 'success_atom_types' counter in 'evaluate_diffusion.py' was never updated. You can try commenting out the docking-related code between lines 105 and 130. I guess an error was raised around there and triggered the try - except: continue. (The exception handling I wrote here is crude.)
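To illustrate the failure mode described above, here is a minimal sketch. It is not the code in 'evaluate_diffusion.py'; the names `evaluate`, `dock`, and `n_success` are hypothetical. It only assumes that a counter is incremented after a step that can raise, and that the counter is later used as a denominator.

```python
def evaluate(samples, dock):
    """Average a per-sample score, skipping samples whose docking step fails.

    `dock` is a hypothetical callable standing in for the docking step.
    """
    n_success = 0       # plays the role of a success counter
    total = 0.0
    first_error = None
    for sample in samples:
        try:
            total += dock(sample)
        except Exception as exc:   # broad handler, as in the pattern described
            if first_error is None:
                first_error = exc  # remember why samples were skipped
            continue
        n_success += 1

    if n_success == 0:
        # Without this guard, total / n_success raises ZeroDivisionError
        # and hides the underlying cause.
        raise RuntimeError(f"all {len(samples)} samples failed") from first_error
    return total / n_success
```

If every sample fails, the guard reports the first underlying exception instead of a bare ZeroDivisionError, which makes a broken docking setup easier to spot.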

Answer 2 · 2023-04-30T02:36:33.000Z

Thank you for your quick response. However, if I comment out lines 105-130, chem_results will remain undefined and cause an error at line 145. What should I do with chem_results? Many thanks!

Answer 3 · 2023-04-30T02:46:41.000Z

I see. Please try running the script with "--docking_mode none". The docking-related code will not be run in this case.

Answer 4 · 2023-04-30T03:12:19.000Z

Thanks, setting "--docking_mode none" avoids that error. But I am not able to match the results well to the tables and figures in your paper.

The first block should be the Table 1 experiment, but the ordering of rows seems to be different.

The second block is Table 3's experiment result, right?

The third block corresponds to Table 2. If I am wrong about this one, please tell me the correct table or figure in your paper.

Many thanks!

Answer 5 · 2023-04-30T03:41:18.000Z

Right, these three parts correspond to the tables you mentioned. However, I noticed you evaluated "400 samples in total". Where do these samples come from? The results in Tables 1, 2, and 3 are averaged over 100 datapoints in the test set with 100 samples for each datapoint, so there should be 10000 samples in total. You can find them here: https://drive.google.com/drive/u/1/folders/19imu-mlwrjnQhgbXpwsLgA17s1Rv70YS, where I provided all samples for the baselines and targetdiff.

Answer 6 · 2023-04-30T03:45:23.000Z

Thanks, I will check the shared drive.

For the "400 samples in total", I followed the instructions in your repo and ran the Python command below, with '--data_id {i}' ranging from 0 to 99.

python scripts/sample_diffusion.py configs/sampling.yml --data_id {i}
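The loop over '--data_id' described above could be scripted roughly as follows. This is a sketch that only builds the command lines quoted in this thread; `build_sampling_commands` is a hypothetical helper, and nothing is executed.

```python
import shlex

def build_sampling_commands(num_datapoints=100, config="configs/sampling.yml"):
    """Return one sampling command per datapoint, data_id 0..num_datapoints-1."""
    return [
        ["python", "scripts/sample_diffusion.py", config, "--data_id", str(i)]
        for i in range(num_datapoints)
    ]

commands = build_sampling_commands()
print(len(commands), "commands; first:", shlex.join(commands[0]))
# To actually run them, pass each list to subprocess.run(cmd, check=True).
```

Note that the command itself does not set how many molecules are drawn per datapoint; that comes from the config file.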

Answer 7 · 2023-04-30T04:02:21.000Z

Hello,

I may have downloaded the wrong lmdb and split files. I want to make sure: is 'crossdocked_v1.1_rmsd1.0.tar.gz' the correct data to download?

Thanks

Answer 8 · 2023-04-30T04:18:11.000Z

If you don't need docking (which will use the original .pdb file), you only need to download the .lmdb file and split .pt file.

'crossdocked_v1.1_rmsd1.0.tar.gz' includes the original data (.pdb, .sdf files) from CrossDocked2020 with RMSD < 1 Å. You may not need that.

If you need to dock generated molecules, you may also need to download test_set.zip.

Answer 9 · 2023-04-30T19:09:02.000Z

If I don't need 'crossdocked_v1.1_rmsd1.0.tar.gz', then the "400 samples in total" is simply the result of following your repo's instructions for sampling and evaluation... The 'Data' instructions say to download the preprocessed .lmdb and split files, but the drive only contains 'crossdocked_v1.1_rmsd1.0_pocket10_processed_final.lmdb'. I am not sure if that's the reason.

Answer 10 · 2023-04-30T19:37:50.000Z

The split file is "crossdocked_pocket10_pose_split.pt"

Answer 11 · 2023-04-30T19:39:14.000Z

Yes, I also have that one downloaded

Answer 12 · 2023-04-30T19:40:25.000Z

Oh I see, you also need to set "num_samples: 100" in the sampling.yml
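As a quick sanity check on the numbers in this thread, here is a small sketch. Both helpers are hypothetical. That 400 samples over 100 datapoints means 4 samples per datapoint is an inference from the totals quoted above, not something stated by the author.

```python
def expected_total(num_datapoints, num_samples):
    """Total number of generated molecules across all datapoints."""
    return num_datapoints * num_samples

def samples_per_datapoint(total, num_datapoints):
    """Infer the per-datapoint sample count, assuming an even split."""
    per_datapoint, remainder = divmod(total, num_datapoints)
    if remainder:
        raise ValueError("total is not an even multiple of num_datapoints")
    return per_datapoint

print(expected_total(100, 100))         # setting described for the paper's tables
print(samples_per_datapoint(400, 100))  # what a 400-sample run would imply
```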

Answer 13 · 2023-04-30T19:41:06.000Z

I will update the config file and the instructions about data download.

Answer 14 · 2023-05-24T03:34:57.000Z

Hello. It looks like this issue is still affecting my local copy of the repository, as when I try running python3 scripts/evaluate_diffusion.py sampling_outputs/ --docking_mode vina_dock --protein_root data/test_set, I encounter the same divide-by-zero error. This is currently preventing me from reproducing the results in your paper. May I ask if there is a simple way to address this bug in evaluation?

Answer 15 · 2023-05-30T18:46:40.000Z

Sorry for the late response. The likely reason for this issue is that the Vina Docking environment was not set up successfully. You can either set --docking_mode none or remove the try-except wrapper between lines 105 and 127 to see what goes wrong in Vina Docking. If you only want to reproduce the results in the paper, we have just provided a Jupyter notebook: notebooks/summary.ipynb. It can help you quickly reproduce the results.

Answer 16 · 2023-05-30T19:33:58.000Z

No problem. I found that the cause of my problem was that I had not extracted test_set.zip, so the evaluation metrics could not be computed. This might have affected other people as well if they did not extract the ZIP archive beforehand.
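A check like the one below could catch the unextracted archive before evaluation starts. This is a sketch using only the Python standard library; `ensure_extracted` is a hypothetical helper, and the paths in the usage comment are assumptions about where the archive was saved and how it unpacks.

```python
import zipfile
from pathlib import Path

def ensure_extracted(archive, extract_to, expected_dir):
    """Extract `archive` into `extract_to` unless `expected_dir` already has content.

    Returns True if extraction was performed, False if it was skipped.
    """
    expected = Path(expected_dir)
    if expected.is_dir() and any(expected.iterdir()):
        return False  # already extracted
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(extract_to)
    if not expected.is_dir():
        raise FileNotFoundError(f"{expected} not found after extracting {archive}")
    return True

# Assumed layout, adjust to your download location:
# ensure_extracted("data/test_set.zip", "data", "data/test_set")
```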

Answer 17 · 2023-05-31T16:16:50.000Z

Thank you for letting me know! I have updated the README accordingly.

Answer 18 · 2023-11-23T04:07:37.000Z

Hi @guanjq, thanks for your previous answers to this question. However, I still encounter the same problem as in the link, and the link, after trying all of the methods mentioned above.
I have tried the following:

set num_samples to 100;
set "docking_mode" to None (no error occurred, but I can't get the complete results reported in Table 3 of the paper);
comment out the code between lines 105 and 130 (the code could not run successfully).
I was wondering how to solve it T.T

Answer 19 · 2024-05-17T04:00:13.000Z

Maybe you need to use the version of the docking tool provided by the author. I also encountered the above problem, and switching to the version provided by the author solved it.

Answer 20 · 2024-05-17T04:05:20.000Z

@koalaaaaaaaaa