yoshida-lab/XenonPy

'NGram' object has no attribute '_sample_order'

rnaimehaom opened this issue · 14 comments

Hello,

I am learning this tutorial:
https://github.com/yoshida-lab/XenonPy/blob/master/samples/iQSPR.ipynb

When I run the following code, I get an error that I cannot figure out.
Both 'ngram_pubchem_ikebata_reO15_O10.obj' and 'ngram_pubchem_ikebata_reO15_O11to20.obj' were downloaded according to the tutorial.

Ok to run

with open('ngram_pubchem_ikebata_reO15_O10.obj', 'rb') as f:
    n_gram = pk.load(f)

Ok to run

with open('ngram_pubchem_ikebata_reO15_O11to20.obj', 'rb') as f:
    n_gram2 = pk.load(f)

Error here

n_gram.merge_table(n_gram2)

###################

The error output

\envs\xenonpy\lib\site-packages\xenonpy\inverse\iqspr\modifier.py", line 96, in sample_order
return self._sample_order

AttributeError: 'NGram' object has no attribute '_sample_order'
###################

Can you help to solve this?

Thanks,
Ming

@rnaimehaom Thank you for pointing this issue out. We are actually planning to update the tutorial, as it is quite outdated. For now, please add the attribute manually using the following code:
"name of the NGram object"._sample_order = (0, 20)

Let me know if you still run into problems after adding this line of code to your program.

@stewu5 Thank you.

I am not sure whether I did this the correct way:

#############################
n_gram._sample_order = (0, 20)
n_gram2._sample_order = (0, 20)
n_gram.merge_table(n_gram2)
#############################

Then I got the error:

lib\site-packages\xenonpy\inverse\iqspr\modifier.py", line 127, in min_len
return self._min_len

AttributeError: 'NGram' object has no attribute '_min_len'

@rnaimehaom Sorry for the trouble... I thought I had already updated the NGram object...
Anyway, to fix the problem for now, add the corresponding line(s) below whenever an error reports a specific missing variable:
n_gram._sample_order=(1, 10)
n_gram._del_range=(1, 10)
n_gram._min_len=1
n_gram._max_len=1000

To explain why this is happening: the NGram object you downloaded was (for some unknown reason) created with an older version of xenonpy that does not provide default values for some of the internal variables. This has been fixed in the latest xenonpy version, so if you create/train a new NGram object yourself, you will not face these errors.
Let me know if anything weird is still happening.
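
The explanation above can be sketched in a few lines. By default, Python's pickle restores an object's saved attributes without running `__init__` again, so defaults introduced in a newer class version never reach an object that was saved by an older one. This is only a sketch under stated assumptions: `LegacyObject` is a hypothetical stand-in for the downloaded NGram object, `backfill_defaults` is a helper name made up here, and the values are the ones quoted in this thread. Other internal attributes may still be missing, and no values for them are given here.

```python
# Values quoted in this thread; any attribute not listed here is unknown.
THREAD_DEFAULTS = {
    "_sample_order": (1, 10),
    "_del_range": (1, 10),
    "_min_len": 1,
    "_max_len": 1000,
}


class LegacyObject:
    """Hypothetical stand-in for an object saved by an older library version."""

    def __init__(self):
        self._min_len = 1  # default that only the newer __init__ sets

    @property
    def min_len(self):
        return self._min_len


def backfill_defaults(obj, defaults):
    """Set each attribute in `defaults` only if `obj` does not store it yet.

    Returns the names that were added, so it is visible what was patched.
    """
    added = []
    for name, value in defaults.items():
        if name not in vars(obj):
            setattr(obj, name, value)
            added.append(name)
    return added


# Mimic default unpickling: create the instance without calling __init__,
# then copy in the saved state. Here the saved state lacks _min_len.
legacy = LegacyObject.__new__(LegacyObject)
legacy.__dict__.update({"_max_len": 500})

added = backfill_defaults(legacy, THREAD_DEFAULTS)
print(added)
print(legacy.min_len, legacy._max_len)
```

With the real objects, the equivalent call would be `backfill_defaults(n_gram, THREAD_DEFAULTS)` before `n_gram.merge_table(n_gram2)`. Attributes that already exist are left untouched, which is why `_max_len` keeps its saved value in the sketch.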

@stewu5 Thank you so much. As you said, I may need to regenerate the files in order to avoid these kinds of errors. In fact, an error still occurs after following your suggestion, so I will create new files instead of using the old ones.

Do you still get an error after including those four lines?
Would you mind posting the error?

We are going to check all the old tutorials later. Hopefully we can help you through before the next update is available.

Hi @stewu5 ,

Below is the error information for your reference.

#########################################################
import pickle as pk

with open('c:/pk/ngram_pubchem_ikebata_reO15_O10.obj', 'rb') as f:
    n_gram = pk.load(f)

with open('c:/pk/ngram_pubchem_ikebata_reO15_O11to20.obj', 'rb') as f:
    n_gram2 = pk.load(f)

n_gram._sample_order = (1, 10)
n_gram._del_range = (1, 10)
n_gram._min_len = 1
n_gram._max_len = 1000

n_gram.merge_table(n_gram2)
#############################################################

Traceback (most recent call last):

envs\xenonpy\lib\site-packages\xenonpy\inverse\iqspr\modifier.py", line 116, in reorder_prob
return self._reorder_prob

AttributeError: 'NGram' object has no attribute '_reorder_prob'

@rnaimehaom I plan to update the tutorial and the ngram files by the end of this week. Please try again then. Sorry for the trouble and the wait.

@stewu5 Thank you so much. I look forward to learning your updated tutorials.

@rnaimehaom We have updated the tutorial and the n_gram files. Please go to the samples folder and get the latest iQSPR.ipynb file as the new version of the tutorial. Let me know if you still run into problems.

@stewu5 Hi Stephen, it works very well now. Thanks so much! Sorry for the late response; I have only just finished running iQSPR.ipynb.

I have other questions:

  1. Can iQSPR be used for the inverse design of large [small molecules], e.g. with 60-80 or even more heavy atoms?
  2. Can iQSPR be used for the inverse design of large [small molecules] based on 2D structural information only? My small molecules have very flexible structures, so it is hard to obtain a stable 3D conformation.
  3. Which combination of Python, PyTorch, cudatoolkit, and cuDNN versions is compatible with XenonPy? So far I have only used the CPU version, installed from the cpu.yml file in the XenonPy\conda_env\ folder and tested on Windows 10, and it works very well. If I want to test the GPU version, which yml file should I use? Your website (https://xenonpy.readthedocs.io/en/latest/installation.html#using-conda-and-pip) gives 'conda env update -n xenonpy -f cuda101.yml', but the XenonPy\conda_env folder only includes cuda102.yml and cuda113.yml. Can you help with this? Thanks so much!

@rnaimehaom Good to know that everything works well now.
Regarding your questions:

  1. It is possible, but the current implementation of iQSPR does not store the N-gram table efficiently, so it would be very hard for you to train and store an NGram of high order (e.g., higher than 40). As a result, you may not get an effective generator, because it would not consider long enough substrings. We ran some trials to generate such large molecules, but they sometimes gave weird structures, especially when the training set molecules had overly diverse structures.
  2. iQSPR actually uses only 2D structural information, because SMILES does not contain much 3D information.
  3. For this, I will let another XenonPy developer answer your question. @TsumiNa
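
To illustrate the storage issue in point 1, here is a toy sketch. It is not the XenonPy implementation: it splits SMILES strings into single characters (a crude simplification) and counts how many distinct contexts a table has to keep as the order grows. `build_ngram_table` and the toy SMILES list are made up for this example.

```python
from collections import defaultdict


def build_ngram_table(smiles_list, order):
    """Count next-token frequencies for each context of up to `order - 1` tokens.

    Tokens are single characters plus an end marker, which is only an
    approximation of how SMILES would be tokenized in practice.
    """
    table = defaultdict(lambda: defaultdict(int))
    for smi in smiles_list:
        tokens = list(smi) + ["<end>"]
        for i, tok in enumerate(tokens):
            # The context is the (order - 1) tokens before position i,
            # truncated at the start of the string.
            context = tuple(tokens[max(0, i - order + 1):i])
            table[context][tok] += 1
    return table


toy_smiles = ["CCO", "CCN", "c1ccccc1O", "CC(=O)O", "CCCCO"]

# Number of distinct contexts that must be stored for each order.
sizes = {n: len(build_ngram_table(toy_smiles, n)) for n in (1, 2, 4, 8)}
for n, size in sizes.items():
    print(f"order {n}: {size} contexts")
```

The number of contexts never shrinks as the order increases, because every longer context refines a shorter one. With a large and diverse training set, this is what makes a high-order table expensive to train and store.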

@stewu5 Thanks very much. Hopefully you can develop/expand XenonPy to handle large [small molecules] in the future, which will be very helpful.

@rnaimehaom
For 3: cuda101.yml has been removed because PyTorch dropped support for CUDA 10.1. If you have CUDA 10 on your machine, please confirm that its version is at least 10.2, or I recommend updating CUDA to 11.3. All these .yml files should work with Python 3.7 to 3.9. So if you use Windows with the latest NVIDIA driver, I think cuda113.yml should fit you well.
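
As a rough sketch of the choice described above, the helper below maps a CUDA version to one of the files mentioned in this thread (cpu.yml, cuda102.yml, cuda113.yml). The function names and the exact thresholds are assumptions made for illustration, not part of XenonPy. The last function uses `torch.cuda.is_available()` and `torch.version.cuda` to report what an installed PyTorch build supports.

```python
import sys


def pick_env_file(cuda_version):
    """Suggest a conda_env file for a CUDA version given as (major, minor).

    Pass None when no usable CUDA is present. The thresholds are an
    assumption based on the file names mentioned in this thread.
    """
    if cuda_version is None:
        return "cpu.yml"
    if cuda_version >= (11, 3):
        return "cuda113.yml"
    if cuda_version >= (10, 2):
        return "cuda102.yml"
    return "cpu.yml"  # CUDA older than 10.2: fall back to CPU or update CUDA


def python_in_supported_range(version_info=sys.version_info):
    """Check the Python 3.7 to 3.9 range stated in this thread."""
    return (3, 7) <= tuple(version_info[:2]) <= (3, 9)


def installed_torch_cuda():
    """Report CUDA support of the installed PyTorch, or None if it is absent."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "cuda_available": torch.cuda.is_available(),
        "built_for_cuda": torch.version.cuda,
    }


print(pick_env_file((11, 3)))
print(python_in_supported_range())
```

After creating the environment, calling `installed_torch_cuda()` inside it is a quick way to confirm that the GPU build was actually picked up.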

@TsumiNa Thanks so much for your help. I will give that a try.