Question: How to get the embeddings of a list of SMILES strings

phosseini opened this issue 5 years ago · 5 comments

Is there an easy way to get the embeddings of a set/list of SMILES strings using the pre-trained models? For example, given a list of SMILES such as [smiles_1, smiles_2, ..., smiles_n], how can I get the corresponding embedding vector for each SMILES in the list using the pre-trained models?

Answer 1 · 2019-07-21T02:16:05.000Z

Thanks phosseini!
The pre-trained models take both a drug (a SMILES string converted to a graph representation) and a protein (as a sequence) as input and return the affinity between them. For what you ask, I guess any X2Vec model could help.

Answer 2 · 2019-07-25T03:11:08.000Z

Hi, solid work! I wonder whether GraphDTA can be used for classification tasks such as DTI. Thank you.

Answer 3 · 2019-07-25T14:49:21.000Z

Thanks phosseini!
The pre-trained models take both a drug (a SMILES string converted to a graph representation) and a protein (as a sequence) as input and return the affinity between them. For what you ask, I guess any X2Vec model could help.

Thanks. I think smile_to_graph is what I was looking for.

Answer 4 · 2020-02-26T00:31:03.000Z

Hi, solid work! I wonder whether GraphDTA can be used for classification tasks such as DTI. Thank you.

Sorry for missing this.
Yes, it can. DTI (interaction) is binary: it indicates whether a drug/target pair interacts or not, while DTA (affinity) refers to the strength of that interaction. A tweak to the output of the model would do the task.
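
For anyone wondering what such a tweak could look like, here is a minimal sketch in plain Python. It is not code from the GraphDTA repository. It assumes the model's final scalar output is reinterpreted as a logit, that measured affinities are binarized with a cutoff, and that training uses binary cross-entropy instead of a regression loss. All function names and the threshold value of 7.0 are hypothetical and chosen only for illustration.

```python
import math


def affinity_to_label(affinity, threshold):
    """Binarize a measured affinity into a DTI label (1 = interacts).

    The threshold is an assumption; pick it to suit your dataset.
    """
    return 1 if affinity >= threshold else 0


def sigmoid(logit):
    """Numerically stable logistic function: maps a raw output to (0, 1)."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


def binary_cross_entropy(logit, label):
    """Binary cross-entropy computed directly from the raw logit.

    Uses the stable form max(x, 0) - x*y + log(1 + exp(-|x|)).
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def predict_interaction(logit, cutoff=0.5):
    """Turn the model's raw output into a 0/1 interaction prediction."""
    return 1 if sigmoid(logit) >= cutoff else 0


# Hypothetical affinities and raw model outputs, for illustration only.
affinities = [7.8, 5.2, 7.0]
raw_outputs = [2.3, -1.1, 0.0]

labels = [affinity_to_label(a, threshold=7.0) for a in affinities]
predictions = [predict_interaction(x) for x in raw_outputs]
losses = [binary_cross_entropy(x, y) for x, y in zip(raw_outputs, labels)]
```

In a PyTorch training loop the same idea would usually be expressed by swapping the regression loss for `torch.nn.BCEWithLogitsLoss` and keeping the rest of the network unchanged, but that depends on how the training script is set up.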

Answer 5 · 2023-11-07T18:30:03.000Z

Hi, I made a fork.
All model structures are the same; only the classification task is added.
The GraphDTA models work well on classification tasks, too!
I hope this helps anyone who wants to do classification with GraphDTA.

https://github.com/DBpackage/GraphDTA-DTI.git