theislab/chemCPA

How was lincs_trapnell.smiles generated?

Closed this issue · 1 comments

Shouldn't this come out of lincs_full_smiles_sciplex_genes.h5ad?

We look up the RDKit embedding by matching on the SMILES index, but there are discrepancies between lincs_trapnell.smiles and lincs_full_smiles_sciplex_genes.h5ad, and we don't know where they come from.
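For anyone hitting the same mismatch, one way to see where the two sources diverge is to diff the SMILES sets directly. This is only a minimal sketch: it assumes both lists have already been loaded into Python (say, from the .smiles file and from the SMILES column of the .h5ad), and the function name `diff_smiles` and the sample strings are made up for illustration, not taken from chemCPA.

```python
def diff_smiles(embedding_smiles, dataset_smiles):
    """Return the SMILES strings that appear in only one of the two sources."""
    emb = set(embedding_smiles)
    data = set(dataset_smiles)
    return {
        # In the dataset but with no matching row in the embedding index
        "missing_from_embedding": sorted(data - emb),
        # In the embedding index but never referenced by the dataset
        "unused_in_dataset": sorted(emb - data),
    }


# Toy data (hypothetical): benzene is written in aromatic form on one side
# and in Kekule form on the other.
embedding_smiles = ["CCO", "c1ccccc1", "CC(=O)O"]
dataset_smiles = ["CCO", "C1=CC=CC=C1", "CC(=O)O"]

report = diff_smiles(embedding_smiles, dataset_smiles)
print(report)
```

In the toy data the two benzene strings describe the same molecule but differ as text, so a lookup by exact string match fails unless both sides were canonicalized the same way. That may or may not be what happened with the real files.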

I recreated lincs_trapnell.smiles using the drug_names_to_once_canon_smiles() method in data.py. After that, exp.init_drug_embedding and exp.train ran through and the SMILES lists lined up.
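The idea behind the fix, as I understand it, is to derive the SMILES list from the dataset itself and canonicalize each drug exactly once, so the embedding index and the dataset agree by construction. The sketch below is not the chemCPA implementation of drug_names_to_once_canon_smiles(); the helper `canon_smiles_per_drug`, its arguments, and the toy data are hypothetical, and the `canonicalize` callable defaults to the identity so the example runs without RDKit.

```python
def canon_smiles_per_drug(drug_names, smiles, canonicalize=lambda s: s):
    """Map each drug name to a single canonical SMILES, in first-seen order.

    `canonicalize` is a placeholder; with RDKit installed one could pass
    rdkit.Chem.CanonSmiles here instead of the identity function.
    """
    canon = {}
    for name, smi in zip(drug_names, smiles):
        # Canonicalize only the first SMILES seen for each drug, so every
        # later occurrence of that drug resolves to the same string.
        if name not in canon:
            canon[name] = canonicalize(smi)
    return canon


# Toy per-cell annotations (hypothetical): ethanol appears twice with
# two different spellings of the same molecule.
drug_names = ["ethanol", "benzene", "ethanol"]
smiles = ["CCO", "c1ccccc1", "OCC"]

per_drug = canon_smiles_per_drug(drug_names, smiles)
smiles_list = list(per_drug.values())  # one entry per drug, ready to write out
print(smiles_list)
```

Writing `smiles_list` to the .smiles file and using the same `per_drug` mapping when indexing the dataset keeps the two in sync, which is the property the lookup depends on.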