Problem running the inference using .csv file as input
starwingc commented
Hi, the example CSV file can no longer be used, and I can't figure out how I should do this.
,complex_name,protein_path,protein_sequence,ligand_description
0,5R7Y,data/5R7Y.pdb,None,data/TC5.sdf
1,5R7Z,data/5R7Z.pdb,None,data/KD7.sdf
2,5R84,data/5R84.pdb,None,data/NA0.sdf
3,5REC,data/5REC.pdb,None,data/ME8.sdf
And the error is:
Traceback (most recent call last):
  File "/home/tur54445/work/anaconda3_2023/envs/diffdock-gpu/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/tur54445/work/anaconda3_2023/envs/diffdock-gpu/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/gpfs/work/tur54445/git/DiffDock/inference.py", line 131, in <module>
    test_dataset = InferenceDataset(out_dir=args.out_dir, complex_names=complex_name_list, protein_files=protein_path_list,
  File "/gpfs/work/tur54445/git/DiffDock/utils/inference_utils.py", line 157, in __init__
    s = protein_sequences[i].split(':')
How should I format my CSV so that I can use the PDB and SDF files for inference?
prathithbhargav commented
You'll probably have to add the sequences manually for it to work. As far as I understand, it needs the sequences in the CSV file to compute the ESM embeddings.
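If adding the sequences by hand is tedious, something like the following might help. It is only a sketch, not part of DiffDock: `sequence_from_pdb_text` and `fill_sequences` are hypothetical helper names, and it uses only the Python standard library. It reads the CA atoms from each PDB file's ATOM records, converts the residues to one-letter codes, and joins the chains with `:`. The `:` separator is an assumption based on the `.split(':')` call in the traceback above. Note that a sequence read from ATOM records leaves out any residues that are not resolved in the structure.

```python
import csv

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_from_pdb_text(pdb_text):
    """Return the chain sequences of the first model, joined by ':'.

    Assumption: chains are separated by ':' in the protein_sequence column.
    """
    chains = {}   # chain ID -> list of one-letter codes, in file order
    seen = set()  # (chain ID, residue number + insertion code) already added
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # only use the first model of a multi-model file
        if not line.startswith("ATOM") or len(line) < 27:
            continue
        if line[12:16].strip() != "CA":
            continue  # one CA atom per residue is enough
        chain_id = line[21]
        residue_key = (chain_id, line[22:27])
        if residue_key in seen:
            continue  # skip alternate locations of the same residue
        seen.add(residue_key)
        # Non-standard residue names are written as 'X'.
        code = THREE_TO_ONE.get(line[17:20], "X")
        chains.setdefault(chain_id, []).append(code)
    return ":".join("".join(codes) for codes in chains.values())


def fill_sequences(csv_in, csv_out):
    """Copy csv_in to csv_out, filling empty or 'None' protein_sequence cells."""
    with open(csv_in, newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames
        rows = list(reader)
    for row in rows:
        if row["protein_sequence"] in ("", "None"):
            with open(row["protein_path"]) as pdb:
                row["protein_sequence"] = sequence_from_pdb_text(pdb.read())
    with open(csv_out, "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

You would then call, for example, `fill_sequences("input.csv", "input_with_sequences.csv")` (both file names are placeholders) and pass the new file to the inference script. I have not checked this against the DiffDock code itself, so please look at the output before relying on it.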
starwingc commented
Thank you!