ValueError when loading the DUD-E and scPDB datasets
StefanIsSmart opened this issue · 2 comments
Describe the bug
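Loading the DUD-E and scPDB datasets fails. For DUD-E, processing completes, but saving the result with np.savez raises a ValueError (full traceback below).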
To Reproduce
Steps to reproduce the behavior:
- Run the demo from your website:
from tdc.generation import SBDD
data = SBDD(name='dude')
Expected behavior
Get the data object
Screenshots
Found local copy for 1/2 file...
Found local copy for 2/2 file...
Done!
Processing (this may take long)...
100%|██████████| 102/102 [07:33<00:00, 4.44s/it]
processing done, 0/40490 fails
ValueError Traceback (most recent call last)
/export/disk1/why/database/PL_interaciton_dataset/script/tmp.ipynb Cell 1 in ()
1 from tdc.generation import SBDD
----> 2 data = SBDD(name='dude')
File /export/disk3/why/software/Anaconda/conda/envs/RDKit/lib/python3.8/site-packages/tdc/generation/sbdd.py:44, in SBDD.__init__(self, name, path, print_stats, return_pocket, threshold, remove_protein_Hs, remove_ligand_Hs, keep_het, save)
42 protein, ligand = bi_distribution_dataset_load(name, path, multiple_molecule_dataset_names, return_pocket, threshold, remove_protein_Hs, remove_ligand_Hs, keep_het)
43 if save:
---> 44 np.savez(os.path.join(path, name + '.npz'),
45 protein_coord=protein['coord'],
46 protein_atom=protein['atom_type'],
47 ligand_coord=ligand['coord'],
48 ligand_atom=ligand['atom_type'],
49 )
50 self.save = save
52 self.ligand = ligand
File <__array_function__ internals>:200, in savez(*args, **kwargs)
File /export/disk3/why/software/Anaconda/conda/envs/RDKit/lib/python3.8/site-packages/numpy/lib/npyio.py:615, in savez(file, *args, **kwds)
531 @array_function_dispatch(_savez_dispatcher)
532 def savez(file, *args, **kwds):
533 """Save several arrays into a single file in uncompressed .npz format.
534
535 Provide arrays as keyword arguments to store them under the
(...)
613
614 """
--> 615 _savez(file, args, kwds, False)
File /export/disk3/why/software/Anaconda/conda/envs/RDKit/lib/python3.8/site-packages/numpy/lib/npyio.py:716, in _savez(file, args, kwds, compress, allow_pickle, pickle_kwargs)
714 for key, val in namedict.items():
715 fname = key + '.npy'
--> 716 val = np.asanyarray(val)
717 # always force zip64, gh-10776
718 with zipf.open(fname, 'w', force_zip64=True) as fid:
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (40592,) + inhomogeneous part.
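A note on the error above: the message suggests that at least one of the values passed to np.savez is a sequence of per-complex arrays with differing numbers of atoms, which NumPy then fails to convert to a regular array inside np.asanyarray. The sketch below is not TDC's code and has not been checked against sbdd.py; `coords`, `to_object_array`, and the file name `demo.npz` are hypothetical stand-ins. Under that assumption, it shows one possible workaround: pack the items into an explicit object array before saving, then load with allow_pickle=True.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for protein['coord']: one (n_atoms, 3) array per
# complex, where n_atoms differs from complex to complex.
coords = [np.zeros((5, 3)), np.zeros((8, 3)), np.zeros((3, 3))]


def to_object_array(items):
    """Pack variable-length items into a 1-D object array, one item per slot."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item
    return out


# Explicit object array: NumPy no longer has to guess a common shape.
ragged = to_object_array(coords)

path = os.path.join(tempfile.mkdtemp(), "demo.npz")
np.savez(path, protein_coord=ragged)

# Object arrays are stored via pickle, so loading needs allow_pickle=True.
loaded = np.load(path, allow_pickle=True)["protein_coord"]
shapes = [a.shape for a in loaded]
```

Whether this is the right fix depends on how the saved file is meant to be read back; object arrays rely on pickle, so padding to a fixed size or storing a flat array plus offsets may be preferable in the library itself.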
Environment:
- OS: Linux
- Python version: 3.8.13
- TDC version: 0.3.8
- Any other relevant information: None
Additional context
None.
Thanks for raising this issue! @yuanqidu could you help take a look - thanks!