OpenBioSim/biosimspace

[BUG] Unable to save Mol2 files through BioSimSpace

lohedges opened this issue · 6 comments

For some reason I am no longer able to save mol2 format files through BioSimSpace. Saving directly via Sire works. For example:

In [1]: import BioSimSpace as BSS

INFO:numexpr.utils:Note: NumExpr detected 20 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
INFO:numexpr.utils:NumExpr defaulting to 8 threads.

In [2]: from sire.legacy.IO import Mol2

In [3]: s = BSS.IO.readMolecules(BSS.IO.expand(BSS.tutorialUrl(), ["ala.top", "ala.crd"]))
Downloading from 'https://biosimspace.openbiosim.org/m/ala.top'...
Unzipping '/tmp/tmp0hp57dw0/ala.top.bz2'...
Downloading from 'https://biosimspace.openbiosim.org/m/ala.crd'...
Unzipping '/tmp/tmp0hp57dw0/ala.crd.bz2'...

In [4]: BSS.setVerbose(True)

In [5]: BSS.IO.saveMolecules("test", s[0], "mol2")
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/lester/Code/openbiosim/biosimspace/python/BioSimSpace/IO/_io.py:841 in saveMolecules       │
│                                                                                                  │
│    838 │   │   │   │   _os.rename(file, new_file)                                                │
│    839 │   │   │   │   file = [new_file]                                                         │
│    840 │   │   │   else:                                                                         │
│ ❱  841 │   │   │   │   file = _SireIO.MoleculeParser.save(                                       │
│    842 │   │   │   │   │   system._sire_object, filebase, _property_map                          │
│    843 │   │   │   │   )                                                                         │
│    844                                                                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: SireError::invalid_cast: Cannot cast from an object of class "SireBase::GeneralUnitProperty" to an object of class
"SireIO::MoleculeParser". (call sire.error.get_last_error_details() for more info)

The above exception was the direct cause of the following exception:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:1                                                                                    │
│                                                                                                  │
│ /home/lester/Code/openbiosim/biosimspace/python/BioSimSpace/IO/_io.py:856 in saveMolecules       │
│                                                                                                  │
│    853 │   │   except Exception as e:                                                            │
│    854 │   │   │   msg = "Failed to save system to format: '%s'" % format                        │
│    855 │   │   │   if _isVerbose():                                                              │
│ ❱  856 │   │   │   │   raise IOError(msg) from e                                                 │
│    857 │   │   │   else:                                                                         │
│    858 │   │   │   │   raise IOError(msg) from None                                              │
│    859                                                                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
OSError: Failed to save system to format: 'mol2'

In [6]: mol2 = Mol2(s[0].toSystem()._sire_object)

In [7]: mol2.write_to_file("test.mol2")
Out[7]: ['test.mol2']

This appears to be because the filebase passed to MoleculeParser.save is missing the .mol2 extension.

I see the issue. The filebase is passed without the extension for all formats, so that isn't the cause. We actually specify the format via the property_map option to MoleculeParser.save and (for legacy reasons) use sire.legacy.Base.wrap to pass dictionary values. This fails for mol2 because wrap tries to convert the string "mol2" to a general unit rather than leaving it as a string, i.e.:

In [1]: import sire as sr

In [2]: from sire.legacy.Base import wrap

In [3]: from sire.legacy.IO import MoleculeParser

In [4]: mols = sr.load_test_files("ala.crd", "ala.top")

In [5]: MoleculeParser.save(mols._system, "test", {"fileformat": "mol2"})
Out[5]: ['/home/lester/Code/openbiosim/biosimspace/demo/test.mol2']

In [6]: MoleculeParser.save(mols._system, "test", {"fileformat": wrap("mol2")})
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:1                                                                                    │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: SireError::invalid_cast: Cannot cast from an object of class "SireBase::GeneralUnitProperty" to an object of class
"SireIO::MoleculeParser". (call sire.error.get_last_error_details() for more info)

I'll try removing wrap as it looks like it's no longer needed for string types.
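
A minimal sketch of that change, assuming the property map can mix plain Python strings with wrapped values (the plain-string case is what the session above shows working). The helper name `build_property_map` and its `wrap` parameter are hypothetical, not BioSimSpace's actual code; `wrap` defaults to the identity function so the sketch runs without Sire installed:

```python
from typing import Any, Callable, Dict


def build_property_map(
    user_map: Dict[str, Any],
    fileformat: str,
    wrap: Callable[[Any], Any] = lambda value: value,
) -> Dict[str, Any]:
    """Hypothetical helper: copy user_map, leaving string values unwrapped.

    Only non-string values go through `wrap`, so a format name such as
    "mol2" is never handed to any string-to-unit conversion.
    """
    result = {
        key: value if isinstance(value, str) else wrap(value)
        for key, value in user_map.items()
    }
    # The format name is always stored as a plain string.
    result["fileformat"] = fileformat
    return result


if __name__ == "__main__":
    print(build_property_map({"coordinates": "coordinates"}, "mol2"))
```

In real use the `wrap` argument would presumably be sire.legacy.Base.wrap, but whether non-string values still need wrapping at all is an assumption to verify against the Sire version in use.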

I'm also amazed that we've not come across this before.

Wow - that is obscure. I'll need to think about how the automatic string-to-unit conversion could skip "mol2" in this case (as the context is clearly not moles squared...)
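
To illustrate the ambiguity, here is a toy model of a wrapper that tries a unit parse before falling back to a string. This is not Sire's parser: the grammar, the `_KNOWN_UNITS` set, and the `allow_units` switch are all assumptions made for the sketch. It shows why "mol2" gets caught while a name like "pdb" does not, and one way a caller could opt out of the conversion:

```python
import re
from typing import Any, Optional, Tuple

# Toy unit grammar (an assumption): a known symbol plus an optional integer power.
_KNOWN_UNITS = {"mol", "nm", "ps", "kcal"}
_UNIT_RE = re.compile(r"^([A-Za-z]+)(-?\d+)?$")


def parse_unit(text: str) -> Optional[Tuple[str, int]]:
    """Return (symbol, power) if text looks like a unit, otherwise None."""
    match = _UNIT_RE.match(text)
    if match is None or match.group(1) not in _KNOWN_UNITS:
        return None
    return match.group(1), int(match.group(2) or 1)


def toy_wrap(value: Any, allow_units: bool = True) -> Tuple[str, Any]:
    """Tag a value as a unit, a string, or something else.

    With allow_units=False, strings are never reinterpreted as units.
    """
    if isinstance(value, str):
        if allow_units:
            unit = parse_unit(value)
            if unit is not None:
                return ("unit", unit)
        return ("string", value)
    return ("other", value)


if __name__ == "__main__":
    print(toy_wrap("mol2"))                     # parsed as mol to the power 2
    print(toy_wrap("mol2", allow_units=False))  # kept as a plain string
    print(toy_wrap("pdb"))                      # not a known unit symbol
```

An explicit opt-out like `allow_units` keeps the decision with the caller, who knows the context; guessing from the key name ("fileformat") would be another option.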

Do you see the same thing with "sr.save()"?

No, I think sr.save is okay (will double check tomorrow). It took me a while to figure out what was going on since I'd completely forgotten that I'd hacked the format via the property map, rather than specifying via an extension.
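
For comparison, selecting the format through the file name rather than the property map might look like the sketch below. `filename_for_format` is a hypothetical helper, not part of BioSimSpace or Sire; it only builds the name, and it assumes the writer infers the format from the extension:

```python
def filename_for_format(filebase: str, fileformat: str) -> str:
    """Hypothetical helper: append the format as an extension if it is missing."""
    suffix = "." + fileformat.lstrip(".").lower()
    # Leave the name alone when it already ends with the right extension.
    if filebase.lower().endswith(suffix):
        return filebase
    # Append rather than replace, so a base such as "my.system" is preserved.
    return filebase + suffix


if __name__ == "__main__":
    print(filename_for_format("test", "mol2"))
```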

Just to confirm: sire.save is fine. The issue occurs only when sire.legacy.Base.wrap is applied to string properties in the map, i.e.:

In [1]: import sire as sr

In [2]: mols = sr.load_test_files("ala.crd", "ala.top")

In [3]: sr.save(mols, "test.mol2")
Out[3]: ['/home/lester/Code/openbiosim/biosimspace/demo/test.mol2']

In [4]: sr.save(mols, "test", map={"fileformat": "mol2"})
Out[4]: ['/home/lester/Code/openbiosim/biosimspace/demo/test.mol2']

In [5]: sr.save(mols, "test", map={"fileformat": sr.legacy.Base.wrap("mol2")})
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:1                                                                                    │
│                                                                                                  │
│ /home/lester/.conda/envs/openbiosim/lib/python3.12/site-packages/sire/_load.py:636 in save       │
│                                                                                                  │
│   633 │                                                                                          │
│   634 │   molecules = _to_legacy_system(molecules)                                               │
│   635 │                                                                                          │
│ ❱ 636 │   return MoleculeParser.save(molecules, filename, map=map)                               │
│   637                                                                                            │
│   638                                                                                            │
│   639 def load_test_files(files: _Union[_List[str], str], *args, map=None):                      │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: SireError::invalid_cast: Cannot cast from an object of class "SireBase::GeneralUnitProperty" to an object of class
"SireIO::MoleculeParser". (call sire.error.get_last_error_details() for more info)