Stream support for exporting pdbs not working with OTHERS record
Describe the bug
When trying to export pdb data with ATOM and OTHERS entries using .to_pdb_stream, I always get a pandas.errors.IntCastingNaNError (cf. Steps/Code to Reproduce).
As I need to maintain the TER markers in the resulting pdb data, the content of the OTHERS frame is necessary.
When writing directly to a pdb file with .to_pdb, there is no such issue. A possible fix could be a shared base function used by both methods, or a parameter in to_pdb that specifies the desired output (i.e. file or stream), as mentioned in #108.
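To illustrate the shared-function idea, here is a minimal sketch in plain Python. The names _format_records, to_stream and to_file are hypothetical and are not part of the biopandas API; the point is only that the file and stream paths would go through one formatting routine, so they could not diverge in how they treat OTHERS rows.

```python
import io


def _format_records(lines):
    """Hypothetical shared helper: join already formatted record lines into PDB text."""
    return "".join(f"{line}\n" for line in lines)


def to_stream(lines):
    """Hypothetical stream variant: return the text wrapped in a StringIO."""
    return io.StringIO(_format_records(lines))


def to_file(lines, path):
    """Hypothetical file variant: write the same text to disk."""
    with open(path, "w") as handle:
        handle.write(_format_records(lines))
```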
Steps/Code to Reproduce
Example:
from biopandas.pdb import PandasPdb
pdb_df = PandasPdb().fetch_pdb('1ou5')
out_string = pdb_df.to_pdb_stream(records=('ATOM', 'OTHERS'))
Expected Results
Stream containing the specified records in pdb format.
Actual Results
A pandas.errors.IntCastingNaNError stemming from line 909 in pandas_pdb.py:
df.residue_number = df.residue_number.astype(int)
This line is executed on the entire concatenated DataFrame.
As the OTHERS frame doesn't contain residue number entries, these cells are always NaN after concatenating.
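The failure can be reproduced with pandas alone, without biopandas or a network fetch. The sketch below uses toy stand-ins for the ATOM and OTHERS frames (the column values are made up for illustration) and then shows one possible workaround, the nullable Int64 dtype. That workaround is an assumption on my side, not necessarily how the library should fix it.

```python
import pandas as pd

# Toy stand-ins for the ATOM and OTHERS frames (hypothetical data).
atom = pd.DataFrame(
    {"record_name": ["ATOM", "ATOM"], "residue_number": [1, 2], "line_idx": [0, 1]}
)
others = pd.DataFrame({"record_name": ["TER"], "entry": ["placeholder"], "line_idx": [2]})

# OTHERS has no residue_number column, so its row gets NaN after concatenation
# and the column is upcast to float64.
df = pd.concat([atom, others], ignore_index=True, sort=False)

try:
    df["residue_number"].astype(int)
    cast_failed = False
except ValueError:
    # pandas.errors.IntCastingNaNError is a subclass of ValueError
    cast_failed = True

# Possible workaround: the nullable integer dtype keeps the missing value as <NA>.
safe = df["residue_number"].astype("Int64")
```

Here cast_failed ends up True, while safe holds integer residue numbers for the ATOM rows and a missing value for the TER row.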
Versions
biopandas 0.5.0dev
Linux-5.4.0-91-generic-x86_64-with-glibc2.31
Python 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0]
Scikit-learn 1.3.0
NumPy 1.23.5
SciPy 1.11.1