BioPandas/biopandas

Merging HETATM and ATOM entries into one DataFrame

rasbt opened this issue · 0 comments

rasbt commented

Following up on the comment by @wojdyr in #52

by the way, having atoms in two separate frames is rather not a good idea.

At first glance it may look like the protein chains are all ATOM, but wwPDB uses a different criterion:

only natural amino-acids (and nucleic acids) are marked as ATOM, and the modified ones are marked as HETATM.
So MET is ATOM but MSE is HETATM.

If you keep them separate, an example such as:

  ppdb.df['ATOM']['b_factor'].plot(kind='hist')

won't work as expected - it may skip some residues

That's a good point, and I hadn't thought of that! The reason I kept these separate is that I mostly work on cases where HETATM records refer to non-protein residues. The HETATM/MSE issue should definitely be addressed somehow, and I will have to think about it more. Suggestions are welcome.
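
One possible direction, as a rough sketch only: stack the two frames with plain pandas and restore the original file order. The toy frames below stand in for `ppdb.df['ATOM']` and `ppdb.df['HETATM']` and their values are made up. `merge_records` and `MODIFIED_RESIDUES` are hypothetical names, not part of BioPandas, and the sketch assumes both frames carry a `line_idx` column recording each record's position in the file.

```python
import pandas as pd

# Toy stand-ins for ppdb.df['ATOM'] and ppdb.df['HETATM'] (made-up values).
atom = pd.DataFrame({
    "record_name": ["ATOM", "ATOM"],
    "residue_name": ["MET", "ALA"],
    "residue_number": [1, 3],
    "b_factor": [20.0, 25.0],
    "line_idx": [0, 2],
})
hetatm = pd.DataFrame({
    "record_name": ["HETATM", "HETATM"],
    "residue_name": ["MSE", "HOH"],
    "residue_number": [2, 101],
    "b_factor": [30.0, 40.0],
    "line_idx": [1, 3],
})


def merge_records(atom_df, hetatm_df):
    """Hypothetical helper: stack both frames and restore file order."""
    merged = pd.concat([atom_df, hetatm_df], ignore_index=True)
    # Assumption: 'line_idx' holds the record's position in the PDB file.
    return merged.sort_values("line_idx").reset_index(drop=True)


merged = merge_records(atom, hetatm)

# Assumption: modified residues that belong to the chain are listed by hand.
MODIFIED_RESIDUES = {"MSE"}

# Keep standard residues plus the listed modified ones; drop water, ligands.
polymer = merged[
    (merged["record_name"] == "ATOM")
    | merged["residue_name"].isin(MODIFIED_RESIDUES)
]
```

With a merged frame, something like `polymer['b_factor'].plot(kind='hist')` would include the MSE residue that an ATOM-only frame skips. A hand-written residue list is only a placeholder; how to decide which HETATM records belong to the chain is still the open question here.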