
Spec-compliant PDB(x)/mmCIF read/write library

Primary language: Rust. License: Apache License 2.0 (Apache-2.0).

PDBMol

Josh Mitchell 2024-


PDB files are a ubiquitous format for sharing biomolecular structure information. They are used by both experimental and computational scientists for distribution, cross-software interchange, and storage. As the fixed-width limitations of PDBs make extending them increasingly problematic, PDBx/mmCIF is often recommended as a successor format, though it has not yet achieved the same popularity. Unfortunately, most software that parses or writes PDBs is hand-rolled and often produces subtly different and sometimes mutually incompatible files:

  • Parsers typically support only a small selection of known residue types, rather than the entire Chemical Component Dictionary (CCD).
  • Writers often use custom or even arbitrarily generated atom names, which breaks bond inference.
  • Parsers typically infer bonds from atomic positions rather than from atom names, residue names, and CONECT records, resulting in incorrect bonding in strained conformations.
  • Parsers disagree on atom records that share the same identifiers: some treat them as duplicates, others as separate atoms.
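As an illustration of why hand-rolled parsing goes wrong, here is a minimal sketch of column-based parsing for a single ATOM record. It is not PDBMol's API: `AtomRecord`, `field`, `parse_atom`, and `example_line` are hypothetical names chosen for this example, and only a subset of the record's fields is read. The column positions are those of the PDB format's ATOM/HETATM record. Because the fields are fixed-width, adjacent coordinates may run together with no separating space, so splitting the line on whitespace is not a reliable substitute.

```rust
/// Subset of an ATOM/HETATM record (hypothetical type, not PDBMol's API).
#[derive(Debug, PartialEq)]
struct AtomRecord {
    serial: u32,
    name: String,
    res_name: String,
    chain_id: char,
    res_seq: i32,
    x: f64,
    y: f64,
    z: f64,
}

/// Return the trimmed text in 1-based, inclusive columns `start..=end`,
/// or `None` if the line is too short.
fn field(line: &str, start: usize, end: usize) -> Option<&str> {
    line.get(start - 1..end).map(str::trim)
}

/// Parse one ATOM/HETATM line by column position.
/// Note: trimming the atom name discards its column alignment, which a
/// complete implementation would likely need to keep.
fn parse_atom(line: &str) -> Option<AtomRecord> {
    let record = field(line, 1, 6)?;
    if record != "ATOM" && record != "HETATM" {
        return None;
    }
    Some(AtomRecord {
        serial: field(line, 7, 11)?.parse().ok()?,
        name: field(line, 13, 16)?.to_string(),
        res_name: field(line, 18, 20)?.to_string(),
        chain_id: line.get(21..22)?.chars().next()?,
        res_seq: field(line, 23, 26)?.parse().ok()?,
        x: field(line, 31, 38)?.parse().ok()?,
        y: field(line, 39, 46)?.parse().ok()?,
        z: field(line, 47, 54)?.parse().ok()?,
    })
}

/// Build a sample record whose three 8-column coordinates abut each other.
fn example_line() -> String {
    format!(
        "{:<6}{:>5} {:<4}{}{:>3} {}{:>4}{}   {:>8.3}{:>8.3}{:>8.3}",
        "ATOM", 1, " CA", ' ', "ALA", 'A', 1, ' ', -100.123, -200.456, -300.789
    )
}
```

For the line built by `example_line`, `parse_atom` recovers the three coordinates separately, whereas splitting the same line on whitespace yields one fused token for all three.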

The objective of PDBMol is to provide a spec-compliant PDB(x)/mmCIF reader/writer library that can be adopted by other projects to provide consistent PDB file handling. A secondary objective is to provide compatibility with common existing PDB dialects. To those ends, our specific feature goals are:

  • To provide a spec-compliant PDB parser
  • To provide a spec-compliant CIF parser
  • To provide a spec-compliant PDBx/mmCIF parser
  • To interpret all standard residue and atom names according to the CCD
  • To provide APIs to process PDB(x)/mmCIF files at the record level
  • To provide APIs to process PDB(x)/mmCIF files at the molecular graph level (with coordinates and bonds)
  • To provide clear, descriptive errors when a PDB(x)/mmCIF file does not comply with the spec
  • To provide limited configuration support for alternate PDB dialects when requested
  • To provide Rust, Python, and C bindings for the above
  • 🦀⚡🦀 Blazingly Fast 🦀⚡🦀

Non-goals include:

  • Automagic loading of noncompliant PDB files
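To make the goal of descriptive errors (and the non-goal of guessing at noncompliant input) concrete, here is a minimal sketch of strict field parsing that reports what went wrong and where. It is an assumption about what such errors could look like, not PDBMol's actual error type: `RecordError` and `parse_real` are hypothetical names, and the sketch assumes ASCII input so that byte offsets and columns coincide.

```rust
use std::fmt;

/// Hypothetical error type for this sketch (not PDBMol's API).
#[derive(Debug, PartialEq)]
enum RecordError {
    /// The line ends before the field's last column.
    LineTooShort { line_no: usize, field: &'static str, columns: (usize, usize), found: usize },
    /// The field's columns do not contain a valid real number.
    InvalidNumber { line_no: usize, field: &'static str, columns: (usize, usize), text: String },
}

impl fmt::Display for RecordError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RecordError::LineTooShort { line_no, field, columns, found } => write!(
                f,
                "line {line_no}: field `{field}` needs columns {}-{}, but the record is only {found} columns wide",
                columns.0, columns.1
            ),
            RecordError::InvalidNumber { line_no, field, columns, text } => write!(
                f,
                "line {line_no}: field `{field}` (columns {}-{}) is not a valid number: {text:?}",
                columns.0, columns.1
            ),
        }
    }
}

impl std::error::Error for RecordError {}

/// Parse a real number from 1-based, inclusive columns `start..=end`.
/// The input is rejected rather than repaired when it does not fit.
fn parse_real(
    line_no: usize,
    line: &str,
    field: &'static str,
    start: usize,
    end: usize,
) -> Result<f64, RecordError> {
    let raw = line.get(start - 1..end).ok_or(RecordError::LineTooShort {
        line_no,
        field,
        columns: (start, end),
        found: line.len(),
    })?;
    raw.trim().parse().map_err(|_| RecordError::InvalidNumber {
        line_no,
        field,
        columns: (start, end),
        text: raw.trim().to_string(),
    })
}
```

Calling `parse_real(7, line, "x", 31, 38)` on a record whose x field contains a stray letter returns an `InvalidNumber` error naming the line, the field, and its columns, instead of silently substituting a value.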

Installation

Cargo

  • Install the Rust toolchain, which includes cargo, by following this guide.
  • Run cargo install pdbmol

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

See CONTRIBUTING.md.