The PMP format for the Delaney logS set contains the following information:
- SMILES and experimental logS for 1128 molecules
- HF/6-31G* optimized structure
- HF/6-31G* based RESP charge
- Molecular Electrostatic Solvation Energy and Atomic Electrostatic Solvation Energy (MIBPB, RESP charge, mbondi radius)
- Dipole and Quadrupole based on GDMA analysis (H radius at default 0.325)
- Molecular Area/Volume (0.2 grid size, 1.4 probe radius)
- Atomic Area
The features currently included are:
- SMILES (s) : original SMILES from the source data
- ExpLogS (f) : original experimental logS value
- MolVolume (f) : Molecular Volume
- MolArea (f) : Molecular Surface Area
- ElecSolvEng (f) : Electrostatic Solvation Energy based on MIBPB5
- Dipole (f) : Molecular Dipole from GDMA result
- Quadrupole (f) : Molecular Quadrupole from GDMA result
- Area_* (f) : Surface Area of given element in molecule
- AtomArea (10f) : Atomic surface area
- AtomSolEng (10f) : Atomic Solvation Energy
- Dipole (10f) : Atomic dipole
- Quadrupole (10f) : Atomic quadrupole
- resp (10f) : RESP atomic charge
- mbondi (10f) : MBONDI atomic radius
- AT_gaff (6s) : Atom type based on GAFF definition
- AT_sybyl (6s) : Atom type based on SYBYL definition
import PMPformat as PMP
- Open PMP molecule file
pmpf = PMP.PMPFormator(pmpfile="Data/Delaney_1.pmp")
- Get the molecule.
You can use this function in place of "MolFromSmiles" to load the molecule.
mol = pmpf.GetMol()
- Get All Properties/Features on the Molecular Level
mol_prop = mol.GetPropsAsDict()
print(mol_prop)
- Get a certain feature by name.
Recommended features are ExpLogS, MolArea, MolVolume, ElecSolvEng, Dipole, and so on. I don't recommend using the element-based areas, such as Area_F, Area_S, etc., but you can also try them.
mol_prop['MolArea']
An alternative is to use the GetProp, GetIntProp, or GetDoubleProp methods to obtain a property when you know its exact data type.
mol_area = mol.GetDoubleProp('MolArea')
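As a minimal sketch of the recommendation above, the molecule-level properties can be filtered into a fixed-order feature list while skipping the element-based Area_* entries. The helper name select_mol_features, the chosen feature order, and the placeholder values in the example dict are assumptions for illustration only; in practice the dict would come from mol.GetPropsAsDict().

```python
# Hypothetical helper, not part of PMPformat: pick molecule-level
# features out of a property dict in a fixed order.
RECOMMENDED = ["MolVolume", "MolArea", "ElecSolvEng", "Dipole", "Quadrupole"]


def select_mol_features(props, names=RECOMMENDED):
    """Return the requested properties as floats, skipping Area_* names."""
    features = []
    for name in names:
        if name.startswith("Area_"):
            continue  # element-based areas are not recommended
        features.append(float(props[name]))
    return features


# Placeholder values only (not real data); a real dict would come
# from mol.GetPropsAsDict().
example_props = {
    "SMILES": "C",
    "ExpLogS": 0.0,
    "MolVolume": 1.0,
    "MolArea": 2.0,
    "ElecSolvEng": 3.0,
    "Dipole": 4.0,
    "Quadrupole": 5.0,
    "Area_F": 6.0,
}

mol_features = select_mol_features(example_props)
```

Keeping the feature order in one list makes it easier to build consistent input rows across molecules.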
- Get Property/Feature on the Atom Level
The atom types AT_gaff and AT_sybyl are string-based data and can, to some extent, be represented by NGF. If you want to use them, you can map each type to an int value.
atom0 = mol.GetAtomWithIdx(0)
atom_prop = atom0.GetPropsAsDict()
print(atom_prop)
atom_prop['AtomArea']
## or use the type-specific methods
features = [
    atom0.GetProp('AT_gaff'), atom0.GetDoubleProp('AtomArea')
]
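One possible way to represent the string atom types as int values is to build a small lookup table in first-seen order. This is only a sketch: build_type_index and encode_types are hypothetical helpers, not part of PMPformat, and the example strings stand in for values that would be read with atom.GetProp('AT_gaff') for each atom. Stripping whitespace assumes the stored strings may carry padding.

```python
# Hypothetical helpers, not part of PMPformat: map string atom types
# (e.g. AT_gaff or AT_sybyl values) to integer codes.
def build_type_index(atom_types):
    """Assign each distinct atom-type string an integer in first-seen order."""
    index = {}
    for atom_type in atom_types:
        key = atom_type.strip()  # assumption: values may be space-padded
        if key not in index:
            index[key] = len(index)
    return index


def encode_types(atom_types, index):
    """Translate a sequence of atom-type strings into their integer codes."""
    return [index[atom_type.strip()] for atom_type in atom_types]


# Example strings only; real ones would come from the atoms of a molecule.
gaff_types = ["c3", "c3", "oh", "hc", "ho"]
type_index = build_type_index(gaff_types)
codes = encode_types(gaff_types, type_index)
```

To keep the codes consistent across the whole data set, the index should be built once over all molecules and then reused for encoding.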