GPTFF (Graph-based Pretrained Transformer Force Field) can simulate arbitrary inorganic systems with good precision and generalizability.
Optionally, use conda to create a new Python virtual environment:
conda create -n gptff python=3.8
Then clone the GPTFF repo and install it:
git clone https://github.com/atomly-materials-research-lab/GPTFF.git
cd GPTFF
pip install .
Fast energy (eV), force (eV/Å), and stress (GPa) calculation:
from gptff.model.mpredict import ASECalculator
from pymatgen.core import Structure
from pymatgen.io.ase import AseAtomsAdaptor
model_weight = "pretrained/gptff_v1.pth"
device = 'cuda' # or cpu
p = ASECalculator(model_weight, device) # Initialize the model and load weights
adp = AseAtomsAdaptor()
struc = Structure.from_file('POSCAR_structure')
atoms = adp.get_atoms(struc)
atoms.set_calculator(p)
energy = atoms.get_potential_energy() # unit (eV)
forces = atoms.get_forces() # unit (eV/Å)
stress = atoms.get_stress() # unit (GPa)
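The stress above is quoted in GPa, while the training dataset described later in this README stores stress in kBar. When comparing the two, a unit conversion along these lines may help. This is a minimal sketch: `gpa_to_kbar` and `kbar_to_gpa` are hypothetical helper names (not part of GPTFF), the stress values are made up, and sign conventions between codes are not handled.

```python
# Hypothetical helpers (not part of GPTFF): convert stress between GPa and kBar.
KBAR_PER_GPA = 10.0  # 1 GPa = 10 kBar


def gpa_to_kbar(stress_gpa):
    """Convert a sequence of stress components from GPa to kBar."""
    return [s * KBAR_PER_GPA for s in stress_gpa]


def kbar_to_gpa(stress_kbar):
    """Convert a sequence of stress components from kBar to GPa."""
    return [s / KBAR_PER_GPA for s in stress_kbar]


# Made-up stress in Voigt order (xx, yy, zz, yz, xz, xy), in GPa.
stress_gpa = [1.5, 1.5, 2.0, 0.0, 0.0, -0.25]
stress_kbar = gpa_to_kbar(stress_gpa)
print(stress_kbar)
```

Only the magnitude is rescaled here; check the sign convention of your reference data separately before mixing sources.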
Structure Optimization:
Variable-cell relaxation (lattice vectors and atomic positions are both optimized):
from gptff.model.mpredict import ASECalculator
from pymatgen.core import Structure
from pymatgen.io.ase import AseAtomsAdaptor
from ase.optimize.fire import FIRE
from ase.constraints import ExpCellFilter, StrainFilter
model_weight = "pretrained/gptff_v1.pth"
device = 'cuda' # or cpu
p = ASECalculator(model_weight, device) # Initialize the model and load weights
struc = Structure.from_file('POSCAR_structure') # Read structure
adp = AseAtomsAdaptor()
atoms = adp.get_atoms(struc)
atoms.set_calculator(p)
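cell_filter = ExpCellFilter(atoms) # Expose the cell degrees of freedom to the optimizer
FIRE(cell_filter).run(fmax=0.01, steps=100) # Stop when forces fall below 0.01 eV/Å or after 100 steps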
Fixed-cell relaxation (lattice vectors are kept fixed; only atomic positions are optimized):
from gptff.model.mpredict import ASECalculator
from pymatgen.core import Structure
from pymatgen.io.ase import AseAtomsAdaptor
from ase.optimize.fire import FIRE
from ase.optimize.bfgs import BFGS
model_weight = "pretrained/gptff_v1.pth"
device = 'cuda' # or cpu
p = ASECalculator(model_weight, device) # Initialize the model and load weights
struc = Structure.from_file('POSCAR_structure') # Read structure
adp = AseAtomsAdaptor()
atoms = adp.get_atoms(struc)
atoms.set_calculator(p)
optimizer = BFGS(atoms)
optimizer.run(fmax=0.01, steps=1000)
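In both relaxations, `fmax` is the force convergence threshold: ASE's standard optimizers stop once the largest per-atom force norm falls below `fmax` (eV/Å), or when the step limit is reached. The standalone sketch below illustrates that criterion with made-up forces. `max_force_norm` and `is_converged` are hypothetical helper names, not ASE or GPTFF functions; with a cell filter, the vectors seen by the optimizer also include stress-derived components, which this sketch ignores.

```python
import math


def max_force_norm(forces):
    """Return the largest Euclidean norm among per-atom force vectors (eV/Å)."""
    return max(math.sqrt(fx * fx + fy * fy + fz * fz) for fx, fy, fz in forces)


def is_converged(forces, fmax=0.01):
    """Mimic the force criterion: converged when every force norm is below fmax."""
    return max_force_norm(forces) < fmax


# Made-up forces for a three-atom structure, in eV/Å.
forces = [
    (0.003, -0.004, 0.000),
    (0.000, 0.012, -0.016),
    (0.001, 0.001, 0.001),
]
print(max_force_norm(forces))
print(is_converged(forces))             # tight threshold, as in the examples above
print(is_converged(forces, fmax=0.05))  # looser threshold
```

A tighter `fmax` generally needs more steps, so the `steps` limit may have to be raised together with it.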
Molecular dynamics (ASE):
LAMMPS support for GPTFF will be added later.
from gptff.model.mpredict import ASECalculator
from pymatgen.core import Structure
from pymatgen.io.ase import AseAtomsAdaptor
from ase import Atoms, units
from ase.md.nvtberendsen import NVTBerendsen
import os
model_weight = "pretrained/gptff_v1.pth"
device = 'cuda' # or cpu
p = ASECalculator(model_weight, device) # Initialize the model and load weights
struc = Structure.from_file('POSCAR_structure') # Read structure
adp = AseAtomsAdaptor()
atoms = adp.get_atoms(struc)
atoms.set_calculator(p)
save_dir = './results_path'
os.makedirs(save_dir, exist_ok=True)
temp = 430 # unit (K)
dyn = NVTBerendsen(atoms=atoms,
                   timestep=2 * units.fs,
                   temperature=temp,  # unit (K)
                   taut=200 * units.fs,
                   loginterval=20,  # Save MD information and trajectory every 20 steps
                   logfile=os.path.join(save_dir, 'output.txt'),  # MD log file
                   trajectory=os.path.join(save_dir, f'Li3PO4_nvt_out_{temp}K.trj'),  # Trajectory file
                   append_trajectory=True)
dyn.run(100000)
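For orientation, the simulated time and logging volume implied by these settings follow from simple arithmetic. The sketch below only restates numbers from the example above (2 fs timestep, 200 fs thermostat time constant, 100000 steps, logging every 20 steps); the variable names are illustrative.

```python
timestep_fs = 2      # timestep used above, in femtoseconds
taut_fs = 200        # Berendsen time constant used above, in femtoseconds
n_steps = 100000     # argument passed to dyn.run
log_interval = 20    # loginterval used above

total_time_ps = n_steps * timestep_fs / 1000  # 1 ps = 1000 fs
logging_intervals = n_steps // log_interval   # number of logging intervals in the run
taut_in_steps = taut_fs // timestep_fs        # thermostat time constant in MD steps

print(f"simulated time: {total_time_ps} ps")
print(f"logging intervals: {logging_intervals}")
print(f"thermostat time constant: {taut_in_steps} steps")
```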
The training parameters are set in config.json, where you can also specify the data path. To start training, run:
gptff_trainer config.json
To pretrain or fine-tune the force field on your own dataset, prepare the data as follows.
The dataset must be stored as a .csv file with the following columns:
- struct_id: Unique structure id, e.g. 0, 1, 2, ...
- energy: Total energy of the structure (eV)
- forces: The forces on each atom (eV/Å)
- stress: The stress of the structure (kBar, matching the VASP stress output directly)
- structure: The structure in dict format, for example:
from pymatgen.core import Structure
struc = Structure.from_file('POSCAR')
struc_data = struc.as_dict()
- fold: Specifies which data are used for training and which for validation. If the validation fold in config.json (val_fold) is set to 0, the rows with fold != 0 form the training set and the rows with fold == 0 form the validation set.
- ref_energy: Reference energy of the structure. For example, if the formula of the structure is Li2O4, its ref_energy is atom_refs[3] * 2 + atom_refs[8] * 4, where 3 and 8 are the atomic numbers of Li and O, and 2 and 4 are the numbers of Li and O atoms in the structure.
For the pretrained model, atom_refs is:
atom_refs = np.array([
0.00000000e+00, -3.46535853e+00, -7.56101906e-01, -3.46224791e+00,
-4.77600176e+00, -8.03619240e+00, -8.40374071e+00, -7.76814618e+00,
-7.38918302e+00, -4.94725878e+00, -2.92883670e-02, -2.47830716e+00,
-2.02015956e+00, -5.15479820e+00, -7.91209653e+00, -6.91345095e+00,
-4.62278149e+00, -3.01552069e+00, -6.27971322e-02, -2.31732442e+00,
-4.75968073e+00, -8.17421803e+00, -1.14207788e+01, -8.92294483e+00,
-8.48981509e+00, -8.16635547e+00, -6.58248850e+00, -5.26139665e+00,
-4.48412068e+00, -3.27367370e+00, -1.34976438e+00, -3.62637456e+00,
-4.67270042e+00, -4.13166577e+00, -3.67546394e+00, -2.80302539e+00,
6.47272418e+00, -2.24681188e+00, -4.25110577e+00, -1.02452951e+01,
-1.16658385e+01, -1.18015760e+01, -8.65537518e+00, -9.36409198e+00,
-7.57165084e+00, -5.69907599e+00, -4.97159232e+00, -1.88700594e+00,
-6.79483530e-01, -2.74880153e+00, -3.79441765e+00, -3.38825264e+00,
-2.55867271e+00, -1.96213610e+00, 9.97909972e+00, -2.55677995e+00,
-4.88030347e+00, -8.86033743e+00, -9.05368602e+00, -7.94309693e+00,
-8.12585485e+00, -6.31826210e+00, -8.30242223e+00, -1.22893251e+01,
-1.73097460e+01, -7.55105974e+00, -8.19580521e+00, -8.34926874e+00,
-7.25911206e+00, -8.41697224e+00, -3.38725429e+00, -7.68222088e+00,
-1.26297007e+01, -1.36257602e+01, -9.52985029e+00, -1.18396814e+01,
-9.79914325e+00, -7.55608603e+00, -5.46902454e+00, -2.65092136e+00,
4.17472161e-01, -2.32548971e+00, -3.48299933e+00, -3.18067109e+00,
3.57605604e-15, 9.96350211e-16, 1.18278079e-15, -1.44201673e-15,
-6.73760309e-18, -5.48347781e+00, -1.03346396e+01, -1.11296117e+01,
-1.43116273e+01, -1.47003999e+01, -1.54726487e+01])
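As a worked instance of the Li2O4 example, the reference energy can be summed atom by atom. The sketch below copies only the Li and O entries from the array above into a small dict so that it runs on its own; `reference_energy` is a hypothetical helper name, not a GPTFF function.

```python
# Reference energies (eV per atom) for Li (Z=3) and O (Z=8), copied from the
# pretrained atom_refs array above; other elements are omitted for brevity.
atom_refs = {3: -3.46224791, 8: -7.38918302}


def reference_energy(atomic_numbers, refs):
    """Sum the per-element reference energies over all atoms in a structure."""
    return sum(refs[z] for z in atomic_numbers)


# Li2O4: two Li atoms and four O atoms, listed by atomic number.
li2o4 = [3, 3, 8, 8, 8, 8]
ref_energy = reference_energy(li2o4, atom_refs)
print(ref_energy)  # equals atom_refs[3] * 2 + atom_refs[8] * 4
```

With a pymatgen Structure, the list of atomic numbers is available as `struc.atomic_numbers`, and the full array can be indexed the same way as the dict.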
Alternatively, you can fit your own atom_refs.
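This README does not say how atom_refs was fitted. One common approach, shown here purely as an assumption and not necessarily the authors' procedure, is a linear least-squares fit of total energies against per-element atom counts. The toy compositions and energies below are made up, and `fitted_refs` is an illustrative name.

```python
import numpy as np

# Hypothetical toy dataset: (atom counts keyed by atomic number, total energy in eV).
samples = [
    ({3: 2, 8: 1}, -14.0),  # Li2O
    ({3: 2, 8: 2}, -19.0),  # Li2O2
    ({3: 1, 8: 2}, -14.5),  # LiO2
]

max_z = 94  # the pretrained array above is indexed by atomic number from 0 to 94

# Build the count matrix (one row per structure, one column per atomic number).
counts = np.zeros((len(samples), max_z + 1))
energies = np.zeros(len(samples))
for i, (composition, energy) in enumerate(samples):
    for z, n in composition.items():
        counts[i, z] = n
    energies[i] = energy

# Least-squares solution of counts @ fitted_refs ≈ energies; elements that never
# appear in the data get a reference energy of 0 (minimum-norm solution).
fitted_refs, *_ = np.linalg.lstsq(counts, energies, rcond=None)

print(fitted_refs[3], fitted_refs[8])
```

The ref_energy of each structure is then the count-weighted sum of these fitted values, as in the Li2O4 example above.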
The file config.json contains the training settings:
- workers: The number of workers for dataloader
- epochs: The number of training epochs
- batch_size: Batch size for training, i.e. the number of structures used in one step (batch)
- node_feature_len: The length of the node (atom) feature vector
- edge_feature_len: The length of the edge (bond) feature vector
- n_layers: The number of layers in the GPTFF model
- device: cpu or cuda
- val_fold: The fold used as validation data during training
- transformer_activate: Whether to activate the transformer block
- weight_energy: Weight factor of the energy
- weight_force: Weight factor of the forces
- weight_stress: Weight factor of the stress; if there is no stress data, set it to 0
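To make the settings concrete, the sketch below builds a config with the keys listed above and prints it as JSON. All values and value types are hypothetical placeholders, not recommended defaults. The README says the data path is also set in config.json but does not name its key, so it is left out here.

```python
import json

# Hypothetical values for the settings listed above; adjust them to your data
# and hardware. The dataset-path key is omitted because its name is not given.
config = {
    "workers": 4,
    "epochs": 100,
    "batch_size": 32,
    "node_feature_len": 64,
    "edge_feature_len": 64,
    "n_layers": 3,
    "device": "cuda",
    "val_fold": 0,
    "transformer_activate": True,
    "weight_energy": 1.0,
    "weight_force": 1.0,
    "weight_stress": 0.0,  # no stress data in this hypothetical dataset
}

config_text = json.dumps(config, indent=2)
print(config_text)
```

Compare the printed text with the config.json shipped in the repository before using it, since the exact schema expected by gptff_trainer may differ.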
If you find GPTFF useful, please cite our article:
@article{XIE2024,
title = {GPTFF: A high-accuracy out-of-the-box universal AI force field for arbitrary inorganic materials},
journal = {Science Bulletin},
year = {2024},
issn = {2095-9273},
doi = {https://doi.org/10.1016/j.scib.2024.08.039},
url = {https://www.sciencedirect.com/science/article/pii/S2095927324006327},
author = {Fankai Xie and Tenglong Lu and Sheng Meng and Miao Liu},
keywords = {Data Science, Molecular Dynamics, Graph Neural Network, Universal Force Field},
}