This project performs high-throughput force field simulation and data processing.
This project depends on AIMS_Tools.
Two main scripts do the work: submit.py and monitor.py.
- Clone AIMS_Simu and AIMS_Tools. It is suggested to put them in the same folder:
    git clone https://github.com/sungroup-sjtu/AIMS_Simu
    git clone https://github.com/sungroup-sjtu/AIMS_Tools
- Modify config.py
  a) Set the paths of MS_TOOLS, WORK_DIR, PACKMOL, DFF and the DFF database.
  b) Select the job queue (fast, gtx, cpu on Cluster 86).
  c) Set the GROMACS executable.
  For example:
  Force field setting (no default setting):
    DFF_TABLE = 'MGI'  # 'IL'
  Paths setting:
    MS_TOOLS_DIR = os.path.join(CWD, '..', 'AIMS_Tools')  # AIMS_Simu and AIMS_Tools in the same folder
    WORK_DIR = os.path.join(CWD, 'SimulationData')  # all simulation data is saved in a new folder AIMS_Simu/SimulationData
    PACKMOL_BIN = '/share/apps/tools/packmol'
    DFF_ROOT = '/share/workspace/xiangyan/src/DFF/Developing'  # simulation parameters come from this folder
  PBS settings:
    ...
    PBS_ARGS = ('gtx', 32, 2, 16)  # partition, cpu, gpu, cpu_request
    ...
    GMX_BIN = '/share/apps/gromacs/2018.6/bin/gmx_serial'
    GMX_MDRUN = 'gmx_gpu mdrun'  # GMX_MDRUN= 'gmx_fast mdrun'
    GMX_MULTI = True
    GMX_MULTI_NJOB = 8  # Use the -multidir function of GROMACS. For Npt simulation, set it to 8. For NvtSlab simulation, 4 is better
    GMX_MULTI_NOMP = None  # Set the OpenMP threads. When set to None, use only one node and the best number of threads is automatically determined
  Simulation details settings (the default setting is OK):
    NATOMS = 3000  # minimum number of atoms built in the simulation box
    NMOLS = 120  # minimum number of molecules built in the simulation box
    LJ96 = False  # use the LJ 9-6 non-bonded potential
    DIFF_GK = False  # use the Green-Kubo method to calculate the diffusion constant (expensive, not suggested)
    DEBUG = False  # if True, do not delete the trajectory file in the analyze process

    class NvtMultiConfig(Config, SunRunConfig, SunExtendConfig, SunBugFixConfig):
        REPEAT_NUMBER = 80  # number of parallel simulations for nvt-multi
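Before submitting jobs, it can help to confirm that the configured paths actually exist. The sketch below is not part of AIMS_Simu: `missing_paths` is a hypothetical helper, and the dictionary keys simply mirror the variable names in the example above.

```python
import os


def missing_paths(paths):
    """Return the names whose configured path does not exist on disk."""
    return sorted(name for name, path in paths.items() if not os.path.exists(path))


if __name__ == '__main__':
    # Values copied from the example configuration; adjust them to your cluster.
    configured = {
        'PACKMOL_BIN': '/share/apps/tools/packmol',
        'DFF_ROOT': '/share/workspace/xiangyan/src/DFF/Developing',
        'GMX_BIN': '/share/apps/gromacs/2018.6/bin/gmx_serial',
    }
    for name in missing_paths(configured):
        print(f'{name} does not exist: {configured[name]}')
```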
- To set up a high-throughput computation, prepare a list of molecules in mols/. The example.txt file contains 5 columns:
    name SMILES molecular_ratio t_list p_list
  More than 5 points are needed for both t_list and p_list; otherwise some analysis and dump scripts will not work. Then run:
    run/submit.py -p [npt,nvt-slab] -i mols/example.txt -r 'comments' -tp assigned
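A molecule list can be checked before submission. The following is a minimal sketch, not the project's own parser. It ASSUMES that the five columns are whitespace-separated and that t_list and p_list are comma-separated numbers, which may not match the real file format. `parse_mol_line` is a hypothetical helper and the hexane record is made up for illustration.

```python
def parse_mol_line(line, min_points=6):
    """Split one record into its five columns and check the T and P lists.

    ASSUMPTION: whitespace-separated columns, comma-separated t_list/p_list.
    The text asks for more than 5 points, so the default minimum is 6.
    """
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f'expected 5 columns, got {len(fields)}: {line!r}')
    name, smiles, ratio, t_field, p_field = fields
    t_list = [float(t) for t in t_field.split(',')]
    p_list = [float(p) for p in p_field.split(',')]
    for label, values in (('t_list', t_list), ('p_list', p_list)):
        if len(values) < min_points:
            raise ValueError(f'{name}: {label} has only {len(values)} points')
    return {'name': name, 'smiles': smiles, 'ratio': ratio,
            't_list': t_list, 'p_list': p_list}


# Illustrative record only; not taken from the repository.
record = parse_mol_line('hexane CCCCCC 1 280,300,320,340,360,380 1,50,100,250,500,1000')
print(record['name'], len(record['t_list']), len(record['p_list']))
```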
- To run the calculations:
    run/monitor.py -p [npt, nvt-slab]
  Available procedures: npt, nvt-slab, ppm(npt), nvt-multi(npt).
  ppm(npt) means that npt is a prerequisite of ppm.
  Example for the npt procedure:
    cd run
    ./submit.py -p npt -i mols/example.txt -r testing -tp assigned
    ./monitor.py -p npt
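The prerequisite notation can be read as a small dependency table. The sketch below only illustrates that reading; `PREREQUISITE` and `run_order` are hypothetical names and are not taken from submit.py or monitor.py.

```python
# Prerequisites as listed above: ppm and nvt-multi both require npt.
PREREQUISITE = {
    'npt': None,
    'nvt-slab': None,
    'ppm': 'npt',
    'nvt-multi': 'npt',
}


def run_order(procedure):
    """Return the procedures to run, prerequisites first."""
    if procedure not in PREREQUISITE:
        raise ValueError(f'unknown procedure: {procedure}')
    order = []
    while procedure is not None:
        order.append(procedure)
        procedure = PREREQUISITE[procedure]
    return order[::-1]


print(run_order('ppm'))  # ['npt', 'ppm']
```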
- The results are saved in WORK_DIR. By default, WORK_DIR=AIMS_Simu/SimulationData.
QM calculation for heat capacity is performed by the run-cv.py script.
- Prepare QM files. This checks the database to remove duplicated molecules from example.txt and processes the molecule names. A file named _cv_prepared.txt will be generated and used in the following steps.
    ./run-cv.py prepare mols/example.txt
- Generate Gauss input files and submit them to the PBS job manager.
    ./run-cv.py cv _cv_prepared.txt
- Analyze Gauss results. The results will be saved in a file named _cv.log.
    ./run-cv.py get-cv _cv_prepared.txt
- Save the results into the database.
    ./run-cv.py save-db
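The four steps above can be collected into one ordered list of commands. This is only a sketch: `cv_commands` is a hypothetical helper that builds and prints the commands without running them. The jobs submitted in the second step presumably need to finish before the third step is useful, so running all four back to back is unlikely to work.

```python
import shlex


def cv_commands(mol_file, prepared='_cv_prepared.txt'):
    """Return the four run-cv.py invocations in the order listed above."""
    return [
        ['./run-cv.py', 'prepare', mol_file],
        ['./run-cv.py', 'cv', prepared],
        ['./run-cv.py', 'get-cv', prepared],
        ['./run-cv.py', 'save-db'],
    ]


for argv in cv_commands('mols/example.txt'):
    print(shlex.join(argv))
```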
For more information, see our publication: "Predicting Thermodynamic Properties of Alkanes by High-throughput Force Field Simulation and Machine Learning", https://doi.org/10.1021/acs.jcim.8b00407
Several scripts are provided for post-processing and analysing the simulation data. They are located in scripts and scripts-post.
- Fit the simulation data at different temperatures and pressures, so that properties and their derivatives can be obtained at arbitrary T or P. This should be performed prior to any other analysis.
    ./scripts/post-process.py -p [npt, nvt-slab] -o True
- Mark molecules containing specific groups (e.g. halide, cyclo-ester) as bad molecules, which will not be dumped in the following steps.
    ./scripts/remark.py [npt, nvt-slab]
- Dump the molecules from the sqlite database to a mols.csv file. The category should be specified, which is necessary for uploading to the AIMS_Web database.
    ./scripts/dummp-mols.py [small molecule, ionic liquid, ...]
- Dump the simulation data from the sqlite database to a csv file, which can be uploaded into the AIMS_Web database.
    ./scripts/dummp-data-npt.py
    ./scripts/dummp-data-nvt-slab.py
- You can select a specific class of molecules for the following analysis, based on force field atom type, by modifying app/selection.py.
- Compare with NIST experimental data
  - Make sure that nist.sqlite exists in the database folder
  - Run the following scripts to compare simulation and experimental data and plot the results
        cd scripts-post
        python3 compare_detail.py -p npt -t nist --selection False
        python3 compare-nist.py -p npt --selection False
- Compare with ILTHERMO experimental data
  - Make sure that ilthermo.sqlite exists in the database folder
  - Run the following scripts to compare simulation and experimental data and plot the results
        cd scripts-post
        python3 compare_detail.py -p npt -t ilthermo --selection False
        python3 compare-nist-il.py -p npt --selection False
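
The fitting step performed by post-process.py makes properties and their derivatives available at arbitrary T or P. As a rough illustration of that idea only, the sketch below fits a quadratic in temperature by least squares and evaluates both the fitted value and its temperature derivative. The functional form actually used by post-process.py is not stated here, so the quadratic, the helper names `fit_poly` and `evaluate`, and the density numbers are all assumptions made for illustration.

```python
def fit_poly(xs, ys, degree=2):
    """Least-squares polynomial fit in the centred variable (x - mean(x))."""
    x0 = sum(xs) / len(xs)
    dx = [x - x0 for x in xs]
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(dx^(i+j)), b[i] = sum(y * dx^i).
    A = [[sum(d ** (i + j) for d in dx) for j in range(n)] for i in range(n)]
    b = [sum(y * d ** i for y, d in zip(ys, dx)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - rest) / A[i][i]
    return x0, coeffs


def evaluate(x0, coeffs, x):
    """Return the fitted value and its first derivative at x."""
    d = x - x0
    value = sum(c * d ** k for k, c in enumerate(coeffs))
    slope = sum(k * c * d ** (k - 1) for k, c in enumerate(coeffs) if k > 0)
    return value, slope


# Made-up densities generated from a known quadratic, for illustration only.
temps = [280.0, 300.0, 320.0, 340.0, 360.0, 380.0]
dens = [1.2 - 1e-3 * (t - 300.0) - 1e-6 * (t - 300.0) ** 2 for t in temps]
x0, coeffs = fit_poly(temps, dens)
value, slope = evaluate(x0, coeffs, 310.0)
print(f'density(310 K) = {value:.4f}, d(density)/dT = {slope:.3e}')
```

Centring the temperatures on their mean keeps the normal equations well conditioned for temperatures in the hundreds of kelvin.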