running TorchANI in parallel
qzhu2017 opened this issue · 4 comments
Hi, I'm not sure if this is the right place to ask. I am trying to use a trained ANI model to optimize many molecules or crystals in parallel with Python's multiprocessing module, as follows:
```python
from ase.lattice.cubic import Diamond
from ase.optimize import BFGS
from ase.calculators.emt import EMT
import torchani
import multiprocessing as mp
from functools import partial
import warnings

warnings.simplefilter("ignore")

def opt(struc, calculator, steps):
    struc.set_calculator(calculator)
    opt = BFGS(struc, logfile='ase.log')
    opt.run(fmax=0.001, steps=steps)
    print(struc.get_potential_energy())

strucs = []
strucs.append(Diamond(symbol="C", pbc=True))
strucs.append(Diamond(symbol="C", pbc=True))

# Both EMT and ANI1ccx work, but ANI2x fails
#calc = EMT()
calc = torchani.models.ANI1ccx().ase()
#calc = torchani.models.ANI2x().ase()

with mp.Pool(2) as p:
    func = partial(opt, calculator=calc, steps=10)
    p.map(func, strucs)
```
The code above works well if I use ANI1ccx as the calculator, but it fails when I use ANI2x, complaining that too many files were opened:
```
KeyboardInterrupt
Traceback (most recent call last):
  File "/scratch/qzhu/miniconda3/lib/python3.8/multiprocessing/util.py", line 300, in _run_finalizers
    finalizer()
  File "/scratch/qzhu/miniconda3/lib/python3.8/multiprocessing/util.py", line 224, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/scratch/qzhu/miniconda3/lib/python3.8/multiprocessing/util.py", line 133, in _remove_temp_dir
    rmtree(tempdir)
  File "/scratch/qzhu/miniconda3/lib/python3.8/shutil.py", line 715, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/scratch/qzhu/miniconda3/lib/python3.8/shutil.py", line 628, in _rmtree_safe_fd
    onerror(os.scandir, path, sys.exc_info())
  File "/scratch/qzhu/miniconda3/lib/python3.8/shutil.py", line 624, in _rmtree_safe_fd
    with os.scandir(topfd) as scandir_it:
OSError: [Errno 24] Too many open files: '/tmp/pymp-pwvnbvp6'
```
Is it possible that some files are not closed when calling the ANI2x model?
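(Not a TorchANI fix, but a quick way to check whether the script is simply hitting the per-process file-descriptor limit is the standard `resource` module; raising the soft limit to the hard limit is a sketch of a workaround, not a cure for an actual descriptor leak:)

```python
import resource

# Query the current soft and hard limits on open file descriptors.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")

# Raise the soft limit up to the hard limit for this process;
# worker processes spawned afterwards inherit the new limit.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))

new_soft, new_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"new soft limit: {new_soft}")
```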
Qiang, nice to hear from you! We never tried to run it in parallel on CPUs; I will check. You will have better success with GPUs, since the PyTorch code is not very well optimized for CPU runs. Your best bet would be just running a few ANI scripts in parallel.
On a side note, don't expect good performance from these public models on crystals :-)
@isayev Thanks for your quick reply. I need to process many, many small molecules/crystals, so a GPU is probably not useful in this case (if I understand correctly). The optimized geometries will be passed on to my other calculators, so splitting the work into a few separate ANI runs is not really convenient. The model won't be perfect; I am curious how it compares to generic force fields.
Anyway, I would appreciate it if you could take a look.
Hi, could you try whether this works?
```python
from ase.lattice.cubic import Diamond
from ase.optimize import BFGS
from ase.calculators.emt import EMT
from ase.build import molecule
import torchani
import multiprocessing as mp
from functools import partial
import warnings

warnings.simplefilter("ignore")

def opt(struc, calculator, steps):
    struc.set_calculator(calculator)
    opt = BFGS(struc, logfile='ase.log')
    opt.run(fmax=0.001, steps=steps)
    print(struc.get_potential_energy())

strucs = []
for i in range(10):
    strucs.append(Diamond(symbol="C", pbc=True))
    strucs.append(molecule('CH4'))

# Both EMT and ANI1ccx work, but ANI2x fails
#calc = EMT()
#calc = torchani.models.ANI1ccx().ase()
calc = torchani.models.ANI2x().ase()

processes = []
for struc in strucs:
    p = mp.Process(target=opt, args=(struc, calc, 10))
    p.start()
    processes.append(p)
for p in processes:
    p.join()
```
Thank you. Your script works well!