JohannesBuchner/BXA

FileExistsError while using mpi4py

Closed this issue · 5 comments

Hello,

I have encountered the following error while running the BXA script using mpiexec -np 2 python3 script.py.

Traceback (most recent call last):
  File "BXA_borus02D_FF_noSz_allsets.py", line 73, in <module>
    solver = bxa.BXASolver(transformations=priors, outputfiles_basename=outputfiles_basename)
  File "/home/dl/.local/lib/python3.8/site-packages/bxa/xspec/solver.py", line 132, in __init__
    os.mkdir(outputfiles_basename)
FileExistsError: [Errno 17] File exists: 'BXA_borus02D_FF_noSz_allsets/'

The script is an unmodified version of what I am able to run without MPI. I have tried deleting the directory and starting over again but the issue persists. Is there a fix for this?

The versions of the relevant tools are:
BXA: 4.0.5
mpiexec: 4.0.3
OS: Ubuntu 20.04

Please let me know if you need any other information.

Thank you in advance!

Do you have mpi4py installed?

I see the problem. Each MPI process is trying to call mkdir, and only one can succeed.

I think lines 131-132 (https://github.com/JohannesBuchner/BXA/blob/master/bxa/xspec/solver.py#L131) should be deleted, as ultranest will create these directories on process 0.
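To illustrate the race described above, here is a minimal sketch of directory creation that tolerates several processes attempting it at once. It is not the BXA or ultranest code; the helper name `ensure_output_dir` and the directory name `BXA_output` are made up for the example. It relies only on the standard library's `os.makedirs(..., exist_ok=True)`.

```python
import os
import tempfile


def ensure_output_dir(path):
    """Create path if it is missing; do nothing if it already exists.

    Hypothetical helper for illustration, not part of BXA.
    """
    os.makedirs(path, exist_ok=True)
    return path


# Simulate two processes creating the same directory one after the other.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "BXA_output")  # hypothetical directory name

    ensure_output_dir(target)  # "first process": creates the directory
    ensure_output_dir(target)  # "second process": succeeds silently
    created = os.path.isdir(target)

    # A plain os.mkdir on the existing directory fails, as in the traceback.
    try:
        os.mkdir(target)
        mkdir_failed = False
    except FileExistsError:
        mkdir_failed = True

print(created, mkdir_failed)
```

This only removes the `FileExistsError`; it does not coordinate the processes in any other way, so it would not help if they also compete for the same output files.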

Thanks for your prompt reply. Do you think it is safe to remove these lines (and create the directory manually before running the script)?

Yes, either of the two should do it.

Removing these lines throws the following error:

Traceback (most recent call last):
  File "/home/dl/.local/lib/python3.8/site-packages/h5py/_hl/files.py", line 185, in make_fid
    fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "BXA_borus02D_FF_noSz_allsets.py", line 77, in <module>
    results = solver.run(frac_remain=0.5, max_num_improvement_loops=0, resume=True)
  File "/home/dl/.local/lib/python3.8/site-packages/bxa/xspec/solver.py", line 188, in run
    self.results = solve(
  File "/home/dl/.local/lib/python3.8/site-packages/ultranest/solvecompat.py", line 47, in pymultinest_solve_compat
    sampler = ReactiveNestedSampler(
  File "/home/dl/.local/lib/python3.8/site-packages/ultranest/integrator.py", line 1074, in __init__
    self.pointstore = HDF5PointStore(storage_filename, storage_num_cols, mode='a' if resume else 'w')
  File "/home/dl/.local/lib/python3.8/site-packages/ultranest/store.py", line 186, in __init__
    self.fileobj = h5py.File(filepath, **h5_file_args)
  File "/home/dl/.local/lib/python3.8/site-packages/h5py/_hl/files.py", line 406, in __init__
    fid = make_fid(name, mode, userblock_size,
  File "/home/dl/.local/lib/python3.8/site-packages/h5py/_hl/files.py", line 187, in make_fid
    fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 108, in h5py.h5f.create
OSError: Unable to create file (unable to open file: name = 'BXA_borus02D_FF_noSz_allsets/results/points.hdf5', errno = 17, error message = 'File exists', flags = 15, o_flags = c2)

However, I was able to solve the problem by reinstalling mpi4py (thanks to your other comment). Interestingly, the code also works now without removing lines 131-132.
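Since reinstalling mpi4py resolved both errors, a quick check of whether the processes actually see each other may help anyone hitting the same symptoms. The sketch below is an assumption about how to diagnose this, not something taken from BXA or ultranest. The helper name `mpi_rank_and_size` is hypothetical, and the fallback to `(0, 1)` when mpi4py cannot be imported is a choice made for this example.

```python
def mpi_rank_and_size():
    """Return (rank, size) reported by mpi4py.

    Falls back to (0, 1) if mpi4py cannot be imported. Hypothetical helper
    for illustration only.
    """
    try:
        from mpi4py import MPI
    except ImportError:
        return 0, 1
    comm = MPI.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


rank, size = mpi_rank_and_size()
print(f"rank {rank} of {size}")
```

Saved under a name such as check_mpi.py (hypothetical) and started with mpiexec -np 2 python3 check_mpi.py, a working setup should report two different ranks and a size of 2. If every process prints "rank 0 of 1", the processes are running independently of each other, which would be consistent with each one trying to create the same directory and lock the same points.hdf5 file.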