Parallelization on Linux gets stuck

Question

pratikgandhic1 opened this issue a year ago · 11 comments

Summary:

Parallelization on Linux gets stuck when running models utilizing cmdstan!

Description:

I am currently using cmdstan to run models in parallel. However, when I set n_jobs = -1 the job gets stuck and cannot proceed further. It works fine when I set n_jobs=1, which suggests the problem is related to parallelization. Could someone assist me with this?

Current Version:

I am currently using version 0.9.77. I also tried version 1.0.0 but had no luck with that either!

Answer 1 · 2023-07-25T18:15:31.000Z

What version of cmdstan are you using?

Can you share more of the code you are using (e.g. number of chains, status of STAN_THREADS, etc.)?

Answer 2 · 2023-07-25T18:17:39.000Z

I am using version 0.9.77. How do I get the information you are asking for?

Answer 3 · 2023-07-25T18:25:52.000Z

0.9.77 is the version of cmdstanpy. CmdStan is a separate piece of software (which can itself be installed with cmdstanpy.install_cmdstan) whose versions look like, for example, 2.23.1.

If you are able to share the code you are using to compile and run your model (not necessarily the Stan model code itself), that may be helpful.
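
For reference, here is a minimal sketch of how both version numbers could be read from Python. It assumes a CmdStanPy release that exposes `cmdstanpy.cmdstan_version()` (I believe this is the 1.x series; older releases such as 0.9.x may not have it, in which case the sketch simply reports None). The helper name `report_versions` is made up for this example.

```python
import importlib.util


def report_versions():
    """Return the CmdStanPy and CmdStan versions, or None where unavailable."""
    info = {"cmdstanpy": None, "cmdstan": None}
    if importlib.util.find_spec("cmdstanpy") is None:
        return info  # the Python wrapper is not installed in this environment

    import cmdstanpy

    # Version of the Python wrapper (e.g. "0.9.77").
    info["cmdstanpy"] = cmdstanpy.__version__

    # Version of the CmdStan installation itself. cmdstan_version() is an
    # assumption about the installed release, so look it up defensively.
    get_version = getattr(cmdstanpy, "cmdstan_version", None)
    if get_version is not None:
        try:
            info["cmdstan"] = get_version()
        except Exception:
            info["cmdstan"] = None  # no usable CmdStan installation found
    return info


print(report_versions())
```

The two entries are deliberately reported separately, since the wrapper version and the CmdStan version are independent of each other.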

Answer 4 · 2023-07-25T18:37:01.000Z

Ok gotcha.

So I am installing Prophet and cmdstanpy. I am installing cmdstan like this:

    cmdstan_dir = os.path.join(sys.exec_prefix, "bin", "cmdstan")
    log.info(f"Installing cmdstan at {cmdstan_dir}")
    cmdstanpy.install_cmdstan(dir=cmdstan_dir)

And then:

    from prophet import Prophet

    model = Prophet()
    model.fit(X, ts)

Let me know if this helps or if you need more info.

Answer 5 · 2023-07-25T19:00:11.000Z

Did you open an issue against Prophet?

Answer 6 · 2023-07-25T19:16:18.000Z

No, I haven't. We are seeing the same behavior with one of our own models, which uses cmdstanpy. So I think this is related to how cmdstan interacts with multiprocessing, specifically on Linux.

Answer 7 · 2023-07-25T19:18:22.000Z

If you are able to share a specific example that fails, we may be able to assist, but it is difficult to help with this in general. I use Linux personally and often run multiple chains in parallel.

Answer 8 · 2023-07-25T19:23:51.000Z

How do you set multiple chains? Do you have an example of it? We are using multiprocessing as a wrapper to run everything in parallel.

Answer 9 · 2023-07-25T19:28:43.000Z

If you want to run multiple chains of a given model with the same data, this can be done quite easily using the chains and parallel_chains arguments of the sample method:
https://cmdstanpy.readthedocs.io/en/v1.1.0/users-guide/examples/MCMC%20Sampling.html#Parallelization-via-multi-threaded-processing
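
As a rough sketch of what that looks like in practice: in the CmdStanPy releases I am aware of, `CmdStanModel.sample` takes `chains` and `parallel_chains`. The helper names `sample_kwargs` and `run_chains`, and whatever file paths you pass in, are placeholders made up for this example, and capping `parallel_chains` at the core count is just one reasonable choice.

```python
import os


def sample_kwargs(chains=4, max_cores=None):
    """Build sampler arguments, never asking for more parallel chains than cores."""
    cores = max_cores or os.cpu_count() or 1
    return {"chains": chains, "parallel_chains": min(chains, cores)}


def run_chains(stan_file, data, chains=4):
    """Compile the model and run the chains in parallel (requires cmdstanpy)."""
    # Imported lazily so the helper above can be used without cmdstanpy installed.
    from cmdstanpy import CmdStanModel

    model = CmdStanModel(stan_file=stan_file)
    # data may be a dict or a path to a JSON file.
    return model.sample(data=data, **sample_kwargs(chains))
```

Calling `run_chains("model.stan", "data.json")` would then run four chains of one model on one dataset, with no multiprocessing wrapper involved.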

If your goal is to run multiple models, or multiple datasets with the same model, this is also possible, but it is not built into CmdStanPy. Using multiprocessing should work just fine on recent versions of cmdstanpy, but on older versions you may have issues with multiple copies overwriting each other's output if you do not manually specify non-overlapping output directories.
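
To illustrate the non-overlapping output directories, here is a minimal sketch using only the standard library. The function names (`make_job_dirs`, `run_job`, `run_all`) are invented for this example, and the actual fit is left as a commented placeholder; passing `output_dir` to `sample` is how I would expect the directory to be wired in, but treat that as an assumption about your setup.

```python
import multiprocessing
import os
import tempfile


def make_job_dirs(base_dir, n_jobs):
    """Create one distinct output directory per job and return their paths."""
    dirs = []
    for i in range(n_jobs):
        job_dir = os.path.join(base_dir, f"job_{i}")
        os.makedirs(job_dir, exist_ok=True)
        dirs.append(job_dir)
    return dirs


def run_job(args):
    """Worker: writes only inside its own directory."""
    job_id, output_dir = args
    # Placeholder for the real fit, for example (assumed usage):
    #   model = cmdstanpy.CmdStanModel(stan_file="model.stan")
    #   fit = model.sample(data=data, output_dir=output_dir)
    marker = os.path.join(output_dir, "done.txt")
    with open(marker, "w") as fh:
        fh.write(f"job {job_id}\n")
    return marker


def run_all(n_jobs, base_dir=None, processes=2):
    """Run every job in a process pool, each with a private output directory."""
    base_dir = base_dir or tempfile.mkdtemp(prefix="stan_jobs_")
    jobs = list(enumerate(make_job_dirs(base_dir, n_jobs)))
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(run_job, jobs)
```

When using this from a script, call `run_all(...)` under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.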