benchopt/benchmark_lasso

Bug / Crash for spams


Currently, the code on main for spams crashes:

[image: screenshot of the crash]

@gdurif any idea why and when this started?

Yes, the problem comes from the conda package python-spams (which I do not maintain), possibly related to conda-forge/python-spams-feedstock#67.

@mathurinm reported this in getspams/spams-python#17, and we are trying to solve the problem in #66.

Sorry for the duplicate of #66, then.
I am closing this issue.

We can keep this issue open until the problem is solved (#66 is a pull request, so it is easy to miss when looking for where to report the problem).

OK. I am reopening it until it is properly resolved.

Not sure if this is related, but running a benchmark with spams now leads to the following error message on my machine:

exception calling callback for <Future at 0x7eff7e2c6e80 state=finished raised TerminatedWorkerError>
Traceback (most recent call last):
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/externals/loky/_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 359, in __call__
    self.parallel.dispatch_next()
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 794, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 861, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 779, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 531, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/externals/loky/reusable_executor.py", line 177, in submit
    return super(_ReusablePoolExecutor, self).submit(
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 1115, in submit
    raise self._flags.broken
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.

The exit codes of the workers are {SIGSEGV(-11)}
Traceback (most recent call last):
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/bin/benchopt", line 33, in <module>
    sys.exit(load_entry_point('benchopt', 'console_scripts', 'benchopt')())
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/jsalmon/Documents/OpenSource/benchOpt/benchopt/cli/main.py", line 199, in run
    run_benchmark(
  File "/home/jsalmon/Documents/OpenSource/benchOpt/benchopt/runner.py", line 302, in run_benchmark
    results = Parallel(n_jobs=n_jobs)(
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 1056, in __call__
    self.retrieve()
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 935, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 542, in wrap_future_result
    return future.result(timeout=timeout)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/concurrent/futures/_base.py", line 444, in result
    return self.__get_result()
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/externals/loky/_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 359, in __call__
    self.parallel.dispatch_next()
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 794, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 861, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 779, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 531, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/externals/loky/reusable_executor.py", line 177, in submit
    return super(_ReusablePoolExecutor, self).submit(
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 1115, in submit
    raise self._flags.broken
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.

The exit codes of the workers are {SIGSEGV(-11)}
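
Since the workers exit with SIGSEGV, one way to check whether the crash comes from spams itself rather than from joblib is to run a small snippet in a fresh interpreter and look at how it exits. Below is a minimal sketch, not something from benchopt: `run_isolated` is a hypothetical helper, and it is only an assumption that a bare `import spams` is enough to trigger the segfault (a solver call may be needed instead). It relies on the POSIX convention that a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys


def run_isolated(code):
    """Run `code` in a fresh interpreter and describe how it exited.

    Hypothetical helper for illustration only; not part of benchopt or joblib.
    """
    proc = subprocess.run(
        # -X faulthandler makes Python dump a traceback on fatal signals.
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code is the number of the killing signal.
        name = signal.Signals(-proc.returncode).name
        return "killed by " + name, proc.stderr
    return "exit code %d" % proc.returncode, proc.stderr


if __name__ == "__main__":
    # Assumption: importing spams is enough to reproduce the crash.
    status, stderr = run_isolated("import spams")
    print(status)
    print(stderr)
```

If this reports `killed by SIGSEGV`, the problem can be reproduced without benchopt or joblib, which would point to the installed python-spams package.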