joblib/joblib

Result order differs from input order when using the threading backend with pre_dispatch

classner opened this issue · 3 comments

Dear joblib-team,

I recently tracked down an issue in my code to joblib. I am using the threading backend together with the pre_dispatch option. In this case, the returned result list occasionally contains results in a different order than the inputs!

I can reproduce the issue reliably with the following code:

from joblib import Parallel, delayed
import numpy as np

if __name__ == '__main__':
  NUM = range(1000)
  # Reference result, computed serially in input order.
  EXPECTED = [np.sqrt(x) for x in NUM]
  for it in range(100):
    rnum = Parallel(n_jobs=10, backend='threading', pre_dispatch='2*n_jobs')(
           delayed(np.sqrt)(x) for x in NUM)
    if not (rnum == EXPECTED):
      zped = zip(rnum, EXPECTED)
      # print() with a single argument runs on both Python 2 and Python 3.
      print('Discrepancy in iteration %d' % (it))
      # Show only the (observed, expected) pairs that differ.
      print([(x, ex) for (x, ex) in zped if x != ex])
      break
    else:
      print('.')

It takes the square root of a range of numbers using joblib and repeats this 100 times. If it detects a discrepancy from the expected result, it prints tuples of (observed, expected). The code is protected by the __main__ guard so that you can easily switch between the threading and multiprocessing backends. The issue only occurs for me if I use the threading backend together with the pre_dispatch option.

The program usually exits after one of the first iterations in that case. For example:

Discrepancy in iteration 1
[(3.3166247903553998, 3.1622776601683795), (3.1622776601683795, 3.3166247903553998)]

I only observed two swapped values so far.

Thanks for the report, and importantly, the test case. We'll try to look at that soon.

What is the status of this issue as of today? I'm using joblib 0.13.2 and seem to be experiencing this problem as well.

I just ran the test code above with version 1.1.0, and the problem seems to be solved.
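
For anyone who has to stay on an older release, one defensive pattern is to tag each task with its input index and sort the results afterwards, so the output order no longer depends on how the backend collects them. This is only a minimal sketch, not something proposed in this thread: `indexed_call` and `ordered_parallel_map` are hypothetical helper names, and `math.sqrt` stands in for `np.sqrt` to keep the example dependency-light.

```python
from math import sqrt

from joblib import Parallel, delayed


def indexed_call(index, func, arg):
    """Hypothetical helper: run func(arg) and return it paired with its input index."""
    return index, func(arg)


def ordered_parallel_map(func, inputs, n_jobs=10):
    """Hypothetical helper: map func over inputs in threads, then restore input order."""
    tagged = Parallel(n_jobs=n_jobs, backend='threading', pre_dispatch='2*n_jobs')(
        delayed(indexed_call)(i, func, x) for i, x in enumerate(inputs))
    # Sort on the index only, so the results themselves never need to be comparable.
    tagged.sort(key=lambda pair: pair[0])
    return [result for _, result in tagged]


if __name__ == '__main__':
    numbers = range(1000)
    assert ordered_parallel_map(sqrt, numbers) == [sqrt(x) for x in numbers]
    print('order preserved')
```

The extra sort costs O(n log n), but it makes the ordering guarantee explicit in user code rather than relying on the behaviour of a particular joblib version.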