Parallel map reduce on SearchForest

Question

hivert opened this issue 13 years ago · 145 comments

Implement a map reduce algorithm in parallel on large sets described by a SearchForest. We use a work-stealing algorithm (see https://en.wikipedia.org/wiki/Work_stealing) based on Python's multiprocessing.
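
For readers unfamiliar with work stealing, here is a minimal single-process simulation of the idea. It is only a sketch under stated assumptions: the function name `work_stealing_count`, the round-robin scheduling and the "steal from the longest queue" policy are hypothetical choices for illustration, and the branch itself relies on Python's multiprocessing rather than on this loop.

```python
from collections import deque


def work_stealing_count(roots, children, n_workers=3):
    """Count the nodes of a forest with simulated workers (illustrative only)."""
    queues = [deque() for _ in range(n_workers)]
    queues[0].extend(roots)          # all the work starts on worker 0
    counts = [0] * n_workers         # nodes processed by each worker
    steals = 0
    while any(queues):               # an empty deque is falsy
        for i, q in enumerate(queues):
            if not q:
                # Idle worker: steal one node from the opposite end
                # of the longest queue, if that queue is non-empty.
                victim = max(queues, key=len)
                if victim:
                    q.append(victim.popleft())
                    steals += 1
                continue
            node = q.pop()           # the owner works depth-first on its own end
            counts[i] += 1
            q.extend(children(node))
    return sum(counts), counts, steals


def binary_children(l):
    # Binary lists of length at most 4 (hypothetical small example).
    return [l + [0], l + [1]] if len(l) < 4 else []


total, per_worker, steals = work_stealing_count([[]], binary_children)
```

Each worker takes nodes from one end of its own queue, while idle workers take from the other end of someone else's queue, so the stolen nodes tend to be the ones closest to the roots and therefore the largest pieces of work.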

CC: @sagetrac-sage-combinat @nathanncohen @jdemeyer @seblabbe

Component: combinatorics

Keywords: map-reduce, days57, days77

Author: Florent Hivert, Jean-Baptiste Priez, Nathann Cohen

Branch: 134c1fa

Reviewer: Sébastien Labbé, Jean-Baptiste Priez

Issue created by migration from https://trac.sagemath.org/ticket/13580

Answer 1 · 2013-02-09T18:12:42.000Z

comment:1

Hi Florent and Nathann,

I am starting to think about and work on #6637, and I have some questions. Mainly, I would like to know what this ticket is about, because the one-liner in the description above does not say much.

Will SearchForest survive or not?
Does it replace SearchForest?
Does it improve SearchForest?
Does it use SearchForest?
Or is it used by SearchForest?

Also, Florent wrote on sage-devel in October 2012 that

   I'm also in the process of finalizing a patch which do parallel and even
   distributed map-reduce on recursively enumerated sets (currently badly named
   SearchForest, I'll change the name in my patch, ticket #13580, patch on
   Sage-Combinat queue [1]).

I do not see in the cited patch that the name of SearchForest is changed.
What would be a better name for SearchForest?
Which patch is the more recent: the "old" one or the "experimental" one?

    trac_13580-map_reduce-fh.patch #+experimental
    map_reduce_improved_loop-fh.patch #+experimental
    map_reduce_condition-fh.patch #+experimental
    trac_13580-map_reduce-old-fh.patch

Answer 2 · 2014-04-07T21:54:44.000Z

Branch: u/hivert/13580/map_reduce

Answer 3 · 2014-04-09T02:56:21.000Z

New commits:

`693c672`	`Imported code from trac_13580-map_reduce-fh.patch + fixed multiline doctests.`

Answer 4 · 2014-04-09T02:56:21.000Z

Commit: 693c672

Answer 5 · 2014-04-09T02:56:21.000Z

Changed keywords from map-reduce to map-reduce, days57

Answer 6 · 2015-03-06T11:16:51.000Z

Changed branch from u/hivert/13580/map_reduce to u/nthiery/13580/map_reduce

Answer 7 · 2015-03-09T15:47:18.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`addb17a`	`13580: Trivial rest fix`

Answer 8 · 2015-03-09T15:47:18.000Z

Changed commit from 693c672 to addb17a

Answer 9 · 2015-05-08T22:01:24.000Z

Changed branch from u/nthiery/13580/map_reduce to u/hivert/13580/map_reduce

Answer 10 · 2015-05-20T22:49:54.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`92e6e68`	`Merge branch 't/13580/map_reduce' into 13580`

Answer 11 · 2015-05-20T22:49:54.000Z

Changed commit from addb17a to 92e6e68

Answer 12 · 2015-05-20T23:06:24.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`734c748`	`#13580 Fixed test after merging #6637`

Answer 13 · 2015-05-20T23:06:24.000Z

Changed commit from 92e6e68 to 734c748

Answer 14 · 2015-05-21T07:51:43.000Z

comment:13

I saw the following typo in map reduce file while looking at the previous commit: "As an example, ee compute"

Answer 15 · 2015-05-21T15:42:06.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`f234f60`	`#13580 : improved documentation`

Answer 16 · 2015-05-21T15:42:06.000Z

Changed commit from 734c748 to f234f60

Answer 17 · 2015-11-23T11:06:25.000Z

Changed commit from f234f60 to 7a33037

Answer 18 · 2015-11-23T11:06:25.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`171be75`	`Merge branch 'master' into t/13580/13580/map_reduce`
`bfdf33e`	`Fixed naming convention`
`7a33037`	`Work in progress on the DOC`

Answer 19 · 2015-11-26T11:09:05.000Z

Changed commit from 7a33037 to 168ceae

Answer 20 · 2015-11-26T11:09:05.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`168ceae`	`Added architecture picture for map_reduce`

Answer 21 · 2015-12-01T08:51:02.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`366fc17`	`More Doc`
`6c12752`	`Tested timeout option`

Answer 22 · 2015-12-01T08:51:02.000Z

Changed commit from 168ceae to 6c12752

Answer 23 · 2015-12-14T13:02:10.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`493bb52`	`Done the doc of Map/Reduce`

Answer 24 · 2015-12-14T13:02:10.000Z

Changed commit from 6c12752 to 493bb52

Answer 25 · 2015-12-14T13:57:46.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`2534f12`	`Moved map/reduce to sage/parallel`

Answer 26 · 2015-12-14T13:57:46.000Z

Changed commit from 493bb52 to 2534f12

Answer 27 · 2015-12-14T14:07:27.000Z

Description changed:

--- 
+++ 
@@ -1,2 +1,2 @@
-Implement a map reduce algorithm in parallel on large sets described by a `SearchForest`.
+Implement a map reduce algorithm in parallel on large sets described by a `SearchForest`. We use a work-stealing algorithm (see https://en.wikipedia.org/wiki/Work_stealing) based on Python's multiprocessing.

Answer 28 · 2015-12-14T16:49:12.000Z

comment:21

The branch currently has conflicts with the development version of Sage (the branch link in the ticket description above is red). The branch seems to be based on 6.2.beta6, which is old and may explain the conflict.

Answer 29 · 2015-12-14T19:38:34.000Z

comment:22

Replying to @seblabbe:

The branch currently has conflicts with the development version of Sage (the branch link in the ticket description above is red). The branch seems to be based on 6.2.beta6, which is old and may explain the conflict.

Yes! There is a trivial conflict. Thank you for pointing it out. I'm fixing it, testing and re-uploading.

Answer 30 · 2015-12-14T20:29:38.000Z

Changed commit from 2534f12 to e5b7477

Answer 31 · 2015-12-14T20:29:38.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`e5b7477`	`Merge 6.10.rc1 + fixed conflict`

Answer 32 · 2015-12-14T20:53:14.000Z

Changed commit from e5b7477 to 82fd1e4

Answer 33 · 2015-12-14T20:53:14.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`82fd1e4`	`Fixed links according to deprecation/rebase`

Answer 34 · 2015-12-14T20:56:30.000Z

comment:25

Replying to @hivert:

Replying to @seblabbe:

The branch currently has conflicts with the development version of Sage (the branch link in the ticket description above is red). The branch seems to be based on 6.2.beta6, which is old and may explain the conflict.

Yes! There is a trivial conflict. Thank you for pointing it out. I'm fixing it, testing and re-uploading.

Done! I was actually based on 6.9.

Answer 35 · 2015-12-14T21:32:24.000Z

Changed commit from 82fd1e4 to 68b6530

Answer 36 · 2015-12-14T21:32:24.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`68b6530`	`Removed TODO in doc`

Answer 37 · 2015-12-14T22:18:13.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`b93ba7d`	`Fixed a link + doctest`

Answer 38 · 2015-12-14T22:18:13.000Z

Changed commit from 68b6530 to b93ba7d

Answer 39 · 2015-12-15T07:19:00.000Z

Changed commit from b93ba7d to 5c7720d

Answer 40 · 2015-12-15T07:19:00.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`5c7720d`	`Fixed doctest continuations and raise statements`

Answer 41 · 2015-12-15T07:20:34.000Z

comment:30

Replying to @sagetrac-git:


5c7720d	`Fixed doctest continuations and raise statements`

The patchbot was complaining about old-style multiline doctests and exception raises. Fixed and re-uploaded. It should be OK now.

Answer 42 · 2015-12-15T16:03:05.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`cb9c011`	`Fixed another multiline doctest`

Answer 43 · 2015-12-15T16:03:05.000Z

Changed commit from 5c7720d to cb9c011

Answer 44 · 2015-12-15T20:14:04.000Z

comment:32

I just looked at the code in the browser. Looks good. I will do real code review later this week hopefully. Quickly, I saw two typos:

First line of /src/sage/parallel/map_reduce.py : Paralell

Later in the same file I think: parameters tu the

Answer 45 · 2015-12-15T21:10:55.000Z

Changed commit from cb9c011 to 14b086b

Answer 46 · 2015-12-15T21:10:55.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`14b086b`	`Fixed Sebastien's Typos`

Answer 47 · 2015-12-15T21:14:26.000Z

comment:34

Replying to @seblabbe:

I just looked at the code in the browser. Looks good. I will do real code review later this week hopefully. Quickly, I saw two typos:

Fixed! I'm afraid you'll find more! By the way, I removed "distributed" from the title since I don't do distributed computation anymore. I had a prototype which allowed launching those computations on a cluster of machines, but it induced a huge performance loss, so I dropped it. Maybe we will work on this with Jeroen...

Answer 48 · 2015-12-16T11:34:47.000Z

comment:35

Dear Florent, I know that you like multiline doctests, but for two reasons I prefer to avoid them as much as possible in doctests:

It is not fun for the user who wants to adapt an example after copy-pasting it into the console. The up arrow brings back the whole multiline input at once, and it is not fun to change the value 10 to, let's say, 25.
Storing the inputs in variables before calling the method easily gives the user a lot of information: what the argument names are and in what order they should be used.

I know I am asking you to change your style by asking you this, but don't you agree with me or can you provide some counter-arguments to mine? I give an example below:

diff --git a/src/sage/combinat/backtrack.py b/src/sage/combinat/backtrack.py
index 9dee057..ac1243a 100644
--- a/src/sage/combinat/backtrack.py
+++ b/src/sage/combinat/backtrack.py
@@ -695,17 +695,19 @@ class SearchForest(Parent):
                    reduce_function = None,
                    reduce_init = None):
         r"""
-        Apply en Map Reduce algorithm on ``self``
+        Apply a Map Reduce algorithm on ``self``
 
         EXAMPLES::
 
-            sage: F = RecursivelyEnumeratedSet( [([i],i, i) for i in range(1,10)],
-            ....:     lambda (list, sum, last):
-            ....:         [(list + [i], sum + i, i) for i in range(1,last)],
-            ....:     structure='forest', enumeration='depth')
+            sage: seeds = [([i],i, i) for i in range(1,10)]
+            sage: succ = lambda (list, sum, last): 
+            ....:               [(list + [i], sum + i, i) for i in range(1,last)]
+            sage: F = RecursivelyEnumeratedSet(seeds, succ, 
+            ....:                       structure='forest', enumeration='depth')
             sage: y = var('y')
-            sage: F.map_reduce(
-            ....:     lambda (li, sum, _): y**sum, lambda x,y: x + y,  0 )
+            sage: map_function = lambda (li, sum, _): y**sum
+            sage: reduce_function = lambda x,y: x + y
+            sage: F.map_reduce(map_function, reduce_function, 0)
             y^45 + y^44 + y^43 + 2*y^42 + 2*y^41 + 3*y^40 + 4*y^39 + 5*y^38 + 6*y^37 + 8*y^36 + 9*y^35 
 
         .. SEEALSO:: :mod:`sage.parallel.map_reduce`

I noticed another typo on the same line as the "parameters tu the" one: On -> One
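
The doctest in the diff above relies on tuple-unpacking lambdas (`lambda (list, sum, last): ...`), which Python 3 no longer accepts. As a hedged illustration of what that example computes, here is a plain-Python sketch without Sage: the polynomial in `y` is replaced by a `Counter` mapping each exponent to its coefficient, and `map_reduce` is a hypothetical sequential helper, not the method from the branch.

```python
from collections import Counter

# Each node is (list, sum, last), as in the doctest of the diff.
seeds = [([i], i, i) for i in range(1, 10)]


def succ(node):
    lst, total, last = node          # explicit unpacking instead of a tuple lambda
    return [(lst + [i], total + i, i) for i in range(1, last)]


def map_function(node):
    _, total, _ = node
    return Counter({total: 1})       # stands for the monomial y**total


def reduce_function(x, y):
    return x + y                     # Counter addition adds coefficients


def map_reduce(seeds, succ, map_function, reduce_function, reduce_init):
    """Sequential depth-first map-reduce (hypothetical helper)."""
    result = reduce_init
    todo = list(seeds)
    while todo:
        node = todo.pop()
        result = reduce_function(result, map_function(node))
        todo.extend(succ(node))
    return result


coeffs = map_reduce(seeds, succ, map_function, reduce_function, Counter())
```

Here `coeffs[k]` is the coefficient of `y^k`, so the leading terms `y^45 + y^44 + y^43 + 2*y^42 + ...` of the doctest output correspond to `coeffs[45]`, `coeffs[44]`, `coeffs[43]`, `coeffs[42]`, and so on.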

Answer 49 · 2015-12-16T23:27:20.000Z

comment:36

Replying to @seblabbe:

Dear Florent, I know that you like multiline doctests, but for two reasons I prefer to avoid them as much as possible in doctests:

It is not fun for the user who wants to adapt an example after copy-pasting it into the console. The up arrow brings back the whole multiline input at once, and it is not fun to change the value 10 to, let's say, 25.

I'm tempted to answer that I don't have this problem with emacs ;-). Of course, for the people using a two-letter editor...

Storing the inputs in variables before calling the method easily gives the user a lot of information: what the argument names are and in what order they should be used.

I'm much more convinced by this argument. I'm watching an exam tomorrow evening. I'll change my code accordingly.

Answer 50 · 2015-12-17T13:51:09.000Z

comment:37

Replying to @hivert:

I'm much more convinced by this argument. I'm watching an exam tomorrow evening. I'll change my code accordingly.

Actually, I'm not that convinced. It's not clear to me that

      sage: map_function = lambda x: 1
      sage: reduce_function = lambda x,y: x+y
      sage: reduce_init = 0
      sage: S.map_reduce(map_function, reduce_function, reduce_init)
      131071

is much better than

      sage: S.map_reduce(
      ....:   map_function = lambda x: 1,
      ....:   reduce_function = lambda x,y: x+y,
      ....:   reduce_init = 0 )

In this second version which is the style I adopted in map_reduce.py, we don't repeat the name twice, which is better in my opinion.

A third possibility is

      sage: S.map_reduce(lambda x: 1, lambda x,y: x+y, 0)

It is the one used throughout the backtrack.py file. Except for very short examples, I don't find it very readable. But since this is not adding new code but changing old code, which needs even more cleanup and doc improvement, I'd rather keep it for another ticket. More precisely, I'd rather do that during #16351.

What do you think ?

Answer 51 · 2015-12-17T13:53:26.000Z

comment:38

I'm not sure I understand the patchbot complaint. Is it because map_reduce is not imported at Sage's startup?

Answer 52 · 2015-12-17T14:02:02.000Z

comment:39

I just uploaded a doc improvement: the map_reduce function was missing the description of the input.

Answer 53 · 2015-12-17T14:02:36.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`e3c4c71`	`Doc improvements`

Answer 54 · 2015-12-17T14:02:36.000Z

Changed commit from 14b086b to e3c4c71

Answer 55 · 2015-12-17T18:13:02.000Z

Changed commit from e3c4c71 to 31c4735

Answer 56 · 2015-12-17T18:13:02.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`31c4735`	`Removed unneded import in pxd`

Answer 57 · 2015-12-18T14:23:10.000Z

comment:42

Replying to @hivert:

What do you think ?

I think that some doctests are more important than others, especially the first doctests that people can be expected to see when using the map reduce code. For example, the map_reduce method in the recursively enumerated set file is one possible entry point where clarity of doctests matters more. Therefore, I like the way you changed the doctest. The top of the module is definitely an important entry point. For other sub-methods of the map reduce class, I care less...

Answer 58 · 2015-12-18T14:23:50.000Z

comment:43

Replying to @hivert:

I'm not sure I understand the patchbot complaint. Is it because map_reduce is not imported at Sage's startup?

I don't see why either...

Answer 59 · 2015-12-18T16:17:51.000Z

comment:44

Event, Condition, time and os are imported in map_reduce file but are not used.

Answer 60 · 2015-12-18T16:20:34.000Z

comment:45

Is there a reason why you use res=[""] and res[0] in print_communication_statistics instead of res="" and res?

Answer 61 · 2015-12-18T17:32:55.000Z

comment:46

Replying to @seblabbe:

Is there a reason why you use res=[""] and res[0] in print_communication_statistics instead of res="" and res?

Yes! This is the classic trick for sharing a local variable with a nested function (see e.g. http://stackoverflow.com/questions/2609518/python-nested-function-scopes).
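
The trick discussed here can be sketched as follows. The function names are hypothetical and this is not the body of print_communication_statistics. A nested function cannot rebind a variable of the enclosing function with a plain assignment, but it can mutate a list held in that variable. Python 3 also offers `nonlocal`, which Python 2 does not have.

```python
def collect_with_list_cell(items):
    res = [""]                       # one-element list used as a mutable cell

    def add(item):
        # Mutates the list in place; `res` itself is never rebound,
        # so this works in both Python 2 and Python 3.
        res[0] += str(item) + "\n"

    for item in items:
        add(item)
    return res[0]


def collect_with_nonlocal(items):
    res = ""

    def add(item):
        nonlocal res                 # Python 3 only: rebind the enclosing variable
        res += str(item) + "\n"

    for item in items:
        add(item)
    return res
```

Writing `res += ...` inside `add` without the list cell or `nonlocal` would raise `UnboundLocalError`, because the assignment makes `res` local to `add`.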

Answer 62 · 2015-12-18T17:49:04.000Z

comment:47

Replying to @seblabbe:

Event, Condition, time and os are imported in map_reduce file but are not used.

Fixed in my last push.

Answer 63 · 2015-12-18T17:49:09.000Z

Changed commit from 31c4735 to cca7881

Answer 64 · 2015-12-18T17:49:09.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`cca7881`	`Removed unneeded import`

Answer 65 · 2015-12-18T17:52:51.000Z

comment:49

Replying to @seblabbe:

I don't see why either...

According to Vincent on Sage devel it is probably a bug of the patchbot (see https://groups.google.com/forum/#!topic/sage-devel/xlpLwktA5Hk).

Answer 66 · 2015-12-21T10:56:59.000Z

comment:50

Replying to @hivert:

Replying to @seblabbe:

Is there a reason why you use res=[""] and res[0] in print_communication_statistics instead of res="" and res?

Yes ! This is the classical trick to have a local variable shared with the local function (see e.g: http://stackoverflow.com/questions/2609518/python-nested-function-scopes).

Could you add this information inside the code of that method?

Answer 67 · 2015-12-21T13:44:49.000Z

comment:51

The map_reduce method of SearchForest should document what happens when it is called with no arguments, i.e. that it returns the cardinality. More precisely, it should say what the value None for the arguments map_function and reduce_function is replaced by.

Answer 68 · 2015-12-21T13:54:26.000Z

comment:52

On can use the three following parameters: -> One can use the three following parameters:

Answer 69 · 2015-12-21T14:09:19.000Z

comment:53

I get the following doctest errors on top of sage-6.10 on my machine:

Git branch: 13580
Using --optional=gcc,mpir,python2,sage
Doctesting 1 file.
sage -t --warn-long 85.4 map_reduce.py
**********************************************************************
File "map_reduce.py", line 960, in sage.parallel.map_reduce.RESetMapReduce._signal_task_start
Failed example:
    S._active_tasks
Expected:
    <Semaphore(value=2)>
Got:
    <Semaphore(value=unknown)>
**********************************************************************
File "map_reduce.py", line 964, in sage.parallel.map_reduce.RESetMapReduce._signal_task_start
Failed example:
    S._active_tasks
Expected:
    <Semaphore(value=3)>
Got:
    <Semaphore(value=unknown)>
**********************************************************************
File "map_reduce.py", line 985, in sage.parallel.map_reduce.RESetMapReduce._signal_task_done
Failed example:
    S._active_tasks
Expected:
    <Semaphore(value=2)>
Got:
    <Semaphore(value=unknown)>
**********************************************************************
File "map_reduce.py", line 989, in sage.parallel.map_reduce.RESetMapReduce._signal_task_done
Failed example:
    S._active_tasks
Expected:
    <Semaphore(value=1)>
Got:
    <Semaphore(value=unknown)>
**********************************************************************
2 items had failures:
   2 of   9 in sage.parallel.map_reduce.RESetMapReduce._signal_task_done
   2 of   7 in sage.parallel.map_reduce.RESetMapReduce._signal_task_start
    [249 tests, 4 failures, 43.04 s]
----------------------------------------------------------------------
sage -t --warn-long 85.4 map_reduce.py  # 4 doctests failed
----------------------------------------------------------------------

It seems OK on the patchbot, which reports seemingly unrelated errors:

----------------------------------------------------------------------
sage -t --long src/sage/interfaces/expect.py  # Timed out
sage -t --long src/sage/interfaces/gap.py  # 10 doctests failed
----------------------------------------------------------------------

Answer 70 · 2015-12-21T14:24:23.000Z

comment:54

I divided the review into a stack of tasks, but since I am unable to parallelize my working time, I did the review serially, with many interruptions, over the previous days :)

I have finished looking at what I wanted to look at. The documentation builds fine and helps to understand what is happening. I was able to test the parallel computation on at least one extensive example. To me, it will be a positive review once my previous four comments are answered.

Answer 71 · 2015-12-21T14:49:03.000Z

comment:55

Replying to @seblabbe:

I get the following doctests error on top on sage-6.10 on my machine:
Expected:
    <Semaphore(value=1)>
Got:
    <Semaphore(value=unknown)>

Is this happening on MacOS? I know that semaphores are implemented differently on this system. See in particular the note after class multiprocessing.BoundedSemaphore, which mentions:

on Unix platforms like Mac OS X where sem_getvalue() is not implemented.

If so, and if the rest behaves correctly, I'm tempted to replace the doctests with

    <Semaphore(value=...)>

What do you think ?
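
The effect of the proposed `<Semaphore(value=...)>` output can be checked with the standard doctest module. This is a sketch that assumes the doctest framework in use treats `...` as a wildcard, which plain Python doctest does only when the `ELLIPSIS` option is set. The helper name `matches` is hypothetical.

```python
import doctest

checker = doctest.OutputChecker()
want = "<Semaphore(value=...)>\n"    # expected output written with an ellipsis


def matches(got, flags=doctest.ELLIPSIS):
    """Return True if `got` is accepted as the expected output `want`."""
    return checker.check_output(want, got, flags)


linux_like = matches("<Semaphore(value=2)>\n")
macos_like = matches("<Semaphore(value=unknown)>\n")
strict = matches("<Semaphore(value=2)>\n", flags=0)   # no ellipsis matching
```

With the ellipsis, both the Linux-style and the MacOS-style representations are accepted, at the price of no longer checking the actual value of the semaphore in those doctests.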

Answer 72 · 2015-12-21T14:58:26.000Z

comment:56

Replying to @seblabbe:

map_reduce method of SearchForest should document what happen when this method is called with no arguments, i.e. returns the cardinality. More precisely, it should say that the value None for arguments map_function and reduce_function is replaced by what?

I didn't document this one because I was not sure it was a sensible default. If you think it is, then let's go for it.

By the way thanks for your careful review !!!

Florent

Answer 73 · 2015-12-21T15:17:13.000Z

Changed commit from cca7881 to 1a8b78e

Answer 74 · 2015-12-21T15:17:13.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`1a8b78e`	`- MacOSX compatible doctests for semaphore.`

Answer 75 · 2015-12-21T15:18:35.000Z

comment:58

I uploaded a patch which should address all your concerns:

tests on MacOSX
typos
default values documentation

Answer 76 · 2015-12-21T19:25:20.000Z

comment:59

Introduced typo in the previous commit: the function costant to 1. -> constant

Answer 77 · 2015-12-21T19:29:59.000Z

comment:60

Replying to @hivert:

Replying to @seblabbe:
I get the following doctests error on top on sage-6.10 on my machine:
Expected:
    <Semaphore(value=1)>
Got:
    <Semaphore(value=unknown)>
Is this happening on MacOS ?

Yes

If so and if the rests behave correctly, I'm tempted to replace the doctests with
    <Semaphore(value=...)>
What do you think ?

I agree. All tests passed now...

Answer 78 · 2015-12-21T19:33:59.000Z

comment:61

Replying to @hivert:

I didn't document this one because I was not sure it was a sensible default. If you think it is, then let's go for it.

I think it is a good default. Also, sometimes, you may want to set only some of the arguments and this information allows it. Finally, it's good to see that cardinality is a special case of this method.
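
To make the "cardinality as a special case" remark concrete, here is a sequential sketch in plain Python. The thread states that calling the method with no arguments returns the cardinality and that the default map function is constant to 1. The default reduce function (addition) and initial value (0) used below are assumptions consistent with that, and `map_reduce` is a hypothetical stand-alone function, not the Sage method.

```python
def map_reduce(roots, children,
               map_function=None, reduce_function=None, reduce_init=None):
    """Sequential map-reduce over a forest, with cardinality as the default."""
    if map_function is None:
        map_function = lambda node: 1            # the function constant to 1
    if reduce_function is None:
        reduce_function = lambda x, y: x + y     # assumed default: addition
    if reduce_init is None:
        reduce_init = 0                          # assumed default: zero
    result = reduce_init
    todo = list(roots)
    while todo:
        node = todo.pop()
        result = reduce_function(result, map_function(node))
        todo.extend(children(node))
    return result


def children(l):
    # Binary lists of length at most 16, as in the RESetMapReduce example.
    return [l + [0], l + [1]] if len(l) <= 15 else []


cardinality = map_reduce([[]], children)
total_length = map_reduce([[]], children, map_function=len)
```

With all defaults, the result is the number of nodes, 131071 = 2^17 - 1 for this forest, which agrees with the value shown in the doctests quoted in this thread. Overriding only `map_function` changes what is summed while keeping the other defaults.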

Answer 79 · 2015-12-21T19:51:51.000Z

Reviewer: Sébastien Labbé

Answer 80 · 2015-12-22T12:53:19.000Z

comment:63

Just to be clear: I am waiting for the costant -> constant fix to change the status to positive review.

Answer 81 · 2015-12-24T08:16:30.000Z

Changed branch from u/hivert/13580/map_reduce to u/nthiery/13580/map_reduce

Answer 82 · 2015-12-24T08:18:23.000Z

comment:65

Typo fixed, hence changing the status to positive review on Sebastien's behalf.

New commits:

`98fd9e1`	`13580: typo fix`

Answer 83 · 2015-12-24T08:18:23.000Z

Changed commit from 1a8b78e to 98fd9e1

Answer 84 · 2015-12-24T17:13:25.000Z

comment:66

Fails on OSX: http://build.sagedev.org/release/builders/%20%20fast%20Volker%20MiniMac%20%28OSX%2010.10%20x86_64%29%20incremental/builds/636/steps/shell_4/logs/stdio

Answer 85 · 2015-12-26T14:25:08.000Z

comment:67

Hello and merry Christmas everyone,

I have some questions on this ticket.

Firstly, I don't understand the documentation starting at line 571:

    Decription of the map/reduce operation:

    - ``map_function=f`` -- (default to ``None``)
    - ``reduce_function=red`` -- (default to ``None``)
    - ``reduce_init=init`` -- (default to ``None``)

What do f, red and init mean? In my opinion, f, red and init are irrelevant, but an example would be interesting at this point:

(copy-paste of one in the module documentation)

 sage: from sage.parallel.map_reduce import RESetMapReduce
 sage: S = RESetMapReduce(
 ....:   roots = [[]],
 ....:   children = lambda l: [l+[0], l+[1]] if len(l) <= 15 else [],
 ....:   map_function = lambda x : 1,
 ....:   reduce_function = lambda x,y: x+y,
 ....:   reduce_init = 0 )
 sage: S.run()
 131071

OK, there is the seealso which says to look at the module documentation for examples, but one example here would be interesting, no?

Then, it seems there are some useless imports (like line 1345, from multiprocessing import current_process). (I will try to compile Sage soon and I can remove all the useless imports if you want.)

Finally, I'm not sure I understand the use of AbortError. Is it used to abort the computation and to tell the workers and the thieves that everything is done?

Jean-Baptiste

Answer 86 · 2015-12-28T09:48:47.000Z

comment:68

Replying to @vbraun:

Fails on OSX: http://build.sagedev.org/release/builders/%20%20fast%20Volker%20MiniMac%20%28OSX%2010.10%20x86_64%29%20incremental/builds/636/steps/shell_4/logs/stdio

I can not reproduce this problem on my Mac OS X Yosemite 10.10.2:

$ sage -tp --long src/sage/parallel/map_reduce.py 
Running doctests with ID 2015-12-28-10-44-31-65e477d9.
Git branch: 13580
Using --optional=gcc,mpir,python2,sage
Doctesting 1 file using 2 threads.
sage -t --long --warn-long 85.0 src/sage/parallel/map_reduce.py
    [252 tests, 40.89 s]
----------------------------------------------------------------------
All tests passed!
----------------------------------------------------------------------

Answer 87 · 2016-01-06T10:56:19.000Z

comment:69

Hello,

I can reproduce test failures on El Capitan (os x):

I ran 3 tests and obtained 3 distinct failures. My problem is that if I execute the failing lines in a Sage terminal (several times), they do not fail!

Is it possible that the test environment is not robust, especially on Mac OS?

Answer 88 · 2016-01-11T09:17:16.000Z

comment:70

I read the code and I don't know how to understand this:

res = post_process(node)
if res is not None:
    self._res = reduc(self._res, mapp(res))

in the worker code of the walk_branch_locally method.

I think this is useless, no? If not, then the documentation of the post_process function should be improved to specify that there is a specific behavior when post_process returns None.

Answer 89 · 2016-01-11T09:33:26.000Z

comment:71

Replying to @sagetrac-elixyre:

res = post_process(node)
if res is not None:
    self._res = reduc(self._res, mapp(res))
I think this is useless, no? Or if not then the documentation of post_process function should be improve to specify that if post_process returns None then there is a specific behavior.

OK, this behavior is specified in the documentation of RESetMapReduce, but it is not specified in the corresponding paragraph (line 135) of this module.
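
The role of the `if res is not None` test can be sketched as follows. This is a hypothetical sequential version of the quoted snippet, not the worker code itself. It assumes that the children of a filtered-out node are still explored, which the thread does not state.

```python
def walk(roots, children, post_process, mapp, reduc, init):
    """Depth-first walk where post_process may discard nodes by returning None."""
    res = init
    todo = list(roots)
    while todo:
        node = todo.pop()
        out = post_process(node)
        if out is not None:               # None means: skip this node in the result
            res = reduc(res, mapp(out))
        todo.extend(children(node))       # assumption: children are explored anyway
    return res


def children(l):
    return [l + [0], l + [1]] if len(l) < 3 else []


def keep_full_length(l):
    # Keep only the binary lists of length exactly 3, discard the others.
    return l if len(l) == 3 else None


n_leaves = walk([[]], children, keep_full_length,
                mapp=lambda x: 1, reduc=lambda x, y: x + y, init=0)
```

Under these assumptions, `post_process` acts as a filter: intermediate nodes are needed to reach the leaves, but only the nodes for which it returns something other than None contribute to the reduction.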

Answer 90 · 2016-01-11T09:35:44.000Z

comment:72

Furthermore the lines:

for child in newnodes: 
    self._todo.append(child)

of walk_branch_locally should be replaced by

self._todo.extend(newnodes)

(I just noticed this while reading; if you agree with my remark, I can make the replacement.)
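
A quick check that the two forms are equivalent, shown here on a `collections.deque` and on a list. The actual type of `self._todo` is not given in this thread, so both containers are only assumptions for illustration.

```python
from collections import deque

newnodes = [[0], [1], [2]]

# Element-by-element append, as in the current code.
todo_loop = deque([["seed"]])
for child in newnodes:
    todo_loop.append(child)

# Single call to extend, as proposed.
todo_extend = deque([["seed"]])
todo_extend.extend(newnodes)

# The same holds for a plain list.
list_loop = [["seed"]]
for child in newnodes:
    list_loop.append(child)
list_extend = [["seed"]]
list_extend.extend(newnodes)
```

Both containers end up with the same elements in the same order, so the change is a matter of readability and of avoiding a Python-level loop.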

Answer 91 · 2016-01-11T10:08:16.000Z

comment:73

Coming back to the handling of None: if I assume some users want to use nodes which could be None, then my first idea is to use the post_process function to avoid the problem, but... if the thief doesn't receive a node, then None is sent to say that there is no job to do.

I propose two solutions:

first one, to specify that None is never a node! (this is assumed, but it must be made explicit in the documentation),
second one (the object way), to define an object NoTask.

Opinion?
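
For reference, the second option could look like the sketch below. `NO_TASK` and `steal` are hypothetical names, and this is not code from the branch. It uses a unique sentinel object compared by identity, so that None stays available as an ordinary node.

```python
from collections import deque

NO_TASK = object()   # hypothetical sentinel: unique, never equal to a real node


def steal(todo):
    """Return a node taken from `todo`, or NO_TASK if there is nothing to steal."""
    return todo.popleft() if todo else NO_TASK


got_none_node = steal(deque([None, [0]]))   # None is a legitimate node here
got_nothing = steal(deque())                # empty queue: no job to do
```

The caller then tests `result is NO_TASK` instead of `result is None`. The first option, declaring that None is never a node, needs no code change at all, only a sentence in the documentation.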

Answer 92 · 2016-01-12T08:44:42.000Z

comment:74

Replying to @sagetrac-elixyre:

Go back to None activities, if I assume some users want to use nodes which could be None then my first idea is to use the post_process function to avoid problem but... if the thief doesn't receive then he send None and he says there is no job to do.

I propose two solutions:

first one, to specify that None is not a node never! (this is assumed but this must be explicited in the documentation),

I'm clearly in favor of this solution. We assume that None doesn't belong to the set described by the underlying RESet. Note that this concerns the underlying RESet more than the map-reduce computation.

Answer 93 · 2016-01-12T09:55:17.000Z

Changed branch from u/nthiery/13580/map_reduce to u/hivert/13580/map_reduce

Answer 94 · 2016-01-12T09:56:33.000Z

New commits:

`6e3bbd3`	`More logging for debugging`

Answer 95 · 2016-01-12T09:56:33.000Z

Changed author from Florent Hivert, Nathann Cohen to Florent Hivert, Jean-Baptiste Priez, Nathann Cohen

Answer 96 · 2016-01-12T09:56:33.000Z

Changed commit from 98fd9e1 to 6e3bbd3

Answer 97 · 2016-01-12T10:02:13.000Z

Branch pushed to git repo; I updated commit sha1. New commits:

`98fd9e1`	`13580: typo fix`
`6b4a5cf`	`Merge branch 'u/nthiery/13580/map_reduce' into t/13580/map_reduce`

Answer 98 · 2016-01-12T10:02:13.000Z

Changed commit from 6e3bbd3 to 6b4a5cf

Answer 99 · 2016-01-15T15:53:49.000Z

comment:78

Replying to @vbraun:

Fails on OSX: http://build.sagedev.org/release/builders/%20%20fast%20Volker%20MiniMac%20%28OSX%2010.10%20x86_64%29%20incremental/builds/636/steps/shell_4/logs/stdio

We finally found the problem with Jean-Baptiste. It turns out that semaphores are broken on MacOSX (or at least are not fully POSIX compliant). In particular, on standard Unixes, when two processes try to acquire a semaphore whose value is more than two, they always both succeed. On MacOS, one of them may fail. As a consequence, I'm writing different code for MacOS, relying on a Lock and a shared integer. It may be slower on systems where semaphores are implemented in a lockless way.

Is there a standard Sage/Python way to check if we are on MacOS?
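
As a rough illustration of both points, here is a sketch of a counter built from a Lock and a shared integer, together with a plain-Python platform check. The class and method names are hypothetical and this is not the code that ended up in the branch. The only facts relied on are that `sys.platform` equals `"darwin"` on MacOS and that `multiprocessing.Value(..., lock=False)` returns a raw shared ctypes value.

```python
import multiprocessing
import sys


def is_macos():
    """Return True when running on MacOS (sys.platform is 'darwin' there)."""
    return sys.platform == "darwin"


class LockedTaskCounter:
    """Hypothetical active-task counter: a shared int protected by a Lock."""

    def __init__(self, initial=0):
        # Raw shared integer; all synchronization goes through self._lock.
        self._value = multiprocessing.Value("i", initial, lock=False)
        self._lock = multiprocessing.Lock()

    def task_start(self):
        """Record that a task started and return the new count."""
        with self._lock:
            self._value.value += 1
            return self._value.value

    def task_done(self):
        """Record that a task finished and return the new count."""
        with self._lock:
            self._value.value -= 1
            return self._value.value

    def get(self):
        """Return the current count."""
        with self._lock:
            return self._value.value
```

Unlike a semaphore, whose value cannot be read on MacOS, such a counter can always report its current value, at the cost of taking the lock for every update. `platform.system() == "Darwin"` is another standard-library way to write the same platform test.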

Answer 100 · 2016-01-17T20:00:07.000Z

New commits:

`44566e6`	`ticket search forest map reduce with no problem on mac os x`