ocaml-multicore/domainslib

Off-by-one error in `Task.parallel_scan`

jmid opened this issue · 3 comments

jmid commented

Consider the following program:

open Domainslib

let print_array a =
  let b = Buffer.create 25 in
  Buffer.add_string b "[|";
  Array.iter (fun elem -> Buffer.add_string b (string_of_int elem ^ "; ")) a;
  Buffer.add_string b "|]";
  Buffer.contents b

let scan_task num_doms =
  let pool = Task.setup_pool ~num_additional_domains:num_doms () in
  let a = Task.run pool (fun () -> Task.parallel_scan pool (+) (Array.make 10 1)) in
  Task.teardown_pool pool;
  Printf.printf "%i:  %s\n%!" num_doms (print_array a);
;;
for num_dom=0 to 10 do
  scan_task num_dom;
done

It runs a Task.parallel_scan with operator (+) - an operator which is both commutative and associative as required.

The result however varies with the the domain count:

$ dune exec ./simple.exe
0:  [|1; 2; 3; 4; 5; 6; 7; 8; 9; 10; |]
1:  [|1; 2; 3; 4; 5; 6; 7; 8; 9; 5; |]
2:  [|1; 2; 3; 4; 5; 6; 7; 8; 9; 10; |]
3:  [|1; 2; 3; 4; 5; 6; 7; 8; 9; 10; |]
4:  [|1; 2; 3; 4; 5; 6; 7; 8; 9; 10; |]
5:  [|1; 2; 3; 4; 5; 6; 7; 8; 9; 10; |]
6:  [|1; 2; 3; 4; 5; 6; 7; 8; 9; 10; |]
7:  [|1; 2; 3; 4; 5; 6; 7; 8; 9; 10; |]
8:  [|1; 2; 3; 4; 5; 6; 7; 8; 9; 10; |]
9:  [|1; 2; 3; 4; 5; 6; 7; 8; 9; 10; |]
Fatal error: exception Invalid_argument("index out of bounds")

Note how for ~num_additional_domains:1 the final array entry is wrong.
Also note how ~num_additional_domains:10 triggers an "index out of bounds" error (this can also be treated as a separate issue).

I'm experiencing this with both multicore 4.12.0+domains and a recent 4.14.0+domains,
with the latest domainslib from the repo,
all running on an Intel dual-core x86_64 Thinkpad (w/4 CPU threads) with Linux 5.4.0-91-generic.

@Sudha247 We had one-off errors previously as well. Do you know if these are related? Could you have a look at this, please?

Thanks for the report @jmid! Please see #60 for a proposed fix.

Fixed by #60.