ocaml-multicore/domainslib

Chan.recv seems to remove domain from pool

nikolaushuber opened this issue · 1 comment

I have written a small example to test the interaction of Chan.recv and task pools:

module T = Domainslib.Task
module C = Domainslib.Chan

(* Total number of domains, including the one that calls T.run. *)
let num_domains = Sys.argv.(1) |> int_of_string

let print_mutex = Mutex.create ()

(* Serialize output so lines printed by different domains do not interleave. *)
let print s =
  Mutex.lock print_mutex;
  print_endline s;
  Mutex.unlock print_mutex

let ping = C.make_bounded 1
let pong = C.make_bounded 1
let pang = C.make_unbounded ()

(* Loop forever: wait for a message on p, print name, then notify pang. *)
let run_async name p () =
  let rec f () : unit =
    C.recv p;
    print name;
    C.send pang ();
    f ()
  in
  f ()

let () =
  (* The domain calling T.run also takes part, hence num_domains - 1. *)
  let pool = T.setup_pool ~num_domains:(num_domains - 1) () in
  T.run pool (fun _ ->
    let _ = T.async pool (run_async "A" ping) in
    let _ = T.async pool (run_async "B" pong) in

    while true do
      C.send ping ();
      C.send pong ();
      C.recv pang
    done
  );
  T.teardown_pool pool

The output of the program above depends on the input parameter num_domains. When called with 1, the program blocks immediately. When called with 2, it outputs "A" "A" and then blocks. When called with 3 or more, it runs indefinitely, as expected. It seems to me that a call to Chan.recv on an empty channel blocks the current domain and effectively removes it from the pool.

Is that the expected behaviour?

I haven't gone through all the details, but I believe the problem is that Chan uses Stdlib Mutex and Condition to block. Waiting on those blocks the whole domain, so other fibers (spawned with async) that are scheduled on that domain cannot run in the meantime. Ideally your example would be able to run with just one domain (like the example I posted on discuss using Kcas). In the future, perhaps, domainslib internals will be rewritten using something like Picos and everything will just work™️. 😄
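
To illustrate the difference, here is a minimal stdlib-only sketch (OCaml 5) of a receive that suspends the fiber instead of blocking the domain. It is a toy under stated assumptions: Yield, send, recv, run_fibers, and demo are hypothetical names made up for this example, the channels are plain unbounded queues, and nothing here reflects how domainslib, Kcas, or Picos are actually implemented.

```ocaml
(* Toy sketch, stdlib only (OCaml 5). All names here are hypothetical and
   are not part of domainslib, Kcas, or Picos. *)
open Effect
open Effect.Deep

(* Effect a fiber performs to let other fibers on the same domain run. *)
type _ Effect.t += Yield : unit Effect.t

let send (q : 'a Queue.t) (v : 'a) : unit = Queue.push v q

(* Receive without blocking the domain: yield while the queue is empty. *)
let rec recv (q : 'a Queue.t) : 'a =
  match Queue.take_opt q with
  | Some v -> v
  | None ->
    perform Yield;
    recv q

(* Round-robin scheduler: runs every fiber on the current domain. *)
let run_fibers (fibers : (unit -> unit) list) : unit =
  let ready : (unit -> unit) Queue.t = Queue.create () in
  let handler =
    { effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Yield ->
          Some (fun (k : (a, unit) continuation) ->
            (* Park the fiber at the back of the ready queue. *)
            Queue.push (fun () -> continue k ()) ready)
        | _ -> None) }
  in
  List.iter
    (fun f -> Queue.push (fun () -> try_with f () handler) ready)
    fibers;
  let rec loop () =
    match Queue.take_opt ready with
    | Some job -> job (); loop ()
    | None -> ()
  in
  loop ()

(* Bounded version of the ping/pong/pang program from the issue. *)
let demo ~rounds : string list =
  let ping : unit Queue.t = Queue.create () in
  let pong : unit Queue.t = Queue.create () in
  let pang : unit Queue.t = Queue.create () in
  let log = ref [] in
  let worker name (inbox : unit Queue.t) () =
    for _i = 1 to rounds do
      recv inbox;
      log := name :: !log;
      send pang ()
    done
  in
  let main () =
    for _i = 1 to rounds do
      send ping ();
      send pong ();
      recv pang
    done
  in
  run_fibers [ main; worker "A" ping; worker "B" pong ];
  List.rev !log

let () = List.iter print_endline (demo ~rounds:2)
```

Because recv hands control back to the scheduler rather than waiting on a Condition, all three fibers make progress on a single domain and the program terminates. The loops are bounded by rounds only so that the sketch finishes; a fiber that waits forever would keep this toy scheduler spinning.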