kennetpostigo/lwt-node

Implementation: replace most uses of `try_bind` with `on_any`

aantron opened this issue · 17 comments

Taking Fs.read as an example:

https://github.com/kennetpostigo/reason-node/blob/0fa0a310f8ef82c2ae77ee9deefb0ac47b115a07/src/fs.re#L281-L295

This calls try_bind, so the whole function has type arguments => arguments => Lwt.t unit, i.e. it returns a final promise. This final promise represents the whole computation of (1) asking Lwt_unix to perform the read, (2) running the appropriate user callback after that. AFAIK we don't really want that final promise, we want the return type to be just unit.

You can get that by replacing try_bind with on_any.

Either way, I think there also needs to be special handling of exceptions thrown by the user's callbacks. With on_any, if any exception reaches the implementation of on_any (inside Lwt), Lwt really has nothing to do with it except terminate the entire process – it's an unhandled exception that has nowhere to go. By comparison, try_bind creates that final promise, so it stuffs unhandled exceptions into it, by rejecting it. However, in reason-node, the user will always ignore the final promise, so the exception will be silently lost, which is arguably even worse.

I guess reason-node needs to emulate whatever Node does when an exception occurs in a callback.

@aantron Yea, totally right. We definitely want to return unit instead of a promise. Silently eating errors is something JS is really good at and people despise this so we definitely don't want to to just eat them.

In the current code if an error arises the "catch" code runs and calls the developers callback with the error for them to handle.

https://github.com/kennetpostigo/reason-node/blob/0fa0a310f8ef82c2ae77ee9deefb0ac47b115a07/src/fs.re#L288-L294

Is there a strategy with on_any to pass the error to the user for them to handle the exceptions/error?

I think translating the current code to on_any would looks something like:

let read fd buffer offset length position callback => { 
  Lwt.on_any 
    (fun () => Lwt_unix.read fd buffer offset length) 
    (fun buff => { 
      callback Ok (Some buff); 
      Lwt.return (); 
    }) 
    (fun 
      | Unix.Unix_error e _ _ => { 
          callback (Err e) None; 
          Lwt.return (); 
        } 
      | exn => Lwt.fail exn 
    ); 
}; 

I literally just changed the name, but haven't ran the code. Tests of on any seem to be the same as this.

When translating to on_any, you need to also not generate promises at the end of your wrapped callbacks (so no Lwt.return (), Lwt.fail exn).

The exceptions I am referring to aren't the ones raised by Lwt_unix.read and other I/O functions, but rather the ones raised inside the user's callback. There is no problem with handling I/O errors: both try_bind and on_any cause callback to run in response to those. The question is, what should happen if callback itself raises an exception?

In particular, neither try_bind, nor on_any, run the failure version of callback if the success version of callback fails, nor do they run the failure version again if the failure version fails. I think this is also true in the Node.js API.

Oh, the other thing with on_any is that the first argument is an Lwt.t 'a, not unit => Lwt.t 'a, so you'll want to do something like

Lwt.on_any (Lwt.wrap (fun () => Lwt_unix.read ...)) ...

As much as I dislike the existence of wrap in the API. I actually think you should be able to get away with

Lwt.on_any (Lwt_unix.read ...)

without the wrap, if read is written correctly, and if it's not written correctly, we should fix it upstream in Lwt.

@aantron In nodejs I think the app crashes with the exceptions. I know promises do crash the application if unhandled in js. There are so many things that a developer could be doing in the callback, not sure what the best way to handle exceptions in ocaml, is there a wildcard for exceptions? As in all exceptions inherit from a "root" exception that we can catch?

Well, there is the wildcard pattern, which you are already using :)

try
  stuff
with _ =>
  failed

What do you mean promises crash the application? I thought raising an exception in a .then callback causes the promise to get rejected?

If Node.js callbacks aren't allowed to raise exceptions, and violating this crashes the application, then it seems you already have the right behavior by default from Lwt. The only thing to watch out for, is that if someone changes Lwt.async_exception_hook, they will affect both the behavior of Lwt code (as expected) and reason-node (which they might not intuitively expect).

@aantron what I meant is that if promises don't have a catch statement or a second argument to the then statement then the app will crash with unhandled promise rejection.

Ah, okay, thank you :)

@aantron I don't believe there is a way to guard against someone changing Lwt.async_exception_hook right? Since it is global?

Yep, that's right. But you could isolate reason-node from it by writing your own handlers inside on_any. Not saying it's actually worth it (don't know).

I think for the initial release, will leave it as is, add a note for people and a link to open an issue if this is something that devs really want. Does that sound like a good idea?

Yep, I think so. It might not even be a problem for anyone, just being thorough :)

@aantron I was also thinking, what if I wrap Lwt.run with try catch? I think that would catch exceptions at the outer boundary?

That would have the benefit of returning Lwt `t a that Lwt.run expects

I'm assuming that the outer boundary is some top-level main function in the user's code. Lwt_main.run only translates the rejection of the one promise that you pass to it into an exception. Since on_any doesn't create a promise, and so doesn't put exceptions from callbacks into any promise, it wouldn't help with that. I'm not sure if you're suggesting it for some other purpose, though.

@aantron that makes sense, it wouldn't catch all the exceptions. I need to learn more about the Lwt internals and operators, lol. What would you recommend I do to keep this style for Node.run (which is just Lwt.run):

Node.run {
  Fs.mkdir "testDirAsync0" 416 (Fs.(fun
    | Ok => print_string "\n\n====OK====\n\n"
    | Err e => print_string "\n\n====ERR====\n\n"
  ));
};

since now all the operators return unit, I was thinking of wrapping it like so:

let run a => {
  Lwt_main.run {
    a;
    Lwt.return ();
  }
};

Looking at the usage example, it looks like you're passing unit to ReasonNode.run, which seems reasonable for the API you're designing. So you'd have

let run () =>
  (* ... Lwt_main.run somehow ... *)

You don't want to pass the result of Lwt.return () to Lwt_main.run, though, because Lwt_main.run only "runs" the I/O loop as long as the promise isn't resolved. Lwt.return () starts out already resolved. You want to do something like

let run () => Lwt_main.run (fst (Lwt.wait ())

Lwt.wait () creates a fresh pending promise, paired with its resolver. fst just forgets the resolver, so we have a forever-pending promise. This will make ReasonNode.run never return, so theoretically you should be able to give it the type

ReasonNode.run : unit => 'a

Does Node.js have some way of getting out of run? IIRC there is no run in Node, in which case whether you want ReasonNode.run to ever return or not depends on what your goals are :) You could make it exitable like so:

let (until_done, signal_done) = Lwt.wait ()

let run () => Lwt_main.run until_done

let stop () => Lwt.wakeup_later signal_done ()

i.e. all stop does is resolve a global (but hidden in the module) until_done promise, which Lwt_main.run is waiting for. Then you'd have

ReasonNode.run : unit => unit

@aantron node.js doesn't have a way to get out of run, since it's built into the runtime, however you can kill the process using exit:

process.exit() // passing 0 exit's with success, where passsing 1 exit's with failure.

The above code will exit node

The event loop is implicit, so the user never really sees it (it's built into runtime.)