issuu/ocaml-zmq

Implement zmq-lwt.

andersfugmann opened this issue · 20 comments

Just a heads up:

Now that we have a zmq-async, I have started working on zmq-lwt by creating a functor to abstract the concurrency monad.

This would essentially implement the same interface as https://github.com/hcarty/lwt-zmq/blob/master/src/lwt_zmq.ml, but keep the implementations identical, albeit factorized over the concurrency monads (Async_kernel.Deferred.t / Lwt.t).
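To sketch the idea (a minimal sketch with illustrative module names, not the actual signatures in the PR), the functor only needs the usual monad operations, and both Lwt.t and Async_kernel.Deferred.t provide them:

```ocaml
(* Minimal sketch: the monad signature such a functor could abstract
   over. MONAD and the instance modules below are illustrative names. *)
module type MONAD = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : 'a t -> ('a -> 'b t) -> 'b t
end

(* Lwt satisfies the signature directly... *)
module Lwt_monad : MONAD with type 'a t = 'a Lwt.t = struct
  type 'a t = 'a Lwt.t
  let return = Lwt.return
  let bind = Lwt.bind
end

(* ...and so does Async's Deferred, modulo the labelled argument. *)
module Deferred_monad : MONAD with type 'a t = 'a Async_kernel.Deferred.t =
struct
  type 'a t = 'a Async_kernel.Deferred.t
  let return = Async_kernel.Deferred.return
  let bind d f = Async_kernel.Deferred.bind d ~f
end
```

A single functor over MONAD can then produce both zmq-lwt and zmq-async from one shared implementation.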

I will create a branch today, as I do have some questions about both implementations.

Basic import PR is in as #57.

I have updated PR #56 with a version that I believe is working. However, it is currently completely untested, so I will need to write some tests.

@rgrinberg Can you give examples of idiomatic Lwt / Async in the context of the deferred bindings? I don't see why the bindings cannot be idiomatic for both.

Sure, so basically idiomatic Async accepts slightly different types than Lwt. For example, it should have a function to send an Iobuf.t, which is the preferred type for zero-copy IO. Another common type that Core writers accept is a bigstring/bigsubstring, which are just Core's wrappers for bigarrays. Basically this requires additions to the Async API, so it shouldn't really affect your functor work as far as I can tell.
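For concreteness, here is a hypothetical signature fragment for the kind of Async-only additions meant here (these function names do not exist in the PR; they just illustrate the idea):

```ocaml
(* Hypothetical Async-only additions; illustrative names, not the PR's API. *)
module type ASYNC_EXTRAS = sig
  (* Zero-copy send from an Iobuf's current window. *)
  val send_iobuf :
    'a ZMQ.Socket.t
    -> (Core.read, Core.Iobuf.seek) Core.Iobuf.t
    -> unit Async_kernel.Deferred.t

  (* Send from Core's bigstring wrapper over a bigarray. *)
  val send_bigstring :
    'a ZMQ.Socket.t -> Core.Bigstring.t -> unit Async_kernel.Deferred.t
end
```

Since these would live only in the Async interface, the shared functor work is unaffected.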

I completed what I believe is a safe version of zmq-async / zmq-lwt, both of which share the same implementation apart from the functorized monad. The implementation is based on what I can read from the ZMQ documentation and is exercised by the unit tests (which actually exposed some deadlock and fd bugs in the code). See #56

I elected to have two threads running in the background:
event
This thread monitors the state of the socket and takes the appropriate action: sending if there are senders on the queue and the socket accepts a send without blocking, and receiving if there are messages pending and waiters. The challenge is that the state of the socket can change through calls to send/recv and may not be reflected on the fd, so when we sleep on the fd, we must be able to be woken up if the socket is sent to or read from.

I could have elected to let every send/recv monitor the socket itself, but then every send/recv would wake up all other waiting threads, which is more work for the scheduler than needed.

fd_monitor
When asked (by notification on the fd_condition), the fd_monitor will send a signal to the event thread once the fd becomes readable. This makes it simple to "wait for the fd or another event" without spawning a new fd thread per wait (which also does not work well with Async).
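As a rough Lwt-flavoured sketch of the interplay (made-up names; the PR's actual code is functorized over the monad and more careful about when the fd is watched):

```ocaml
(* Rough sketch only: the two background threads described above. *)

(* Signalled by the event thread when it wants the fd watched. *)
let fd_condition : unit Lwt_condition.t = Lwt_condition.create ()

(* Signalled whenever the socket state may have changed, both by the
   fd_monitor below and directly by the send/recv wrappers. *)
let state_changed : unit Lwt_condition.t = Lwt_condition.create ()

(* fd_monitor: when asked, signal the event thread once the fd becomes
   readable, instead of spawning a new fd watcher per waiter. *)
let rec fd_monitor fd =
  let open Lwt.Infix in
  Lwt_condition.wait fd_condition >>= fun () ->
  Lwt_unix.wait_read fd >>= fun () ->
  Lwt_condition.signal state_changed ();
  fd_monitor fd

(* event thread: sleep on one condition, then re-examine the socket
   state and serve the queued senders and receivers. *)
let rec event_loop process_queues =
  let open Lwt.Infix in
  Lwt_condition.wait state_changed >>= fun () ->
  process_queues () >>= fun () ->
  event_loop process_queues
```

Because the send/recv wrappers also signal state_changed, the event thread wakes up even when a state change is not visible on the fd.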

There are still caveats. If the ZMQ.Socket is used both through async/lwt and outside of it, then the user may inadvertently make the system lose important events (reading state/sending/receiving), causing a deadlock.

Alternative approach
ZMQ supports poll, which resembles unix select. We could fire up a global, separate (unix) thread that does the poll and signals when a registered socket is ready to send or receive. When a deferred thread wants to send, it would (after consulting the state) register the socket with the 'global' thread.
(By global thread, I mean one thread per zmq context.)
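A hypothetical sketch of that poll loop (ZMQ.Poll usage from memory; notify and the registration mechanics are made up, and this approach was not implemented):

```ocaml
(* Hypothetical per-context poll thread; not the PR's design.
   [notify i event] would wake the deferred waiting on socket i. *)
let poll_loop masks notify =
  let poll_set = ZMQ.Poll.mask_of masks in
  let rec loop () =
    (* Blocks this (unix) thread until some registered socket is ready. *)
    let ready = ZMQ.Poll.poll poll_set in
    Array.iteri
      (fun i -> function
         | Some event -> notify i event
         | None -> ())
      ready;
    loop ()
  in
  loop ()
```

Registering new sockets while poll is blocked would need some interruption mechanism, which adds complexity of its own.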

I haven't had a chance to give #56 a try in real code yet but it seems very promising!

Does the alternative approach help in the case where a user may (hopefully accidentally...) mix async/lwt and vanilla calls to a socket?

No. Mixing lwt/async socket operations and vanilla (blocking) calls will fail horribly, as we might miss events. The documentation says that the state of the socket might change when calling send/receive and this might not be reflected on the fd.

Btw. I compared the test time on this branch with that of the current (master) zmq-lwt implementation, after increasing the number of messages by 10x (one of the tests now sends 8000 messages).
Master:

$ time /tmp/test.exe 
.......
Ran: 7 tests in: 1.86 seconds.
OK

real	0m1.894s
user	0m1.270s
sys	0m0.599s

New functorized implementation (#56):

$ time _build/default/zmq-lwt/test/test.exe
.......
Ran: 7 tests in: 0.29 seconds.
OK

real	0m0.323s
user	0m0.193s
sys	0m0.104s

The difference is really noticeable. When increasing by yet another order of magnitude, the current implementation takes forever (100% CPU), whereas this implementation completes in 1.2 seconds.

@andersfugmann That performance result looks amazing.

Regarding the two options, I'm not sure we can use a separate Unix thread for these operations as most zeromq sockets are not safe to share between threads. I don't know if that holds for polling as well as send/recv operations but it seems somewhat risky.

So I vote for the first, currently implemented option. Do you think your PR is ready for testing in an existing codebase? I can do some tests early next week if everything goes smoothly between now and then.

You are right that ZMQ might not be thread safe, so I also vote for the first (currently implemented) solution.

Yes - I do think the code is ready for serious testing.

The only thing missing is the Async sexp stuff, but I'll see if I can add it just for the Async interface, so as not to introduce a dependency on Jane Street libs in the Lwt implementation.

Some simple tests in an application which currently uses lwt-zmq (changing only Lwt_zmq to Zmq_lwt in the codebase) result in effectively the same performance, maybe a slight improvement, when there is little to no concurrency. I also haven't seen any change in behavior or stability in these tests. This is all running under a VM on my laptop.

I'm pretty happy with these results.

I'm happy to hear that. I think the speed increase in the tests is a result of only waking up one thread at a time to do a read/write, rather than all threads: with n waiters, waking all of them per message costs n wakeups per message, so n messages cost O(n^2) wakeups, whereas waking just one brings it down to O(n). Under low concurrency (few waiters and few messages), I would actually expect the difference to be negligible.

What is your take on a merge? Have you had a chance to take a look at the code in detail?

Btw. I give up on adding sexp of Socket.t for just zmq-async. If we want it in, maybe we should depend on Base and add it to both implementations. Adding it for one is a hassle and just creates extra maintenance.

I'm ok with a merge. Then we can add Msg.t support and any other missing pieces necessary from the core library.

I'd personally prefer to avoid adding a dependency on Base in the core library if we can help it. There are relatively frequent backwards-incompatible updates to Base and related libraries so it's easy for libraries depending on them to fall out of sync with the latest and greatest.

@rgrinberg Do you have some comments regarding the sexp stuff being removed before we merge?

#56 is merged. Closing.