Yuzhen11/flexps

Make the mailbox support general types

Yuzhen11 opened this issue · 0 comments

Currently the mailbox only supports the Message type, which is designed for communication between workers and servers. However, we may want to extend the mailbox to support general types in some cases, e.g., scheduler-to-worker communication, worker-to-worker communication, and hdfs_assigner-to-worker communication.

Option 1: Add a BinStream attribute to Message. Problems: 1. Message gains one more attribute, and its meaning becomes less clear. 2. BinStream cannot use the zero-copy functionality of the current mailbox (zmq_msg_init_data).

Option 2: Build a SarrayBinStream on top of SArray and provide a function that returns a Message from a SarrayBinStream. Problems: 1. The underlying data structure of SarrayBinStream is then SArray, which uses shared_ptr for reference counting. Updating the reference count requires synchronization, so this may be slower.
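To make Option 2 concrete, here is a minimal sketch of what such a stream could look like. It is not the project's code: `SharedBytes` is a hypothetical stand-in for SArray (a `std::shared_ptr` to a byte vector), the `Message` struct is a one-field placeholder for the real Message, and `ToMessage` / `FromMessage` are assumed names. The point it illustrates is that converting the stream to a Message only copies a shared pointer, so the payload itself is never copied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for SArray: a reference-counted byte buffer.
using SharedBytes = std::shared_ptr<std::vector<char>>;

// Hypothetical stand-in for Message: it shares the payload instead of copying it.
struct Message {
  SharedBytes data;
};

class SarrayBinStream {
 public:
  SarrayBinStream() : buffer_(std::make_shared<std::vector<char>>()) {}

  // Append the raw bytes of a trivially copyable value.
  template <typename T>
  SarrayBinStream& operator<<(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types are handled in this sketch");
    const char* p = reinterpret_cast<const char*>(&value);
    buffer_->insert(buffer_->end(), p, p + sizeof(T));
    return *this;
  }

  // Read the next value from the current read position.
  template <typename T>
  SarrayBinStream& operator>>(T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types are handled in this sketch");
    assert(read_pos_ + sizeof(T) <= buffer_->size());
    std::memcpy(&value, buffer_->data() + read_pos_, sizeof(T));
    read_pos_ += sizeof(T);
    return *this;
  }

  // Hand the payload to a Message: only the reference count changes.
  Message ToMessage() const { return Message{buffer_}; }

  // Adopt the payload of a received Message, again without copying bytes.
  void FromMessage(const Message& msg) {
    assert(msg.data != nullptr);
    buffer_ = msg.data;
    read_pos_ = 0;
  }

  // Number of bytes that have not been read yet.
  std::size_t Size() const { return buffer_->size() - read_pos_; }

 private:
  SharedBytes buffer_;
  std::size_t read_pos_ = 0;
};
```

A sender would write `stream << 42 << 3.5;` and pass `stream.ToMessage()` to the mailbox; the receiver would call `FromMessage` and read the values back with `>>`. The cost noted above shows up in `ToMessage` and `FromMessage`, where each shared-pointer copy updates the reference count.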

Option 3: Make the current mailbox depend only on a raw buffer (char*), not on SArray, and provide another level of abstraction so that both BinStream and SArray can use the mailbox. Problems: I do not know how to implement it. The zero-copy functionality requires a callback function that frees the buffer once the data has been sent. The current version allocates a new SArray and deletes it in the callback, which incurs no copy, only a reference-count increment. For BinStream, we could try to take the buffer out and let zmq delete the underlying buffer. But if we provide a function to take the buffer out of BinStream (in which case we of course cannot use std::vector as the underlying storage), then we cannot make BinStream RAII. It seems that the only way to accomplish this without abandoning RAII is reference counting, as in SArray. How about we abandon RAII?
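As a rough illustration of Option 3, the sketch below describes a raw buffer by a pointer, a size, and a free callback with the same shape as ZeroMQ's `zmq_free_fn` (`void (void* data, void* hint)`), the callback type accepted by `zmq_msg_init_data`. Everything else is an assumption made for the example: `RawBuffer`, `ToRawBuffer`, `OwningBinStream`, and `Release` are hypothetical names, and ZeroMQ itself is not linked, so the callback is invoked by hand. The first adapter mirrors the current approach (heap-allocate an extra reference and delete it in the callback); the second shows a stream that gives up ownership of its buffer and leaves the freeing to the callback.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Same shape as ZeroMQ's zmq_free_fn: invoked when the transport is done with the data.
using FreeFn = void (*)(void* data, void* hint);

// Hypothetical unit the mailbox would send. It is deliberately not RAII:
// the memory is released only when free_fn is called.
struct RawBuffer {
  void* data;
  std::size_t size;
  FreeFn free_fn;
  void* hint;
};

// Adapter 1: reference-counted storage (stand-in for SArray).
using SharedBytes = std::shared_ptr<std::vector<char>>;

// Drops the extra reference that kept the bytes alive during the send.
void FreeSharedBytes(void* /*data*/, void* hint) {
  delete static_cast<SharedBytes*>(hint);
}

RawBuffer ToRawBuffer(const SharedBytes& bytes) {
  assert(bytes != nullptr);
  // Heap-allocate one more reference; no payload bytes are copied.
  SharedBytes* keep_alive = new SharedBytes(bytes);
  return RawBuffer{bytes->data(), bytes->size(), FreeSharedBytes, keep_alive};
}

// Adapter 2: a stream that owns a plain char array until Release() is called.
class OwningBinStream {
 public:
  explicit OwningBinStream(std::size_t capacity)
      : buf_(new char[capacity]), capacity_(capacity) {}

  void Write(const void* src, std::size_t n) {
    assert(buf_ != nullptr && size_ + n <= capacity_);
    std::memcpy(buf_.get() + size_, src, n);
    size_ += n;
  }

  // Transfer ownership out of the stream; the callback must free the array.
  RawBuffer Release() {
    std::size_t n = size_;
    size_ = 0;
    capacity_ = 0;
    char* p = buf_.release();
    return RawBuffer{p, n, FreeCharArray, nullptr};
  }

  bool Empty() const { return buf_ == nullptr; }

 private:
  static void FreeCharArray(void* data, void* /*hint*/) {
    delete[] static_cast<char*>(data);
  }

  std::unique_ptr<char[]> buf_;
  std::size_t capacity_;
  std::size_t size_ = 0;
};
```

With a real mailbox, the four fields of a `RawBuffer` would be passed to `zmq_msg_init_data`, and ZeroMQ would invoke `free_fn` once the message is no longer needed. In this sketch the stream cleans up after itself only until `Release()` is called; after that, a buffer whose callback never runs is leaked, which is the RAII trade-off described above.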

For now, Option 2 seems the best to me.