bytedance/monoio

Sharding

likeabbas opened this issue · 6 comments

Is your feature request related to a problem? Please describe.
In many cases it would be good to shard a request to a particular based on some data within a server request. This would allow me to ensure any state related to that merchant can be processed on the same core/thread.

Here's a few scenarios where this would make sense

  • a multi threaded implementation of onetimesecret. The first request that creates a secret would be shared to a core with a unique identifier. We would save the secret into a hash map that only exists on that thread/core. That unique identifier would appended to the response url for the user to share the link. When that link is used, it hits the server with the unique identifier to shard the request to the proper core, to allow us to get the secret from the hash map and return the data to the user
  • At one of my prior companies it was really common to have a multi threaded backend service with a 2 layer cache (one layer local and one layer distributed using reds). Often we would want to cache some data for a particular merchant due to high latencies from dependencies. To get our P90 latency down to single milliseconds we had a local cache combined with a distributed cache. The local cache would always need to be multi thread safe. However in monoio if we could shard, then we would not need to use a thread safe data structure which would increase our performance even more.

Describe the solution you'd like
A similar implementation to sharing like glommio has https://github.com/DataDog/glommio/blob/master/examples/sharding.rs

Describe alternatives you've considered
I haven't considered any

Additional context
None

I think it is a really useful thing, and it doesn't need to be binded with runtime itself. It can be used by users of all types of runtimes.
If you are interested, you can write one based on tokio channel or async_channel. Maybe we can build it together?

I would love to work on it with you. Although I don't know what direction to take when you say you don't think sharding needs to be bound to the runtime itself. Could you expand on your thoughts on this, and maybe give some guidance on what you think this would look like?

I agree tokio_channel would probably make the most sense for the implementation. But I don't know what that would look like without the runtime having direct support for it.

Some of my thoughts:

  1. It can be impl as multiple spsc channel.
  2. Each receiver may specify its capacity.
  3. The component allows user set lb strategy, like some consistent hashing parameters including hash method, ring size and virtual node count.
  4. Maybe receivers can join and leave at any time.
  5. If dest receiver's buffer is full, consider what strategy will be used.

I look forward this too, it would be really helpful.

I did not look into it yet, but would it be possible if instead of sharding with spsc, we have something like:

-> A TcpListener which accept a connection on one thread
-> Accept it, wait for some data and peek over some data
-> Find it's not the for this thread
-> "Send" the fd to the proper thread which will be able to handle the whole connection without needing another accept

EDIT: If someone is looking, you should be able to build a lb strategy & a full mesh by using https://github.com/Miaxos/sharded-thread

This is what I had in mind, but it seems we are missing some implementation on monoio to be able to do something like that:

https://github.com/Miaxos/sharded-thread/blob/eb2d34bbe9ed81a906921729c9e3359ac7b347e4/tests/monoio.rs#L88-L187

It would need to pub some of the method of SharedFd and add a way to destroy it without running the close code which close the TcpStream. If you are ok with it, I'm going to have a try.

(ugly, but well, this is the idea)

@Miaxos that looks awesome!