uber/RemoteShuffleService

Does RSS support multiple StreamServers on the same node?

Closed this issue · 4 comments

RSS does not support multiple disk directories, so can I run multiple StreamServers on one node, each configured with its own disk directory? Or is my only option to use LVM to map multiple disks into one directory and point a single StreamServer at it?

One more question:
I ran the same Spark SQL job with and without RSS. Both applications have the same input.
Why does the Stage 10 Shuffle Read data size differ so much between the two runs?

image

cpd85 commented

The shuffle size difference is likely due to compression. As far as I can tell, RSS doesn't offer the same compression that Spark ESS does. I think you could run multiple stream servers, but I've got a diff in progress for multiple directories. It's not fault tolerant at the moment, but let me know if it interests you.

RSS does support multiple StreamServers on the same node, as long as they use different ports and different data directories.

Thanks for the reply. It reminds me that Zeus uses efficient encoding/decoding.
For now I'll try running multiple stream servers first. I'll contact you if I need the multi-directory diff.

Thanks! Some exceptions occurred earlier, possibly because two RSS instances pointed to the same directory.
Both now work after starting each RSS instance with its own port and directory.
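
For anyone hitting the same exceptions: the two constraints from this thread (a distinct port and a distinct data directory per StreamServer) can be checked before launching anything. The following is only a minimal sketch, not part of RSS. `ServerInstance` and `find_conflicts` are hypothetical names, and the ports and paths in the usage example are made-up illustrative values.

```python
from collections import Counter
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class ServerInstance:
    """One planned StreamServer process (hypothetical helper type, not an RSS class)."""
    port: int
    data_dir: str


def find_conflicts(instances):
    """Return messages for every port or data directory used by more than one instance."""
    problems = []
    port_counts = Counter(inst.port for inst in instances)
    # Normalize paths so "/data1/rss/" and "/data1/rss" count as the same directory.
    dir_counts = Counter(str(PurePosixPath(inst.data_dir)) for inst in instances)
    for port, count in sorted(port_counts.items()):
        if count > 1:
            problems.append(f"port {port} is used by {count} instances")
    for path, count in sorted(dir_counts.items()):
        if count > 1:
            problems.append(f"directory {path} is used by {count} instances")
    return problems


if __name__ == "__main__":
    # Illustrative plan: one StreamServer per disk, each with its own port.
    plan = [
        ServerInstance(port=12222, data_dir="/data1/rss"),
        ServerInstance(port=12223, data_dir="/data2/rss"),
    ]
    for message in find_conflicts(plan):
        print("conflict:", message)
```

An empty result means the plan satisfies both constraints. The check only compares the values you give it; it does not detect two different paths that resolve to the same disk through symlinks or mounts.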