pytorch/data

Mux with MPRS causes operations after sharding_round_robin_dispatcher to run on the same worker

JohnHBrock opened this issue · 3 comments

📚 The doc issue

This doesn't seem to be mentioned in the docs, but if you have two datapipes that use sharding_round_robin_dispatcher and then mux them together:

  1. Any steps between sharding_round_robin_dispatcher and mux will take place on the same worker process.
  2. Only the steps after the mux will take place on separate workers.

For example, in the graph below, the Mapper nodes between the ShardingRoundRobinDispatcher nodes and the Multiplexer run on the same worker process. The Mapper node after the Multiplexer runs across multiple processes, since they're fed data in a round-robin fashion.
[Datapipe graph: two ShardingRoundRobinDispatcher -> Mapper branches feeding into a Multiplexer, followed by a final Mapper]
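For reference, here is a minimal sketch (not from the original report) of a pipeline with this shape, assuming the torchdata DataLoader2 / MultiProcessingReadingService API; the tag_before_mux / tag_after_mux helpers are hypothetical and exist only to record which process (PID) runs each step:

```python
import os

from torch.utils.data.datapipes.iter.sharding import SHARDING_PRIORITIES
from torchdata.dataloader2 import DataLoader2, MultiProcessingReadingService
from torchdata.datapipes.iter import IterableWrapper


def tag_before_mux(x):
    # Record which process ran the step between the dispatcher and the mux.
    return (x, os.getpid())


def tag_after_mux(item):
    # Record which process ran the step after the mux.
    x, before_pid = item
    return (x, before_pid, os.getpid())


if __name__ == "__main__":
    # Two branches, each dispatched round-robin, with a Mapper before the mux.
    dp1 = (
        IterableWrapper(range(10))
        .sharding_round_robin_dispatch(SHARDING_PRIORITIES.MULTIPROCESSING)
        .map(tag_before_mux)
    )
    dp2 = (
        IterableWrapper(range(10, 20))
        .sharding_round_robin_dispatch(SHARDING_PRIORITIES.MULTIPROCESSING)
        .map(tag_before_mux)
    )
    # Multiplex the branches, then map again after the mux.
    dp = dp1.mux(dp2).map(tag_after_mux)

    rs = MultiProcessingReadingService(num_workers=2)
    dl = DataLoader2(dp, reading_service=rs)
    for value, before_pid, after_pid in dl:
        print(value, before_pid, after_pid)
    dl.shutdown()
```

If the behavior described above holds, the PID recorded before the mux should be the same for every item, while the PID recorded after the mux should vary across the worker processes.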

My incorrect expectation was that the dispatching process would distribute data to worker processes immediately after sharding_round_robin_dispatch as usual, and that everything after mux would then take place on one or more worker processes.

Suggest a potential alternative/fix

The documentation for Multiplexer, ShardingRoundRobinDispatcher, and/or MultiProcessingReadingService should be updated to clarify what the intended behavior is here.

ejguan commented

I am sorry, but I think we currently don't support two ShardingRoundRobinDispatcher instances.

I think that's worth putting in the docs -- I just looked and couldn't find a mention of that limitation.

> I am sorry, but I think we currently don't support two ShardingRoundRobinDispatcher instances.

This should potentially be taken into consideration as a use case with regard to #1174.