Lazy Parallelization
hyunwoongko opened this issue · 0 comments
Describe a TODO feature
- Lazy parallelization: the model is only parallelized when oslo.ready is called.
- This is needed to combine Pipeline Parallelism with Tensor Parallelism, because tensor parallelization must be performed before pipeline parallelization.
Brief design
model = ...
model = PipelineParallel(model)  # only adds a _PipelineParallel wrapper to the model.oslo_wrappers dictionary; nothing is parallelized yet
model = TensorParallel(model)  # same as above
oslo.ready(model)  # the actual parallelization happens here, in the order tensor -> pipeline
class _TensorParallelism:
    def __init__(self):
        self.oslo_parallel_priority = 1  # higher value, so it is applied first

class _PipelineParallelism:
    def __init__(self):
        self.oslo_parallel_priority = 0  # lower value, so it is applied after tensor parallelism
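Then we can sort the registered parallel wrappers by this attribute in oslo.ready, so that the wrapper with the higher priority (tensor) is applied before the one with the lower priority (pipeline).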
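A minimal, self-contained sketch of the idea, just to make the flow concrete. It does not use the real OSLO code: `Model`, `_BaseParallelWrapper`, `_register`, `parallelize`, and the `applied` list are hypothetical stand-ins, and `ready` here is a plain function standing in for `oslo.ready`. It also assumes that a higher `oslo_parallel_priority` means "apply earlier".

```python
class Model:
    """Hypothetical stand-in for a real model; it only records what was applied."""

    def __init__(self):
        self.oslo_wrappers = {}  # wrapper class -> wrapper instance (registered, not applied)
        self.applied = []        # order in which parallelization actually ran


class _BaseParallelWrapper:
    """Hypothetical base class holding the model and a default priority."""

    def __init__(self, model):
        self.model = model
        self.oslo_parallel_priority = 0

    def parallelize(self):
        # A real wrapper would shard or partition the model here.
        self.model.applied.append(type(self).__name__)


class _TensorParallelism(_BaseParallelWrapper):
    def __init__(self, model):
        super().__init__(model)
        self.oslo_parallel_priority = 1  # assumed: higher value is applied first


class _PipelineParallelism(_BaseParallelWrapper):
    def __init__(self, model):
        super().__init__(model)
        self.oslo_parallel_priority = 0


def _register(model, wrapper_cls):
    # Lazy step: remember the wrapper, but do not parallelize yet.
    model.oslo_wrappers[wrapper_cls] = wrapper_cls(model)
    return model


def TensorParallel(model):
    return _register(model, _TensorParallelism)


def PipelineParallel(model):
    return _register(model, _PipelineParallelism)


def ready(model):
    # Sort by priority (descending) and only now run the parallelization.
    wrappers = sorted(
        model.oslo_wrappers.values(),
        key=lambda w: w.oslo_parallel_priority,
        reverse=True,
    )
    for wrapper in wrappers:
        wrapper.parallelize()
    return model


model = Model()
model = PipelineParallel(model)  # registered only
model = TensorParallel(model)    # registered only
assert model.applied == []       # nothing has run before ready()
ready(model)
print(model.applied)  # ['_TensorParallelism', '_PipelineParallelism']
```

With this shape, the order in which the user wraps the model no longer matters; only the priority decides the order at `ready` time.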
What do you think about this? @ohwi @bzantium @jason9693