[Data] Ray Data continues autoscaling even when pipeline is backpressured by iteration
bveeramani commented
What happened + What you expected to happen
I'm doing training with an autoscaling compute config. While the trainer iterates over the dataset, my cluster autoscales CPU nodes and eventually GPU nodes to process more data, even though my trainer doesn't need more data.
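One way to observe this (a minimal diagnostic sketch, not part of the original report; it assumes it is run alongside the reproduction script below, attached to the same cluster) is to poll the cluster size while the pipeline is being consumed slowly:

import time
import ray

ray.init(address="auto")  # attach to the existing autoscaling cluster

# If autoscaling ignores iteration backpressure, the alive-node count and total
# CPU count keep growing even though the consumer only takes a batch every few seconds.
for _ in range(60):
    alive_nodes = [n for n in ray.nodes() if n["Alive"]]
    print(f"alive nodes: {len(alive_nodes)}, cluster CPUs: {ray.cluster_resources().get('CPU', 0)}")
    time.sleep(10)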
Versions / Dependencies
Ray 2.21
Reproduction script
import ray
import numpy as np
import time

def generate_block(row):
    # Each output block is a single ~128 MiB row, so produced blocks pile up quickly.
    return {"data": np.zeros((128 * 1024 * 1024,), dtype=np.uint8)}

ds = ray.data.range(1000, override_num_blocks=1000).map(generate_block)

# Consume slowly to backpressure the pipeline: the trainer-side iterator only
# takes one batch every 5 seconds, yet the cluster keeps scaling up.
for block in ds.iter_batches(batch_size=None):
    time.sleep(5)
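A possible mitigation (a hedged sketch, assuming Ray Data's documented ExecutionOptions.resource_limits setting; the specific limits are illustrative and not values from the report) is to cap the resources the streaming executor may request, so there is no unmet demand for the autoscaler to act on:

import ray
import numpy as np
import time

def generate_block(row):
    return {"data": np.zeros((128 * 1024 * 1024,), dtype=np.uint8)}

# Cap what the Data streaming executor may use; with no pending resource demand,
# the autoscaler should have no reason to add nodes. Limits below are illustrative.
ctx = ray.data.DataContext.get_current()
ctx.execution_options.resource_limits.cpu = 8
ctx.execution_options.resource_limits.object_store_memory = 4 * 1024**3  # 4 GiB

ds = ray.data.range(1000, override_num_blocks=1000).map(generate_block)

for block in ds.iter_batches(batch_size=None):
    time.sleep(5)

Whether this fully prevents the scale-up described above depends on how the autoscaler interprets the capped demand; it is a workaround sketch, not a confirmed fix for this issue.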
Issue Severity
Medium: It is a significant difficulty but I can work around it.