All cluster resources being claimed by actors?
Chuukwudi opened this issue · 0 comments
In the notebook, calling
# Embed chunks
embedding_model_name = "thenlper/gte-base"
embedded_chunks = chunks_ds.map_batches(
    EmbedChunks,
    fn_constructor_kwargs={"model_name": embedding_model_name},
    batch_size=100,
    num_gpus=1,
    compute=ActorPoolStrategy(size=2))
# Sample
sample = embedded_chunks.take(1)
results in:
======== Autoscaler status: 2023-09-19 10:15:05.945390 ========
Node status
---------------------------------------------------------------
Healthy:
1 node_39e554d28e4f63b9d3360ffdf267014a901a29d1601c039967717f26
Pending:
(no pending nodes)
Recent failures:
(no failures)
Resources
---------------------------------------------------------------
Usage:
1.0/32.0 CPU
1.0/1.0 GPU
0B/10.09GiB memory
11.70MiB/5.05GiB object_store_memory
Demands:
{'CPU': 1.0, 'GPU': 1.0}: 1+ pending tasks/actors
(autoscaler +2m17s) Warning: The following resource request cannot be scheduled right now: {'CPU': 1.0, 'GPU': 1.0}. This is likely due to all cluster resources being claimed by actors. Consider creating fewer actors or adding more nodes to this Ray cluster.
(autoscaler +2m52s) Warning: The following resource request cannot be scheduled right now: {'CPU': 1.0, 'GPU': 1.0}. This is likely due to all cluster resources being claimed by actors. Consider creating fewer actors or adding more nodes to this Ray cluster.
Is there a solution?
I have tried changing ActorPoolStrategy to size=1 and reducing batch_size, but the same warning still appears.
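
For reference, the numbers in the status output can be checked with a small back-of-the-envelope sketch. This is only an illustration of the arithmetic, not Ray's scheduler: the helper `pending_actors` is a hypothetical name, and it assumes each actor in the pool asks for `num_gpus` whole GPUs, as the `map_batches` call suggests. The node reports 1 GPU in total, while a pool of size 2 with `num_gpus=1` asks for 2.

```python
def pending_actors(pool_size: int, gpus_per_actor: float, free_gpus: float) -> int:
    """Return how many actors in the pool cannot be placed on the free GPUs.

    Hypothetical helper for illustration only; it does not call Ray.
    """
    if gpus_per_actor <= 0:
        # Actors that request no GPU are never blocked on GPUs.
        return 0
    # Number of actors that fit on the GPUs that are currently free.
    placeable = int(free_gpus // gpus_per_actor)
    return max(0, pool_size - placeable)


# Values read from the status output: 1 GPU on the node, pool size 2, num_gpus=1.
print(pending_actors(pool_size=2, gpus_per_actor=1, free_gpus=1))  # prints 1
```

One actor left pending matches the single `{'CPU': 1.0, 'GPU': 1.0}` demand in the log. If the warning persists with size=1, that would be consistent with the only GPU still being held by an actor from an earlier run, although the log alone does not confirm this. Comparing `ray.cluster_resources()` with `ray.available_resources()` may help to check what is actually free.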