Cluster scaling up due to max volume count
dbpolito opened this issue · 3 comments
My cluster scaled up due to the volume mount limit:
Warning FailedScheduling <unknown> default-scheduler 0/2 nodes are available: 2 node(s) exceed max volume count.
Warning FailedScheduling <unknown> default-scheduler 0/2 nodes are available: 2 node(s) exceed max volume count.
Warning FailedScheduling <unknown> default-scheduler 0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 node(s) exceed max volume count.
Normal Scheduled <unknown> default-scheduler Successfully assigned <container> to <pool-instance>
I took a look at https://www.digitalocean.com/docs/volumes/#limits, but the per-node limit is not clear to me from that page.
Everything worked: the volumes were created and my deploy went fine, but I didn't expect my cluster to scale up due to the number of volumes mounted.
Is there something I can do or configure to avoid that?
@dbpolito to make sure I understand your issue correctly: are you saying that you found your cluster scaling up due to the supposed volumes-per-node limitation but you think it shouldn't have done so, or do you think the upscale was justified to allow scheduling but you'd rather see the cluster not autoscale (even if that means a to-be-scheduled pod stays stuck)?
I'll try to answer generally for now: we currently do not expose any knobs for tweaking the autoscaler (and to be honest, I don't know off the top of my head whether settings exist for volume-related decision making). Each node can mount a maximum of 7 volumes; if your workload is not being distributed evenly enough to take full advantage of your total volume capacity, you may want to set node / pod (anti-)affinity rules to steer your pods onto the right nodes (see the sketch below).
Happy to learn more about your specific use case so that we can discuss options.
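For illustration, a minimal sketch of what such a pod anti-affinity rule could look like, assuming a Deployment whose replicas each mount their own volume; the Deployment name, labels, and image are placeholders:

```yaml
# Hypothetical Deployment spreading replicas (and their attached volumes)
# across nodes via preferred pod anti-affinity. Names and labels are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: volume-heavy-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: volume-heavy-app
  template:
    metadata:
      labels:
        app: volume-heavy-app
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: volume-heavy-app
                # Prefer distinct nodes so no single node accumulates all
                # of the volume attachments.
                topologyKey: kubernetes.io/hostname
      containers:
        - name: app
          image: nginx  # placeholder image
```

With a preferred (rather than required) rule, the scheduler can still co-locate pods when it has no other choice, so scheduling doesn't get stuck on a small pool.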
Each node can mount a maximum of 7 volumes
So this is a limitation on DO's side for attaching volumes to Droplets.
I was trying to understand the limit; while researching, I found the volume limit is defined by the CSI driver, so I was wondering if it was somehow configurable.
This is indeed a limitation I will need to work around in my use case, as I expect to have many volumes...
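From what I can tell, the per-node limit the driver reports is surfaced in each node's CSINode object (kubectl get csinode <node-name> -o yaml), roughly like the sketch below; the node name and nodeID are placeholders, and on older clusters the apiVersion may be storage.k8s.io/v1beta1:

```yaml
# Approximate shape of a CSINode object on a worker node; the
# "allocatable.count" field is the attach limit the driver advertises.
# Node name and nodeID are placeholders.
apiVersion: storage.k8s.io/v1
kind: CSINode
metadata:
  name: pool-example-node
spec:
  drivers:
    - name: dobs.csi.digitalocean.com   # DO Block Storage CSI driver
      nodeID: "123456789"
      allocatable:
        count: 7   # maximum volumes the driver will attach on this node
```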
Given this volume limit plus the ReadWriteMany limitation, I might go with something like https://rook.io/
Thank you.
@dbpolito indeed, this is a limitation enforced by the DO storage backend. Rook.io sounds like an interesting alternative if you cannot or do not want to work around the limitation through scheduling.
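To sketch the ReadWriteMany side of that: a shared-filesystem provisioner such as one deployed by Rook can satisfy RWX claims, which DO block storage volumes cannot. A hypothetical claim against such a setup might look like the following; the StorageClass name rook-cephfs is purely an assumption about how a Rook installation could be configured:

```yaml
# Hypothetical PVC requesting shared (RWX) storage from a Rook-provisioned
# StorageClass. The class name "rook-cephfs" is an assumption.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: rook-cephfs
  resources:
    requests:
      storage: 10Gi
```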