pangeo-data/jupyter-earth

Evaluate ability to optimize user environment for use with Ray

Opened this issue · 6 comments

@espg asked me on Slack:

Somewhat related to block storage: is it possible for us to set up or increase /dev/shm on the instances? I think this can be done in Docker, according to this GitHub issue: aws/amazon-ecs-agent#787

Right now /dev/shm is set to 64MB. Looking at Ray, it uses redis for shared memory objects and assumes a RAM disk mounted at /dev/shm, where it puts read-only objects that can be shared across processes.

If /dev/shm is small, Ray will fall back to /tmp, which isn't all that bad if block storage is set up and the SSD is mounted at /tmp. But if /tmp is EBS-mounted, it will not work.

I think with the high-memory node we could probably allocate a decent amount to /dev/shm, and that might be a halfway solution for the types of things that block storage is needed for, if it's an easier setup.

Here's the warning btw from ray on this with the docker config suggested fix:

[image: screenshot of the Ray warning]

I need to look into this and evaluate whether it is sensible to try to resolve this in our k8s context and, if so, how to do it in a sensible way.

espg commented

Possibly relevant: AWS appears to support the shm flag vis-à-vis Docker containers:

https://aws.amazon.com/about-aws/whats-new/2018/03/amazon-ecs-adds-support-for-shm-size-and-tmpfs-parameters/

They say to use at least 30% of RAM, so for our massive high-memory node I think 300GB, although 350 is probably safer to pad a bit...
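The sizing rule above can be checked with a few lines of Python. This is a minimal sketch, not something from the thread: the function names, the integer `percent` parameter, and the 1000 GiB example node are all assumptions for illustration, and `shm_report` only works on Linux where `/dev/shm` is mounted.

```python
import os
import shutil

GIB = 1024 ** 3


def recommended_shm_bytes(total_ram_bytes: int, percent: int = 30) -> int:
    """Size of /dev/shm implied by an 'at least N% of RAM' rule of thumb."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in 1..100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return total_ram_bytes * percent // 100


def is_shm_sufficient(shm_bytes: int, total_ram_bytes: int, percent: int = 30) -> bool:
    """True if the shared-memory mount meets the rule-of-thumb target."""
    return shm_bytes >= recommended_shm_bytes(total_ram_bytes, percent)


def shm_report(shm_path: str = "/dev/shm", percent: int = 30) -> dict:
    """Compare the mounted size of shm_path with the target (Linux only)."""
    total_ram = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    shm_total = shutil.disk_usage(shm_path).total
    return {
        "shm_gib": shm_total / GIB,
        "target_gib": recommended_shm_bytes(total_ram, percent) / GIB,
        "sufficient": is_shm_sufficient(shm_total, total_ram, percent),
    }
```

Running `shm_report()` inside a user pod would show whether the current 64MB mount is anywhere near the target; for a hypothetical node with 1000 GiB of RAM, the 30% rule gives 300 GiB.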

espg commented

One more link for this: on the Kubernetes side, this seems to be possible but not well documented at the moment:

cri-o/cri-o#4174
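For reference, a common Kubernetes pattern for a larger /dev/shm (not confirmed anywhere in this thread as what will be used here) is a memory-backed `emptyDir` volume mounted at `/dev/shm`. The sketch below only builds the pod spec fragments as Python dicts; the volume name `dshm`, the helper name, and the `300Gi` default are assumptions. In a JupyterHub Helm chart these fragments would likely go under `singleuser.storage.extraVolumes` and `singleuser.storage.extraVolumeMounts`.

```python
import json


def shm_volume_config(size_limit: str = "300Gi", name: str = "dshm") -> dict:
    """Build pod spec fragments for a memory-backed volume mounted at /dev/shm."""
    # emptyDir with medium "Memory" is backed by tmpfs (RAM), not by node disk.
    volume = {"name": name, "emptyDir": {"medium": "Memory", "sizeLimit": size_limit}}
    # Mounting at /dev/shm replaces the container runtime's small default mount.
    mount = {"name": name, "mountPath": "/dev/shm"}
    return {"volumes": [volume], "volumeMounts": [mount]}


print(json.dumps(shm_volume_config(), indent=2))
```

Note that data written to a memory-backed `emptyDir` counts against the container's memory limit, so the limit needs to leave room for it.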

espg commented

Ray has always worked locally on the jupyter-hub instance, but if we're looking into it as a tractable replacement for dask, it would be helpful to try it as a proper cluster. @consideRatio can we look into a basic configuration to set this up for ray? There's some documentation on the helm setup here: https://docs.ray.io/en/latest/cluster/kubernetes.html

consideRatio commented

@espg thanks for the followup and sorry for being slow to respond on this. I've worked on dask-gateway recently and learned a lot more about things in general, which gives me a bit of an overview for navigating the distributed ecosystem of tools.

I've arrived at a rough classification of distributed setups, and of which ones can be sustainable to push development efforts towards.

  1. Connect to a fixed set of existing workers
    Example: the dask helm chart in https://github.com/dask/helm-chart
  2. Use software with powerful permissions to directly create workers on demand as k8s Pods
    Example: the dask-kubernetes project in https://github.com/dask/dask-kubernetes
  3. Use software with restricted permissions to indirectly create workers on demand via a custom k8s resource
    Example: the dask-operator project developed in dask/dask-kubernetes#392
  4. Use software with restricted permissions to even more indirectly create workers on demand via a "gateway" you authenticate against.
    Example: the dask-gateway project developed in https://github.com/dask/dask-gateway

For the Ray ecosystem, there is a "ray operator" mapping to level 3, and that may be sufficient. But the Ray operator is only one part: is there also a Ray client that can interact with it effectively?

An operator or controller in k8s defines a custom resource kind. When someone creates, edits, or deletes such a resource, the controller acts on the change, for example by creating 1 scheduler and X workers when a DaskCluster resource is created. But is there software that spares users from working with kubectl to create the resources that the operator/controller inspects and creates workers from? Unless that is the case, there is no point in installing an operator.
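To make the operator idea concrete, here is a toy reconcile step in plain Python. It is only a sketch of the pattern: `ClusterSpec`, `reconcile`, and the pod naming scheme are hypothetical, and a real controller would watch the Kubernetes API rather than take a list of pod names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClusterSpec:
    """Desired state, as a user would declare it in a custom resource."""
    name: str
    workers: int


def reconcile(spec: ClusterSpec, running_pods: List[str]) -> List[str]:
    """Return the actions needed to move the observed pods toward the spec."""
    actions = []
    scheduler = f"{spec.name}-scheduler"
    if scheduler not in running_pods:
        actions.append(f"create {scheduler}")

    prefix = f"{spec.name}-worker-"
    wanted = {f"{prefix}{i}" for i in range(spec.workers)}
    existing = {pod for pod in running_pods if pod.startswith(prefix)}

    # Create what is declared but missing; delete what exists but is not declared.
    actions += [f"create {pod}" for pod in sorted(wanted - existing)]
    actions += [f"delete {pod}" for pod in sorted(existing - wanted)]
    return actions
```

The user only ever edits the declared spec; the controller holds the permissions to create and delete pods. That split is why a client-side tool for writing the resource matters as much as the operator itself.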

I'll set out to learn whether there is a Ray client to work against a k8s cluster with a Ray operator running within it. If there is such a client to interact with an operator, I'll then also try to verify that it can be adjusted to help shut down the workers if the jupyter server that created them is shut down, or whether there is another mechanism that protects against starting workers and forgetting to stop them, since they could otherwise live on forever.

espg commented

...perhaps this is something that one of the Ray core devs can answer for us? @robertnishihara @ericl @richardliaw the context here is that we're looking into replacing dask with Ray as the cluster compute backend for Jupyter meets the Earth, as part of the 2i2c project. Any insight on @consideRatio's k8s scaling questions?

ericl commented

Hey @espg @consideRatio, indeed "level 3" is supported via the KubeRay project: https://github.com/ray-project/kuberay

> I'll set out to learn whether there is a Ray client to work against a k8s cluster with a Ray operator running within it.

We also have a Ray client available that I think could be used to implement "level 4": https://docs.ray.io/en/latest/cluster/ray-client.html

> that it can be adjusted to help shut down the workers if the jupyter server that created them is shut down,

There isn't anything "out of the box" that does this, but you could potentially track if there is an active connection from the client, or use an API like ray.state.jobs() to check if there are any live jobs on the cluster.
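A minimal sketch of how such a check could be combined into a shutdown decision follows. Everything in it is an assumption for illustration: the function name, the `is_dead` key on job records, and the idea of tracking a `last_client_seen` timestamp are hypothetical and not taken from the Ray API. The job list is passed in as plain data, so whatever call actually fetches it stays outside the sketch.

```python
import time
from typing import Callable, Iterable, Mapping, Optional


def should_shut_down(
    jobs: Iterable[Mapping],
    last_client_seen: float,
    idle_timeout_s: float,
    now: Optional[float] = None,
    # Hypothetical record shape: a job is live unless flagged "is_dead".
    is_live: Callable[[Mapping], bool] = lambda job: not job.get("is_dead", False),
) -> bool:
    """Decide whether an idle cluster can be torn down.

    Shut down only if no job is live AND no client has been seen for at
    least idle_timeout_s seconds.
    """
    now = time.time() if now is None else now
    if any(is_live(job) for job in jobs):
        return False
    return (now - last_client_seen) >= idle_timeout_s
```

A small loop running next to the head node could call this periodically and delete the cluster resource when it returns `True`, which would cover the case where the Jupyter server that started the workers has gone away.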

Hope this helps! If you have more questions about how to do this, the https://discuss.ray.io/c/ray-clusters/13 forum is fairly responsive.