google/caliban

AWS Backend, similar to `caliban cloud`'s AI Platform support

sritchie opened this issue · 4 comments

There's no reason we can't build an AWS backend, to make it easier for users to work with whatever tooling they currently have.

I'm planning on spending some time organizing the codebase to make this easier. The big missing pieces that I'd love help with from the community are:

  • What AWS service is analogous to AI Platform? We want to submit a request with a Docker image ID, command line arguments, and hardware specs (GPUs, machine type, etc.), and have some AWS service run the job and then stop. Bonus if we can attach labels, etc.
  • What authentication method is similar to Google's Service Account Key?

We have two auth requirements.

  1. We need to authenticate with AWS to submit the job from the submitting machine.
  2. We'd like to bake some credentials into the container so that users can authenticate with AWS's python library or command line interface and, say, transfer data to and from buckets, or talk to some AWS database.

For (2), Amazon might mount credentials into the container, or we might have to grab them and bake them in, like we do with service account keys.

If someone could comment here with a rough guide on how to do either of the above two manually (the more detailed the better!), that would be a massive help in automating this.

You might want to take a look at what we did in cloudknot. For our purposes, the target AWS service is AWS Batch.
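To make the AWS Batch suggestion concrete, here is a minimal sketch of the two request payloads a backend would probably need: one to register a job definition (Docker image plus hardware specs) and one to submit a job (command line arguments plus labels). The helper names `build_job_definition` and `build_submit_request` are hypothetical and come from neither caliban nor cloudknot; the dictionary keys follow the parameters of boto3's Batch client calls `register_job_definition` and `submit_job`. As far as I understand, the machine type is not set per job in Batch but on the compute environment behind the job queue, so it does not appear here.

```python
from typing import Dict, List, Optional


def build_job_definition(name: str, image: str, vcpus: int,
                         memory_mib: int, gpus: int = 0) -> Dict:
    """Build kwargs for boto3's batch.register_job_definition (hypothetical helper)."""
    # Batch expects resource values as strings.
    resources = [
        {"type": "VCPU", "value": str(vcpus)},
        {"type": "MEMORY", "value": str(memory_mib)},
    ]
    if gpus > 0:
        resources.append({"type": "GPU", "value": str(gpus)})
    return {
        "jobDefinitionName": name,
        "type": "container",
        "containerProperties": {
            "image": image,
            "resourceRequirements": resources,
        },
    }


def build_submit_request(job_name: str, job_queue: str, job_definition: str,
                         command: List[str],
                         labels: Optional[Dict[str, str]] = None) -> Dict:
    """Build kwargs for boto3's batch.submit_job (hypothetical helper)."""
    request = {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        # Overrides the container command with the user's arguments.
        "containerOverrides": {"command": list(command)},
    }
    if labels:
        # Batch tags are the closest match to AI Platform labels.
        request["tags"] = dict(labels)
    return request


# Example values below are placeholders, not real resources.
definition = build_job_definition(
    "caliban-job", "example.com/my-image:latest", vcpus=4, memory_mib=16384, gpus=1)
request = build_submit_request(
    "trial-1", "my-queue", "caliban-job",
    ["python", "train.py", "--lr", "0.01"], labels={"experiment": "demo"})
```

With boto3 installed and credentials configured, these would be passed along as `batch = boto3.client("batch")`, then `batch.register_job_definition(**definition)` and `batch.submit_job(**request)`. The job queue and its compute environment are assumed to exist already.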

For your auth requirements, you can pass keys as arguments to functions. See for example here
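As a rough illustration of the "pass keys as arguments" idea, the sketch below turns an access key pair into the environment variables that the AWS CLI and boto3 read by default (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_DEFAULT_REGION`), and formats them either as `docker run -e` flags or as the `environment` list accepted by Batch's `containerOverrides`. The function names are hypothetical, and the key values are placeholders. Whether to inject credentials at run time like this or bake them into the image is a design choice this sketch does not settle.

```python
from typing import Dict, List, Optional


def aws_credentials_env(access_key_id: str, secret_access_key: str,
                        region: Optional[str] = None) -> Dict[str, str]:
    """Map a key pair to the env vars read by the AWS CLI and boto3."""
    env = {
        "AWS_ACCESS_KEY_ID": access_key_id,
        "AWS_SECRET_ACCESS_KEY": secret_access_key,
    }
    if region:
        env["AWS_DEFAULT_REGION"] = region
    return env


def docker_env_flags(env: Dict[str, str]) -> List[str]:
    """Format env vars as `docker run -e KEY=VALUE` arguments."""
    flags: List[str] = []
    for key in sorted(env):
        flags += ["-e", f"{key}={env[key]}"]
    return flags


def batch_environment(env: Dict[str, str]) -> List[Dict[str, str]]:
    """Format env vars for containerOverrides["environment"] in Batch."""
    return [{"name": key, "value": env[key]} for key in sorted(env)]


# Placeholder credentials for illustration only.
env = aws_credentials_env("AKIAEXAMPLE", "not-a-real-secret", region="us-west-2")
flags = docker_env_flags(env)
batch_env = batch_environment(env)
```

For requirement (1), the same key pair can be passed directly on the submitting machine via `boto3.Session(aws_access_key_id=..., aws_secret_access_key=..., region_name=...)`.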

Thanks, @arokem, I absolutely will. This is really helpful!

> what AWS service is analogous to AI Platform? We want to submit some request with a Docker image ID, command line arguments, and hardware specs (GPUs, machine type etc), and have some AWS service run the job and then stop. Bonus if we can attach labels, etc

I think that AWS SageMaker is the equivalent of Google's AI Platform on the AWS side.

@sritchie
I'm looking to get involved with this project and I wanted to know if this request was still open and what you needed.