Set up and use pandas for out-of-core compute
To do meaningful compute, we will need a remote cluster. There are a number of ways to get one, but following current trends we will aim to deploy using Kubernetes, and for the sake of my sanity, we will throw that at Azure. Note that everything we do should also run locally (but then why are we getting off of Pandas?), and more importantly should work on any modern cloud compute platform (wherever Kubernetes is welcome).
Let's get AKS (Azure Kubernetes Service) working ...
- Get a Docker Hub account!
- Install the Azure CLI
- JDK 8, get it ready
- Scala Build Tool (SBT)
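
Before going further, it can help to confirm that the command-line pieces of the list above are actually installed. The following is a minimal sketch, assuming the tools expose the executables `az` (Azure CLI), `java` (JDK) and `sbt` (SBT) on your `PATH`; the helper name `missing_tools` is made up for this example. It only checks that the executables exist, so it does not verify that Java is version 8, and it cannot check for a Docker Hub account.

```python
import shutil

# Assumed executable names for the prerequisites listed above.
REQUIRED_TOOLS = {
    "az": "Azure CLI",
    "java": "JDK 8",
    "sbt": "Scala Build Tool (SBT)",
}


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the labels of tools whose executable is not found on PATH."""
    return [label for exe, label in tools.items() if shutil.which(exe) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Still to install: " + ", ".join(missing))
    else:
        print("All command-line prerequisites found on PATH.")
```

If anything is reported as missing, install it before moving on; running `java -version` by hand is a reasonable extra check that the JDK on your `PATH` is the one you expect.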