Kustomize manifests and supporting scripts for running Trino (formerly PrestoSQL) and a Hive 3 Metastore on Kubernetes, using S3 object storage and MySQL.
Author: Timothy C. Arland
Email: tcarland@gmail.com
- Kubernetes >= 1.23 - Suggested version: 1.25+
- Kustomize >= v5 - Suggested version: v5.4.1
The setup script reads a number of environment variables to generate the necessary configuration. The S3 credentials are the primary required variables; the others fall back to default values if not provided. The following table lists the variables used by the setup script.
Environment Variable | Description | Default Setting |
---|---|---|
S3_ENDPOINT | The S3 endpoint URL | http(s)://minio.minio.svc |
S3_ACCESS_KEY | The S3 access key | |
S3_SECRET_KEY | The S3 secret key | |
TRINO_NAMESPACE | Namespace for deploying the components | trino |
MYSQLD_USER | Name of the MySQL database user for Hive | root |
MYSQLD_ROOT_PASSWORD | Password for the MySQL root user | randomized-password |
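Before running the setup script, it can help to confirm that the variables without defaults are actually set. The following is a minimal sketch and not part of the project: `check_required_vars` is a hypothetical helper, and treating only `S3_ACCESS_KEY` and `S3_SECRET_KEY` as mandatory is an assumption drawn from the empty default column in the table above.

```shell
# Hypothetical pre-flight helper (not part of this project).
# Assumption: only the variables with no default in the table are mandatory.
check_required_vars() {
    missing=0
    for name in S3_ACCESS_KEY S3_SECRET_KEY; do
        # Indirect lookup of the variable named by $name (POSIX sh compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

One possible use is to chain it ahead of the configuration check, for example `check_required_vars && ./bin/trino-k8s-setup.sh -e`.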
The metastore image is based on Hive version 3.1.3 and can be
built using the provided hive3/Containerfile.
$ cd containerfiles/hive3 && docker build -f Containerfile -t project/hive:3.1.3 .
Ensure all of the variables above are defined and exported to the environment. Passing an argument to the script, as in the following command, prints the resulting configuration without applying it and can be used to verify the settings.
./bin/trino-k8s-setup.sh -e
Run the setup script to configure the various config templates.
./bin/trino-k8s-setup.sh
Copy the resulting environment settings by hand, or load all of the variables into the current shell.
eval $(./bin/trino-k8s-setup.sh)
Deploy the MySQL Server via Kustomize.
kustomize build mysql-server/ | kubectl apply -f -
The same MySQL image can be used as a client.
docker run -it --rm mysql mysql -hsome.mysql.host -usome-mysql-user -p
We deploy the metastore in the same manner, using Kustomize.
kustomize build hive-metastore/ | kubectl apply -f -
Note that this includes the init job hive-init-schema.yaml, which was generated by the setup script. The job runs the Hive schematool to provision the metastore database.
Verify that the parameter substitution in trino/base/configmap.yaml, as generated by the trino-k8s-setup.sh script, is correct.
Load the Trino manifests.
kustomize build trino/ | kubectl apply -f -
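The deployment steps so far apply three Kustomize directories in a fixed order. As a rough sketch, assuming the commands are run from the repository root, they can be wrapped in a loop. `deploy_all` and the `DRY_RUN` switch are hypothetical names, and the sketch does not wait for MySQL or the schema job to become ready between steps.

```shell
# Hypothetical wrapper around the three apply steps described above.
# By default it only prints the commands; set DRY_RUN=0 to execute them.
deploy_all() {
    for dir in mysql-server hive-metastore trino; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "kustomize build $dir/ | kubectl apply -f -"
        else
            # Stop at the first component that fails to apply.
            kustomize build "$dir/" | kubectl apply -f - || return 1
        fi
    done
}
```

Printing the commands first gives a chance to review the order before anything is applied to the cluster.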
Enable external access to the coordinator via LoadBalancer, if necessary (the trino-coordinator-service may already be set to type: LoadBalancer). This requires MetalLB or other ELB support in K8s.
kubectl patch service trino-coordinator-service -n trino -p '{"spec": {"type": "LoadBalancer"}}'
Get the external IP of the Trino coordinator:
kubectl get svc trino-coordinator-service -n trino --no-headers | awk '{ print $4 }'
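The awk command above reads the fourth column (EXTERNAL-IP) of the service listing. Until the load balancer assigns an address, kubectl prints `<pending>` in that column. The sketch below wraps the same extraction and fails in that case. `external_ip` is a hypothetical helper, and the sample line contains made-up values so that the snippet runs without a cluster.

```shell
# Hypothetical helper: read `kubectl get svc --no-headers` output on stdin and
# print the EXTERNAL-IP column, failing if no address has been assigned yet.
external_ip() {
    awk '$4 == "<pending>" || $4 == "<none>" { exit 1 } { print $4 }'
}

# Made-up sample line in the column layout kubectl uses:
# NAME  TYPE  CLUSTER-IP  EXTERNAL-IP  PORT(S)  AGE
sample='trino-coordinator-service   LoadBalancer   10.96.12.34   172.17.0.210   8080:31234/TCP   5d'
printf '%s\n' "$sample" | external_ip
```

Against a live cluster, the helper would be fed from the same command as above: `kubectl get svc trino-coordinator-service -n trino --no-headers | external_ip`.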
The secrets needed for the components are written to **/base/secrets.env for kustomize to consume on build and should be cleaned up after deployment by running make clean.
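Because these files hold the secrets used at build time, it is worth confirming that none remain after cleanup. This is a small sketch, assuming the files follow the `**/base/secrets.env` layout described above. `leftover_secrets` and `secrets_clean` are hypothetical helpers and do not replace `make clean`.

```shell
# Hypothetical helper: list secrets.env files that still exist under any
# base/ directory below the given root (default: current directory).
leftover_secrets() {
    find "${1:-.}" -type f -path '*/base/secrets.env'
}

# Returns 0 when nothing is left; otherwise prints the leftover paths
# and returns 1.
secrets_clean() {
    found=$(leftover_secrets "${1:-.}")
    [ -z "$found" ] || { printf '%s\n' "$found"; return 1; }
}
```

Running `secrets_clean` from the repository root after `make clean` should print nothing if the cleanup removed every generated file.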
The Trino CLI is available as a separate download. Once installed, connect to the coordinator using its external IP, for example:
trino-cli --server 172.17.0.210:8080 --catalog hive --schema default
The JDBC Driver can be acquired from the Maven Central Repository. The current deployment has been tested with trino-451.
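For JDBC clients, the Trino driver accepts URLs of the form `jdbc:trino://host:port/catalog/schema`. The sketch below assembles one from the same coordinator address, catalog, and schema used in the CLI example. The variable names and `trino_jdbc_url` are illustrative only and are not defined by this project.

```shell
# Illustrative values taken from the CLI example above; adjust for your cluster.
TRINO_HOST="172.17.0.210"
TRINO_PORT="8080"

# Hypothetical helper: build a Trino JDBC URL for a catalog ($1) and schema ($2).
trino_jdbc_url() {
    printf 'jdbc:trino://%s:%s/%s/%s\n' "$TRINO_HOST" "$TRINO_PORT" "$1" "$2"
}

trino_jdbc_url hive default
```

The resulting string can be pasted into any JDBC-capable SQL client that has the Trino driver on its classpath.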