i2gpu_gketests

Introduction

Tests were performed from several client locations (Cloud-to-Cloud, FNAL-to-Cloud, and MIT-to-Cloud) against a TRT server deployed on GKE.
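
A minimal client-side latency probe, as a sketch of how such a test can be driven: it assumes the server exposes the TensorRT Inference Server v1 HTTP API (readiness at /api/health/ready on port 8000) and uses the ProtoDUNE endpoint listed under Public endpoints below.

    # Round-trip latency probe against the TRT server's HTTP port.
    # Assumes the TensorRT Inference Server v1 API, which serves
    # /api/health/ready on port 8000; adjust the path for other versions.
    import time
    import requests

    ENDPOINT = "http://34.70.127.245:8000/api/health/ready"  # ProtoDUNE T4 LB

    def probe(n=10):
        latencies = []
        for _ in range(n):
            start = time.perf_counter()
            requests.get(ENDPOINT, timeout=5).raise_for_status()
            latencies.append(time.perf_counter() - start)
        ms = [t * 1e3 for t in latencies]
        print(f"min/avg/max over {n} probes: "
              f"{min(ms):.1f}/{sum(ms)/n:.1f}/{max(ms):.1f} ms")

    if __name__ == "__main__":
        probe()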

ailab-inference GKE cluster

  • Master version: 1.15.9-gke.24
  • Master zone: us-central1-a
  • Node zones: us-central1-a, us-central1-f
  • VPC-native (alias IP): Enabled
  • Pod address range: 10.24.0.0/14 (see the node-capacity sketch below)
  • Default maximum pods per node: 110
  • Service address range: 10.0.0.0/20
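
One consequence of these settings, shown as a quick calculation: with the default of 110 pods per node, GKE carves a /24 pod CIDR out of the pod address range for each node, so the /14 range caps the cluster at 1024 nodes.

    # Node capacity implied by the pod address range, assuming GKE's
    # default per-node pod CIDR of /24 (the default when allowing
    # up to 110 pods per node).
    import ipaddress

    pod_range = ipaddress.ip_network("10.24.0.0/14")
    per_node_prefix = 24
    max_nodes = 2 ** (per_node_prefix - pod_range.prefixlen)
    print(f"{pod_range} supports up to {max_nodes} nodes")  # -> 1024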

GKE Node pools

Use | Pool name | Machine type | OS | GPU model | # GPUs | Instance template
Kubernetes core workloads | cpuonly-pool | n1-standard-4 | Container-Optimized OS (cos) | N/A | N/A | gke-ailab-inference-cpuonly-pool-0f0dbec1
ProtoDUNE | gpu-4-t4-boot | n1-standard-4 | Container-Optimized OS (cos) | NVIDIA Tesla T4 | 4 | gke-ailab-inference-gpu-4-t4-boot-14e2482b
CMS | gpu-v100-cos-himem | n1-standard-16 | Container-Optimized OS (cos) | NVIDIA Tesla V100 | 4 | gke-ailab-inference-gpu-v100-cos-hime-42b578a6
CMS | gpu-8-v100-cos-himem | n1-standard-16 | Container-Optimized OS (cos) | NVIDIA Tesla V100 | 8 | gke-ailab-inference-gpu-8-v100-cos-hi-2a214dda
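
To pin a workload to one of these pools, a pod selects on GKE's cloud.google.com/gke-nodepool node label and requests GPUs through the nvidia.com/gpu resource. A sketch using the kubernetes Python client follows; the pod name and container image are placeholders, not taken from this deployment.

    # Sketch: schedule a pod onto the gpu-4-t4-boot pool with one T4,
    # selecting on GKE's cloud.google.com/gke-nodepool node label.
    from kubernetes import client, config

    config.load_kube_config()

    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="trt-test"),  # placeholder name
        spec=client.V1PodSpec(
            node_selector={"cloud.google.com/gke-nodepool": "gpu-4-t4-boot"},
            containers=[
                client.V1Container(
                    name="server",
                    image="nvcr.io/nvidia/tensorrtserver:20.02-py3",  # placeholder tag
                    resources=client.V1ResourceRequirements(
                        limits={"nvidia.com/gpu": "1"}
                    ),
                )
            ],
        ),
    )
    client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)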

GCE Instance Groups

Node pool | Instance group (us-central1-f) | Instance group (us-central1-a) | Instance template
cpuonly-pool | gke-ailab-inference-cpuonly-pool-1615bf50-grp | gke-ailab-inference-cpuonly-pool-43548431-grp | gke-ailab-inference-cpuonly-pool-0f0dbec1
gpu-4-t4-boot | gke-ailab-inference-gpu-4-t4-boot-5e8115ec-grp | gke-ailab-inference-gpu-4-t4-boot-14e2482b-grp | gke-ailab-inference-gpu-4-t4-boot-14e2482b
gpu-8-v100-cos-himem | gke-ailab-inference-gpu-8-v100-cos-hi-f6c70a6f-grp | gke-ailab-inference-gpu-8-v100-cos-hi-2a214dda-grp | gke-ailab-inference-gpu-8-v100-cos-hi-2a214dda
gpu-v100-cos-himem | gke-ailab-inference-gpu-v100-cos-hime-57e34068-grp | gke-ailab-inference-gpu-v100-cos-hime-42b578a6-grp | gke-ailab-inference-gpu-v100-cos-hime-42b578a6
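The mapping above can be cross-checked against the live project with the Compute Engine API; a sketch, with a placeholder project ID and the two node zones from the cluster description:

    # Sketch: list the managed instance groups in both node zones and
    # the instance template each one uses (requires application-default
    # credentials; "my-project" is a placeholder project ID).
    from googleapiclient import discovery

    compute = discovery.build("compute", "v1")
    for zone in ("us-central1-a", "us-central1-f"):
        migs = compute.instanceGroupManagers().list(
            project="my-project", zone=zone).execute()
        for mig in migs.get("items", []):
            template = mig["instanceTemplate"].rsplit("/", 1)[-1]
            print(zone, mig["name"], template)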

Public endpoints

In use by | Public LB IP | GPU type | Ports
ProtoDUNE | 34.70.127.245 | T4 | 8000, 8001, 8002
CMS | 35.224.243.148 | V100 | 8000, 8001, 8002
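
The three ports follow the TensorRT Inference Server convention (8000 HTTP, 8001 gRPC, 8002 Prometheus metrics). A sketch that scrapes GPU utilization from both endpoints, assuming that convention and the server's nv_gpu_utilization metric name:

    # Sketch: pull GPU utilization from each public endpoint's metrics
    # port, assuming the TensorRT Inference Server port convention
    # (8000 HTTP, 8001 gRPC, 8002 metrics) and its nv_gpu_utilization
    # Prometheus metric.
    import requests

    ENDPOINTS = {"ProtoDUNE/T4": "34.70.127.245", "CMS/V100": "35.224.243.148"}

    for name, ip in ENDPOINTS.items():
        text = requests.get(f"http://{ip}:8002/metrics", timeout=5).text
        gpu_lines = [line for line in text.splitlines()
                     if line.startswith("nv_gpu_utilization")]
        print(name, *gpu_lines, sep="\n  ")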