gke-ai-observability

Demo of observability setup for AI workloads on GKE

Fooocus Observability Demo

Create a GKE cluster, for example a Standard cluster in the us-central1 region:
```
bash fooocus/1-patched.sh
```
Then create the node pool:
```
bash fooocus/2.sh
```

After the cluster is up, set up kubectl

```
gcloud container clusters get-credentials ${CLUSTER_NAME} --location=${REGION}
```
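The credentials command expects `CLUSTER_NAME` and `REGION` to be set in your shell. A minimal sketch follows, assuming a hypothetical cluster name (`fooocus-demo` is made up for illustration; use whatever name the cluster creation script used) and a hypothetical helper named `setup_kubectl`:

```shell
# Hypothetical defaults: replace with the values used when creating the cluster.
: "${CLUSTER_NAME:=fooocus-demo}"   # assumed name, not taken from the scripts
: "${REGION:=us-central1}"          # example region mentioned in this README
export CLUSTER_NAME REGION

# Fetch credentials, then confirm that kubectl can reach the cluster.
setup_kubectl() {
  gcloud container clusters get-credentials "${CLUSTER_NAME}" --location="${REGION}" &&
    kubectl get nodes
}
```

Run `setup_kubectl` once; if it lists the nodes, kubectl is pointed at the right cluster.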

Add more observability (this will soon be automatic with monitoring packages)

```
# DCGM exporter; the DCGM package will soon do this for you automatically.
kubectl apply -f fooocus/dcgm-monitoring.yaml
# Same here; this should be part of KUBELET,CADVISOR soon.
kubectl apply -f fooocus/gcm-cadvisor.yaml
```
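To confirm that both manifests were applied, you can ask kubectl to look up the objects they define. This is a sketch only: `check_observability` is a hypothetical helper name, and it assumes you run it from the repository root so the manifest paths resolve.

```shell
# List the objects defined in each manifest; stops at the first one that is missing.
check_observability() {
  for manifest in fooocus/dcgm-monitoring.yaml fooocus/gcm-cadvisor.yaml; do
    kubectl get -f "${manifest}" || return 1
  done
}
```

Calling `check_observability` prints the resources from each file, or returns a non-zero status if kubectl cannot find them.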

Install the Fooocus inference server on your cluster, instrumented for HTTP metrics with eBPF

If you want the UI (without the REST API):
```
kubectl apply -f fooocus/server-instrumented-ui.yaml
```
If you want the REST API without the UI (see fooocus/stress.sh for how to access it):
```
kubectl apply -f fooocus/server-instrumented-rest.yaml
```
Wait for it to come up; in the meantime, you can check the logs:
```
kubectl logs -f -l app=fooocus
```

Unfortunately, it is configured to download gigabytes (in total) of dependencies and models on start.
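Because of that download, the pod may take a while to become usable. Instead of watching the logs, you can block until the pod reports Ready. This is a sketch under assumptions: `wait_for_fooocus` is a hypothetical helper, the 30m default timeout is a guess, and the Ready condition only reflects the end of startup if the manifest defines a suitable readiness probe.

```shell
# Block until pods labelled app=fooocus are Ready, or the timeout expires.
# Note: kubectl wait fails immediately if no matching pod exists yet.
wait_for_fooocus() {
  local timeout="${1:-30m}"   # assumed default; tune to your network speed
  kubectl wait --for=condition=Ready pod -l app=fooocus --timeout="${timeout}"
}
```

For example, `wait_for_fooocus 45m` waits up to 45 minutes before giving up.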

Set up port forwarding

For the UI:

```
kubectl port-forward service/fooocus 3000:3000
```

For REST:

```
kubectl port-forward service/fooocus 3000:8088
```
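With the port-forward running in another terminal, a quick way to check that something answers on the local port is to poll it with curl. This is a sketch only: `wait_for_port` is a hypothetical helper, the attempt count and delays are arbitrary, and it only checks that the server responds at all, not that a specific endpoint works.

```shell
# Poll the forwarded port until the server answers or the attempts run out.
# curl exits 0 on any HTTP response (even 404), which is enough to show the
# forward is working.
wait_for_port() {
  local url="${1:-http://localhost:3000/}" attempts="${2:-30}" i=0
  while [ "${i}" -lt "${attempts}" ]; do
    if curl --silent --output /dev/null --max-time 5 "${url}"; then
      echo "reachable: ${url}"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "not reachable: ${url}" >&2
  return 1
}
```

Run `wait_for_port` with no arguments to probe `http://localhost:3000/`, or pass a different URL and attempt count.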

Open port 3000 locally (http://localhost:3000)

Enjoy! Create your own dashboard, or import a Grafana dashboard such as https://grafana.com/grafana/dashboards/12239-nvidia-dcgm-exporter-dashboard/ into GCM.
