
gke-ai-observability

Demo of observability setup for AI workloads on GKE

img.png

https://github.com/lllyasviel/Fooocus

Fooocus Observability Demo

  1. Create a GKE cluster, e.g. a Standard cluster in the us-central1 region (a rough sketch of an equivalent manual setup follows this list):

    bash fooocus/1-patched.sh

    Then create the node pool:

    bash fooocus/2.sh
  2. After the cluster is up, set up kubectl:

    gcloud container clusters get-credentials ${CLUSTER_NAME} --location=${REGION}
  3. Add more observability (this will soon be automatic with monitoring packages; a verification sketch follows this list):

    # DCGM exporter; this will soon be set up automatically for you by the DCGM package.
    kubectl apply -f fooocus/dcgm-monitoring.yaml
    # Same here; this should soon be part of the KUBELET/CADVISOR package.
    kubectl apply -f fooocus/gcm-cadvisor.yaml
  4. Install the Fooocus inference server on your cluster, instrumented for HTTP metrics with eBPF.

    If you want the UI (without the REST API):

    kubectl apply -f fooocus/server-instrumented-ui.yaml

    If you want the REST API (without the UI), see fooocus/stress.sh for how to access it.

    kubectl apply -f fooocus/server-instrumented-rest.yaml
  5. Wait for it to come up; in the meantime you can check the logs:

    kubectl logs -f -l app=fooocus
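
The exact cluster and node-pool settings live in fooocus/1-patched.sh and fooocus/2.sh. As a rough, hypothetical sketch of what an equivalent manual setup could look like (the cluster name, machine type, and GPU type below are placeholders, not the scripts' actual values):

    # Hypothetical sketch only; the real flags are in fooocus/1-patched.sh and fooocus/2.sh.
    gcloud container clusters create fooocus-demo \
      --region=us-central1 \
      --release-channel=regular \
      --enable-managed-prometheus

    # GPU node pool for the inference server (machine and accelerator types are assumptions).
    gcloud container node-pools create gpu-pool \
      --cluster=fooocus-demo \
      --region=us-central1 \
      --machine-type=n1-standard-8 \
      --accelerator=type=nvidia-tesla-t4,count=1 \
      --num-nodes=1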
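
To confirm the extra monitoring from step 3 landed, one option is to look for the DCGM exporter workload and the Managed Prometheus scrape configs. This assumes the manifests follow the usual pattern of a DaemonSet plus PodMonitoring resources; adjust namespaces and names to match what was actually applied:

    # Assumes a DaemonSet-based DCGM exporter and GMP PodMonitoring resources; adjust to the applied manifests.
    kubectl get daemonsets --all-namespaces | grep -i dcgm
    kubectl get podmonitorings --all-namespaces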

Unfortunately, it is configured to download GBs (in total) of dependencies and models on startup.
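
Because of those downloads, startup can take a while. One way to block until the pod reports Ready, reusing the app=fooocus label from the logs command above (the 30-minute timeout is an arbitrary allowance):

    # Block until the Fooocus pod is Ready; the timeout is an arbitrary allowance for the model downloads.
    kubectl wait --for=condition=Ready pod -l app=fooocus --timeout=30m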

  6. Set up port-forwarding

    For UI:

    kubectl port-forward service/fooocus 3000:3000

    For REST:

    kubectl port-forward service/fooocus 3000:8088
  7. Open http://localhost:3000 in your browser (you can first sanity-check the port-forward with the sketch after this list).

  8. Enjoy! Create your own dashboard, or import a Grafana dashboard such as https://grafana.com/grafana/dashboards/12239-nvidia-dcgm-exporter-dashboard/ into Google Cloud Monitoring (GCM); see the import sketch below.
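
Before opening the browser, you can sanity-check that the port-forward is answering on localhost; this only confirms connectivity, and the exact response differs between the UI and REST variants:

    # Only confirms the port-forward answers; the response differs between the UI and REST variants.
    curl -sI http://localhost:3000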
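
For the Grafana dashboard, note that Grafana JSON is not directly usable as a Cloud Monitoring dashboard: it first has to be converted to the GCM dashboard format (Google publishes importer tooling for this). Once you have a GCM-format JSON file (dcgm-dashboard.json below is a placeholder name), a minimal way to create the dashboard is:

    # dcgm-dashboard.json is a placeholder for a dashboard already converted to the GCM format.
    gcloud monitoring dashboards create --config-from-file=dcgm-dashboard.json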