canonical/dex-auth-operator

Integration tests fail in CI

natalian98 opened this issue · 2 comments

The Prometheus and Grafana integration tests occasionally fail in CI.

The tests for the same PR ran twice:

The first run failed at response_metric = response["data"]["result"][0]["metric"] with IndexError: list index out of range. The query returned an empty result list, which implies that dex-auth is unavailable even though juju status shows the app as active.

The following is the content of response_metric captured while running the test on an EC2 instance:

{'__name__': 'up', 'instance': 'test-kfp-update_b6d8e8b0-62f4-429b-8a4b-2dc58204fe88_dex-auth_dex-auth/0', 'job': 'juju_test-kfp-update_b6d8e8b_dex-auth_dex-auth_prometheus_scrape', 'juju_application': 'dex-auth', 'juju_charm': 'dex-auth', 'juju_model': 'test-kfp-update', 'juju_model_uuid': 'b6d8e8b0-62f4-429b-8a4b-2dc58204fe88', 'juju_unit': 'dex-auth/0'}

This test has never failed locally or on an EC2 instance. A similar issue was observed for the kfp-api integration.
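If the empty result is a timing issue (the query runs before the first scrape of dex-auth has been recorded), one option is to poll until the result list is non-empty instead of indexing it right away. The following is only a sketch under that assumption, not the test's actual code: wait_for_up_metric, fetch, attempts, delay and sleep are hypothetical names, and fetch stands in for whatever call the test uses to get the parsed JSON of the query response.

```python
import time
from typing import Callable


def wait_for_up_metric(
    fetch: Callable[[], dict],
    attempts: int = 10,
    delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the query returns at least one series, then return its labels.

    `fetch` is a hypothetical callable that returns the parsed JSON body of
    the query response (a dict with a "data" -> "result" list).
    """
    for attempt in range(1, attempts + 1):
        response = fetch()
        result = response.get("data", {}).get("result", [])
        if result:
            # Same value the test reads, but only once a series exists.
            return result[0]["metric"]
        if attempt < attempts:
            sleep(delay)
    raise TimeoutError(f"no series returned after {attempts} attempts")


# Simulated responses: the first query is empty, the second has one series.
responses = iter(
    [
        {"status": "success", "data": {"resultType": "vector", "result": []}},
        {
            "status": "success",
            "data": {
                "resultType": "vector",
                "result": [{"metric": {"juju_application": "dex-auth"}}],
            },
        },
    ]
)
response_metric = wait_for_up_metric(lambda: next(responses), attempts=3, delay=0)
```

With a helper like this, a late scrape would show up as a timeout with a clear message rather than an IndexError. Suitable values for attempts and delay depend on the scrape interval configured in the test.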

Another example:
This PR run passed the kfp-profile-controller integration tests, while the next one failed due to a non-existent service account. The changes in that commit were not related to kfp-profile-controller and shouldn't impact this test.
I re-ran the failed test, and this time everything passed.

Since this was a transient error that depends on the time interval settings of the failing test, and you've already modified that, we can close this issue. Feel free to re-open it if it happens again.