Alerting seems to use a different AWS provider chain than other parts of the app
What happened?
When Grafana is deployed with EKS Pod Identity and uses the OpenSearch plugin, attempting to create or preview an alert generates the following error:
[sse.dataQueryError] failed to execute query [A]: [plugin.downstreamError] client: failed to query data:
Failed to query data: rpc error: code = Unknown desc = OpenSearch data source error:
Post "https://opensearch.xxxi/_msearch?max_concurrent_shard_requests=5": NoCredentialProviders: no valid providers in chain.
Deprecated. For verbose messaging see aws.Config.CredentialsChainVerboseErrors
The plugin works fine when you create and test the data source, and when you build dashboard panels with it. The issue only shows up once you try to use it for alerting.
What did you expect to happen?
I expect alerting to authenticate the same way as panel queries and data source creation do.
Did this work before?
Unknown.
I had a suspicion that it might be related to EKS Pod Identity, so I switched to IRSA and it works as expected. I assume whatever SDK is being used in the alerting subsystem is not up to date, as Pod Identity works fine everywhere else.
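To tell the two mechanisms apart from inside the Grafana pod, one option is to look at the environment variables each one injects. The sketch below is only a diagnostic aid, not code from Grafana or the plugin: `credentialHints` and `detectCredentialHints` are hypothetical names. It relies on EKS Pod Identity setting `AWS_CONTAINER_CREDENTIALS_FULL_URI` and `AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE`, and IRSA setting `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`. It only reports which variables are present; it does not prove that the SDK in use can read them.

```go
package main

import "os"

// credentialHints records which credential mechanism's environment
// variables are present. Hypothetical type, for illustration only.
type credentialHints struct {
	PodIdentity bool // EKS Pod Identity variables are set
	IRSA        bool // IAM Roles for Service Accounts variables are set
}

// detectCredentialHints checks the variables injected by each mechanism.
// getenv is passed in so the function can be exercised without touching
// the real process environment.
func detectCredentialHints(getenv func(string) string) credentialHints {
	return credentialHints{
		PodIdentity: getenv("AWS_CONTAINER_CREDENTIALS_FULL_URI") != "" &&
			getenv("AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE") != "",
		IRSA: getenv("AWS_ROLE_ARN") != "" &&
			getenv("AWS_WEB_IDENTITY_TOKEN_FILE") != "",
	}
}

// detectFromProcessEnv applies the check to the current process environment.
func detectFromProcessEnv() credentialHints {
	return detectCredentialHints(os.Getenv)
}
```

If `PodIdentity` is true and `IRSA` is false while a query still fails with NoCredentialProviders, that is consistent with the component making the request using an SDK version that does not yet support Pod Identity.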
How do we reproduce it?
- Deploy Grafana using EKS Pod Identity
- Create an OpenSearch data source using the plugin, making sure to enable SigV4. It should successfully save and test.
- Create a basic panel using this data source. It should return values as expected.
- Save the dashboard, then go into the alerting tab of the panel you just created, then try to create and preview an alert. You will get the error previously mentioned.
Is the bug inside a dashboard panel?
No response
Environment (with versions)?
Grafana: v11.1.0 (5b85c4c2fc)
OS: Running in EKS using the official helm chart, version 8.3.6
Browser: Chrome 126.0.6478.182
Grafana platform?
Kubernetes
Datasource(s)?
OpenSearch 2.17.1
I'm pretty sure this should be moved to https://github.com/grafana/opensearch-datasource since plugin.downstreamError indicates the error is originating from the datasource in question.
@grafana/aws-datasources please triage/verify
Hi @mthemis-provenir, we just released a patch that updates the AWS SDK in the plugin to a version that should support Pod Identity. Can you update your plugin and check if this fixes your problem? Thanks!
Thanks @idastambuk - I won't be able to test until later, as our users are using it currently, but I will let you know this evening whether it fixes the issue.
Yep, that's fixed it. Thanks @idastambuk!
@mthemis-provenir that's great, thanks for letting us know!