[kube-prometheus-stack] AMP remoteWrite sigv4 authorization error: server returned HTTP status 403 Forbidden
nikolaops opened this issue · 0 comments
Describe the bug a clear and concise description of what the bug is.
Prometheus is not authorized to remote write to AWS APS when using sigv4 and assuming a role.
What's your helm version?
v3.10.2
What's your kubectl version?
v1.25.4
Which chart?
[kube-prometheus-stack]
What's the chart version?
61.1.1
What happened?
I am trying to configure the prometheus section of kube-prometheus-stack to remote write to AWS Managed Prometheus.
I have created a role with the permissions suggested in the AWS documentation.
The prometheus container in the prometheus-kube-prometheus-stack-prometheus-0 pod (namespace monitoring) logs this error:
ts=2024-07-03T12:22:54.684Z caller=dedupe.go:112 component=remote level=error remote_name=de7ef6 url=https://aps-workspaces.<REGION>.amazonaws.com/workspaces/<WORKSPACE_ID>/api/v1/remote_write/ msg="non-recoverable error" count=1779 exemplarCount=0 err="server returned HTTP status 403 Forbidden: {\"message\":\"Missing Authentication Token\"}"
where <REGION> and <WORKSPACE_ID> are set correctly to target AWS APS.
I have added this to the values:
prometheus:
  serviceAccount:
    name: "${prometheus_sa}"
    annotations:
      eks.amazonaws.com/role-arn: "${prometheus_role}"
    automountServiceAccountToken: true
  prometheusSpec:
    remoteWrite:
      - url: "https://aps-workspaces.${amp_region}.amazonaws.com/workspaces/${workspace_id}/api/v1/remote_write/"
        sigv4:
          region: "${amp_region}"
          roleArn: "${prometheus_role}"
${prometheus_role} is configured via Terraform:
resource "aws_iam_role" "prometheus_role" {
  name = "${local.cluster_name}-prometheus-remote-amp-write-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Effect = "Allow",
        Principal = {
          Federated = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:oidc-provider/${data.aws_iam_openid_connect_provider.main.url}"
        },
        Action = "sts:AssumeRoleWithWebIdentity",
        Condition = {
          "StringEquals" = {
            "${data.aws_iam_openid_connect_provider.main.url}:sub" = "system:serviceaccount:monitoring:${local.prometheus_sa}"
          }
        }
      }
    ]
  })
}
resource "aws_iam_policy" "prometheus_policy" {
  name        = "${local.cluster_name}-prometheus-remote-amp-write-policy"
  description = "Allows Prometheus to write remotely to AMP in us-west-2"
  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Effect = "Allow",
        Action = [
          "aps:RemoteWrite",
          "aps:GetSeries",
          "aps:GetLabels",
          "aps:GetMetricMetadata",
          "aps:QueryMetrics"
        ],
        Resource = "*"
      }
    ]
  })
}
resource "aws_iam_role_policy_attachment" "prometheus_attach" {
  role       = aws_iam_role.prometheus_role.name
  policy_arn = aws_iam_policy.prometheus_policy.arn
}
If I run
kubectl exec -it prometheus-kube-prometheus-stack-prometheus-0 -n monitoring -- cat /var/run/secrets/eks.amazonaws.com/serviceaccount/token
it outputs a token. If I decode it, the expiration time turns out to be yesterday...
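For anyone who wants to repeat the expiry check, here is a minimal sketch in Python. It is not the exact tool used for this report, and the helper names jwt_claims and is_expired are made up for illustration. It assumes the projected token is a standard JWT, base64url-decodes the payload without verifying the signature, and reads the standard exp claim (seconds since the epoch).

```python
import base64
import json
from datetime import datetime, timezone
from typing import Optional


def jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    # A JWT has three dot-separated segments: header.payload.signature
    payload_b64 = token.strip().split(".")[1]
    # Segments are base64url-encoded without padding; restore the padding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def is_expired(token: str, now: Optional[datetime] = None) -> bool:
    """True if the token's exp claim is at or before `now` (default: current UTC time)."""
    now = now or datetime.now(timezone.utc)
    exp = datetime.fromtimestamp(jwt_claims(token)["exp"], tz=timezone.utc)
    return exp <= now
```

Passing the token printed by the command above to jwt_claims also exposes the sub claim, which can be compared with the system:serviceaccount:monitoring:... value in the role's trust policy.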
Also, the service account created when the chart is deployed has nothing in its Tokens section:
Name:                <SA_NAME>
Namespace:           monitoring
Labels:              app=kube-prometheus-stack-prometheus
                     app.kubernetes.io/component=prometheus
                     app.kubernetes.io/instance=kube-prometheus-stack
                     app.kubernetes.io/managed-by=Helm
                     app.kubernetes.io/name=kube-prometheus-stack-prometheus
                     app.kubernetes.io/part-of=kube-prometheus-stack
                     app.kubernetes.io/version=61.1.1
                     chart=kube-prometheus-stack-61.1.1
                     heritage=Helm
                     release=kube-prometheus-stack
Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/<ROLE_ARN>
                     meta.helm.sh/release-name: kube-prometheus-stack
                     meta.helm.sh/release-namespace: monitoring
Image pull secrets:  <none>
Mountable secrets:   <none>
Tokens:              <none>
Events:              <none>
These are the secrets created by the Helm release:
NAME                                                           TYPE                 DATA   AGE
alertmanager-kube-prometheus-stack-alertmanager                Opaque               1      15m
alertmanager-kube-prometheus-stack-alertmanager-generated      Opaque               1      15m
alertmanager-kube-prometheus-stack-alertmanager-tls-assets-0   Opaque               0      15m
alertmanager-kube-prometheus-stack-alertmanager-web-config     Opaque               1      15m
kube-prometheus-stack-admission                                Opaque               3      155m
prometheus-kube-prometheus-stack-prometheus                    Opaque               1      15m
prometheus-kube-prometheus-stack-prometheus-tls-assets-0       Opaque               1      15m
prometheus-kube-prometheus-stack-prometheus-web-config         Opaque               1      15m
sh.helm.release.v1.kube-prometheus-stack.v1                    helm.sh/release.v1   1      15m
The chart is also deployed via Terraform:
resource "helm_release" "prometheus_stack" {
  name             = "kube-prometheus-stack"
  repository       = "https://prometheus-community.github.io/helm-charts"
  chart            = "kube-prometheus-stack"
  namespace        = "monitoring"
  create_namespace = true
  version          = "61.1.1"
  values = [
    "${templatefile("${path.module}/template_files/prometheus.yaml.tftpl", {
      apps_locals     = local.apps_locals,
      prometheus_sa   = local.prometheus_sa,
      amp_region      = var.amp_region,
      workspace_id    = var.amp_centralized_workspace_id,
      prometheus_role = local.prometheus_role
    })}"
  ]
}
What you expected to happen?
Metrics are remote written to the APS workspace.
How to reproduce it?
No response
Enter the changed values of values.yaml?
No response
Enter the command that you execute and failing/misfunctioning.
terraform apply
Anything else we need to know?
No response