signalfx/splunk-otel-collector-chart

Helm chart not working if profiling and operator enabled without further configuration.

conrad784 opened this issue · 2 comments

What happened?

Description

Upgrading the chart from 0.84 to 0.85 causes errors in our deployment.
There is a "breaking change", but according to the changelog it should only affect us if we specified our own instrumentation images. We are not using any custom instrumentation images.
We have enabled profiling and also enabled the operator, without any further customization.

Steps to Reproduce

wget https://github.com/signalfx/splunk-otel-collector-chart/releases/download/splunk-otel-collector-0.85.0/splunk-otel-collector-0.85.0.tgz
tar xfz splunk-otel-collector-0.85.0.tgz

helm template splunk-otel-collector splunk-otel-collector --namespace otel -f otel.yml --debug

Expected Result

The rendered Helm chart YAML.

Actual Result

install.go:200: [debug] Original chart version: ""
install.go:217: [debug] CHART PATH: /tmp/otel-debug/splunk-otel-collector

Error: template: splunk-otel-collector/templates/operator/instrumentation.yaml:30:11: executing "splunk-otel-collector/templates/operator/instrumentation.yaml" at <not>: wrong number of args for not: want 1 got 3
helm.go:84: [debug] template: splunk-otel-collector/templates/operator/instrumentation.yaml:30:11: executing "splunk-otel-collector/templates/operator/instrumentation.yaml" at <not>: wrong number of args for not: want 1 got 3

Chart version

0.85.0

Environment information

Environment

helm version
version.BuildInfo{Version:"v3.12.3", GitCommit:"3a31588ad33fe3b89af5a2a54ee1d25bfe6eaa5e", GitTreeState:"clean", GoVersion:"go1.20.7"}
OS: MacOS ARM + Linux x86

Chart configuration

# test.yml 
---
cloudProvider: "gcp"
distribution: ""
clusterName: "test"
splunkPlatform:
  endpoint: "https://example.com"

splunkObservability:
  realm: "eu0"

  profilingEnabled: true

  # Options to disable or enable particular telemetry data types.
  metricsEnabled: true
  tracesEnabled: true
  logsEnabled: false

gateway:
  enabled: false

tolerations:
  - operator: "Exists"
    effect: "NoExecute"
  - operator: "Exists"
    effect: "NoSchedule"

agent:
  config:
    exporters:
      signalfx:
        include_metrics:
          - metric_names:
            - k8s.node.cpu.utilization
            - k8s.pod.cpu.utilization
            - k8s.volume.available
            - k8s.volume.capacity
    receivers:
      kubeletstats:
        metric_groups:
          - container
          - pod
          - node
          - volume
        extra_metadata_labels:
          - container.id
          - k8s.volume.type
  resources:
    limits:
      cpu: 200m
      memory: 500Mi
  securityContext:
    runAsUser: 20000
    runAsGroup: 20000

clusterReceiver:
  enabled: true
  resources:
    limits:
      cpu: 200m
      memory: 500Mi

secret:
  create: false
  name: splunk-otel-collector

logsEngine: otel

# should be disabled in favour of https://github.com/signalfx/splunk-otel-collector-chart/issues/689
# as soon as available
autodetect:
  prometheus: true

# in order to use traces with the kind: Instrumentation CRDs
operator:
  enabled: true

environment: test

Log output

install.go:200: [debug] Original chart version: ""
install.go:217: [debug] CHART PATH: /tmp/otel-debug/splunk-otel-collector


Error: template: splunk-otel-collector/templates/operator/instrumentation.yaml:30:11: executing "splunk-otel-collector/templates/operator/instrumentation.yaml" at <not>: wrong number of args for not: want 1 got 3
helm.go:84: [debug] template: splunk-otel-collector/templates/operator/instrumentation.yaml:30:11: executing "splunk-otel-collector/templates/operator/instrumentation.yaml" at <not>: wrong number of args for not: want 1 got 3

Additional context

The first error can be fixed with this patch:

--- a/splunk-otel-collector/templates/operator/instrumentation.yaml
+++ b/splunk-otel-collector/templates/operator/instrumentation.yaml
@@ -27,11 +27,11 @@ spec:
     {{- include .Values.operator.instrumentation.spec.env }}
     {{- end }}
     {{- if .Values.splunkObservability.profilingEnabled }}
-    {{- if not hasKey (include "splunk-otel-collector.operator.extract-name-keys-from-dict-list" .Values.operator.instrumentation.spec.env) "SPLUNK_PROFILER_ENABLED" }}
+    {{- if not (hasKey (include "splunk-otel-collector.operator.extract-name-keys-from-dict-list" .Values.operator.instrumentation.spec.env) "SPLUNK_PROFILER_ENABLED") }}
     - name: SPLUNK_PROFILER_ENABLED
       value: "true"
     {{- end }}
-    {{- if not hasKey (include "splunk-otel-collector.operator.extract-name-keys-from-dict-list" .Values.operator.instrumentation.spec.env) "SPLUNK_PROFILER_MEMORY_ENABLED" }}
+    {{- if not (hasKey (include "splunk-otel-collector.operator.extract-name-keys-from-dict-list" .Values.operator.instrumentation.spec.env) "SPLUNK_PROFILER_MEMORY_ENABLED") }}
     - name: SPLUNK_PROFILER_MEMORY_ENABLED
       value: "true"
     {{- end }}

This makes another error appear, this time in the helper that does some magic with dictionary reversing:

Error: template: splunk-otel-collector/templates/operator/instrumentation.yaml:30:105: executing "splunk-otel-collector/templates/operator/instrumentation.yaml" at <.Values.operator.instrumentation.spec.env>: wrong type for value; expected map[string]interface {}; got string
helm.go:84: [debug] template: splunk-otel-collector/templates/operator/instrumentation.yaml:30:105: executing "splunk-otel-collector/templates/operator/instrumentation.yaml" at <.Values.operator.instrumentation.spec.env>: wrong type for value; expected map[string]interface {}; got string

Hi @jvoravong, can we please have a new release that includes your recently merged changes? This is blocking us from upgrading clusters from 1.24 to 1.25.