Splunk Operator: splunk_indexer : Remove existing HEC token results in failed indexer pod startup
gjanders commented
Please select the type of request
Bug
Tell us more
Describe the request
- I'm upgrading from Splunk Operator 2.2.0 to 2.5.2, and also attempting to use Splunk 9.1.4.
Expected behavior
- The indexer pods should start without an error
Splunk setup on K8S
- Multisite cluster with cluster manager
Reproduction/Testing steps
- When the pods are upgraded they throw an error. Rolling back the Splunk image to docker.io/splunk/splunk:9.0.3-a2 stops the issue from occurring, so I'm unsure whether this is an issue in the Docker image, the operator, or a combination of the two.
Additional context (optional)
In the operator I used:
image:
  repository: docker.io/splunk/splunk:9.1.4
splunkOperator:
  enabled: true
  clusterWideAccess: true
  # Specify volumes for Splunk Operator pod, append additional volumes to list
  # reference: https://kubernetes.io/docs/concepts/storage/volumes/
  volumes:
  - name: app-staging
    persistentVolumeClaim:
      claimName: splunk-operator-app-download
  # Specify volume mounts for the manager container, append additional volume mounts to list
  # reference: https://kubernetes.io/docs/tasks/configure-pod-container/configure-volume-storage/
  volumeMounts:
  - mountPath: /opt/splunk/appframework/
    name: app-staging
The logs show:
TASK [splunk_indexer : Remove existing HEC token] ******************************
fatal: [localhost]: FAILED! => {
    "changed": false,
    "elapsed": 0,
    "redirected": false,
    "status": -1,
    "url": "https://127.0.0.1:8089/services/data/inputs/http/splunk_hec_token",
    "warnings": [
        "Module did not set no_log for password"
    ]
}

MSG:

Status code was -1 and not [200, 404]: Request failed: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1091)>

PLAY RECAP *********************************************************************
localhost : ok=82 changed=8 unreachable=0 failed=1 skipped=61 rescued=0 ignored=0
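The `urlopen error` in the message comes from Python's certificate verification, so the same check can be reproduced by hand from inside an indexer pod. The following is a minimal diagnostic sketch, not the code splunk-ansible actually runs: the helper names `build_context`, `classify`, and `probe` are hypothetical, and it assumes a Python 3 interpreter is available in the container. It only distinguishes "TLS verification failed" from "TLS succeeded but the request was rejected for another reason" (for example an HTTP 401 because no credentials were sent).

```python
import ssl
import urllib.error
import urllib.request
from typing import Optional


def build_context(verify: bool = True, ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a TLS context; ca_file is an optional PEM bundle to trust."""
    ctx = ssl.create_default_context(cafile=ca_file)
    if not verify:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def classify(exc: Exception) -> str:
    """Map an exception raised by urlopen to a short diagnostic label."""
    if isinstance(exc, urllib.error.HTTPError):
        # An HTTP status came back, so the TLS handshake itself succeeded.
        return "tls_ok_http_%d" % exc.code
    reason = getattr(exc, "reason", exc)
    if isinstance(reason, ssl.SSLCertVerificationError):
        return "cert_verify_failed"
    return "other_error"


def probe(url: str, ctx: ssl.SSLContext, timeout: float = 5.0) -> str:
    """Request url with the given TLS context and report what happened."""
    try:
        with urllib.request.urlopen(url, context=ctx, timeout=timeout) as resp:
            return "tls_ok_http_%d" % resp.status
    except Exception as exc:  # diagnostic helper: report instead of raising
        return classify(exc)
```

Calling `probe("https://127.0.0.1:8089/services/data/inputs/http/splunk_hec_token", build_context(True))` and then the same URL with `build_context(False)`, or with `build_context(True, ca_file=...)` pointing at the mounted CA, should show whether the failure is purely a trust-chain problem on the management port.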
I've tested adding SSL certificates into the deployment without success so far.
The cluster manager pod doesn't seem to have this issue, only the indexer pods.
Under defaults, I tested:
config:
  env:
    verify: false
I also tried setting the SSL config via:
defaults:
  splunk:
    ssl:
      ca: /mnt/peers-splunk-ca/tls.crt
      cert: /mnt/peers-splunk-cert/tls.crt
Neither attempt was successful.
satellite-no commented
We believe this is due to the verify flag in the underlying splunk-ansible configuration steps. The open PR https://github.com/splunk/splunk-ansible/pull/818/files is intended to address this issue.