Altinity/clickhouse-operator

ClickHouseKeeperInstallation doesn't have a dedicated data volume (PVC)

meektechie opened this issue · 13 comments

ClickHouseKeeperInstallation doesn't have a dedicated data volume (PVC); it uses an overlay volume, so when a pod is rotated the Keeper metadata is lost. Is there a particular reason for running the clickhouse-keeper pods without dedicated volumes?

Error:
last_error_message: Table is in readonly mode since table metadata was not found in zookeeper:

Hi! Try to add dataVolumeClaimTemplate: default here:

...
  defaults:
    templates:
      # Templates are specified as default for all clusters
      podTemplate: default
+     dataVolumeClaimTemplate: default

  templates:
    podTemplates:
...
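
For the default reference to resolve, a volume claim template with that name also has to be defined under templates.volumeClaimTemplates. A minimal sketch (the name, size and storage class are placeholders, adjust to your cluster):

  templates:
    volumeClaimTemplates:
    - name: default
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi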

@bakavets I got the response below. My operator chart version is 0.23.7.

Error from server (BadRequest): error when creating "chk.yaml": ClickHouseKeeperInstallation in version "v1" cannot be handled as a ClickHouseKeeperInstallation: strict decoding error: unknown field "spec.defaults"

Slach commented

@meektechie upgrade your CRDs separately if you use Helm, and upgrade the operator Helm chart to 0.24.0:

  kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/master/deploy/helm/clickhouse-operator/crds/CustomResourceDefinition-clickhouseinstallations.clickhouse.altinity.com.yaml
  kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/master/deploy/helm/clickhouse-operator/crds/CustomResourceDefinition-clickhouseinstallationtemplates.clickhouse.altinity.com.yaml
  kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/master/deploy/helm/clickhouse-operator/crds/CustomResourceDefinition-clickhouseoperatorconfigurations.clickhouse.altinity.com.yaml
  kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/master/deploy/helm/clickhouse-operator/crds/CustomResourceDefinition-clickhousekeeperinstallations.clickhouse-keeper.altinity.com.yaml
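
To confirm the CRDs were actually updated, a generic kubectl check (not operator-specific) is to verify the chk CRD is served at v1:

  kubectl get crd clickhousekeeperinstallations.clickhouse-keeper.altinity.com -o jsonpath='{.spec.versions[*].name}'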

DO NOT use CHK with operator 0.23.7. It was experimental in that release (as stated in the release notes). GA support is in 0.24.x, and it is not compatible with the previous version. See the migration guide; the migration is not trivial: https://github.com/Altinity/clickhouse-operator/blob/master/docs/keeper_migration_from_23_to_24.md

@alex-zaitsev Does the new CRD support pulling private images? We had that facility with the previous CRD.

Slach commented

@meektechie did you try

apiVersion: v1
kind: Secret
metadata:
  name: image-pull-secret
type: kubernetes.io/dockerconfigjson
stringData:
  .dockerconfigjson: |
    {
      "auths": {
        "<registry-url>": {
          "username": "<your-username>",
          "password": "<your-password>",
          "email": "<your-email>",
          "auth": "<base64-encoded-credentials>"
        }
      }
    }

---
apiVersion: clickhouse-keeper.altinity.com/v1
kind: ClickHouseKeeperInstallation
metadata:
  name: custom-image
spec:
  defaults:
    templates:
      podTemplate: private-image
  templates:
    podTemplates:
    - name: private-image
      spec:
        imagePullSecrets:
        - name: image-pull-secret
        containers:
        - name: clickhouse-keeper
          image: your-registry/repo/clickhouse-keeper:tag

?
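
By the way, instead of hand-writing the .dockerconfigjson, the same pull secret can be created with kubectl (all values below are placeholders for your registry and namespace):

  kubectl -n <your-namespace> create secret docker-registry image-pull-secret \
    --docker-server=<registry-url> \
    --docker-username=<your-username> \
    --docker-password=<your-password> \
    --docker-email=<your-email>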

@Slach This is my pod template. With 0.23.7 it was working perfectly, but with 0.24.0 it is not. I went through the CRD, but I could not find it there either:

    podTemplates:
    - metadata:
        creationTimestamp: null
      name: default
      spec:
        containers:
        - image: pvt/clickhouse-keeper:24.3.13.40-alpine
          imagePullPolicy: IfNotPresent
          name: clickhouse-keeper
          resources:
            limits:
              cpu: 500m
              memory: 1Gi
            requests:
              cpu: 200m
              memory: 501Mi
        imagePullSecrets:
        - name: dockerhub

From the STS generated by the ClickHouseKeeperInstallation:

spec:
  containers:
  - env:
    - name: CLICKHOUSE_DATA_DIR
      value: /var/lib/clickhouse-keeper
    image: clickhouse/clickhouse-keeper:latest
    imagePullPolicy: Always

Slach commented

@meektechie try 0.24.1. I checked the manifest and got

  imagePullSecrets:
  - name: image-pull-secret

as expected

@Slach I believe 0.24.0 is the latest release. While trying 0.24.1, it throws an error:

Error: can't get a valid version for repositories altinity-clickhouse-operator. Try changing the version constraint in Chart.yaml

Slach commented

@meektechie

git clone https://github.com/Altinity/clickhouse-operator.git
cd clickhouse-operator
git fetch
git checkout 0.24.1
helm install -n <your-namespace> <your-release-name> ./deploy/helm/clickhouse-operator/
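
If the operator release already exists, the equivalent is a helm upgrade --install with your release name and namespace (placeholders below). Note that Helm only installs the crds/ directory on first install and does not update CRDs on upgrade, so apply the CRD manifests above separately first:

  helm upgrade --install -n <your-namespace> <your-release-name> ./deploy/helm/clickhouse-operator/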

@Slach Thanks for the immediate response. Let me check & update here

@Slach I have deployed 0.24.1, but the clickhouse-keeper pods still get the "clickhouse/clickhouse-keeper:latest" image. Below are the configuration for "chk" and the STS generated from the CR "chk".

kubectl -n ch get chk clickhouse-keeper -o yaml | grep image

    - image: mypvtrepo/clickhouse-keeper:24.3.13.40-alpine
      imagePullPolicy: IfNotPresent

kubectl -n ch get sts chk-clickhouse-keeper-keeper-0-0 -o yaml | grep image
    image: clickhouse/clickhouse-keeper:latest
    imagePullPolicy: Always

chk.yaml

apiVersion: clickhouse-keeper.altinity.com/v1
kind: ClickHouseKeeperInstallation
metadata:
  name: clickhouse-keeper
spec:
  configuration:
    clusters:
    - layout:
        replicasCount: 3
      name: keeper
    settings:
      keeper_server/coordination_settings/raft_logs_level: information
      keeper_server/four_letter_word_white_list: '*'
      keeper_server/raft_configuration/server/port: "9444"
      keeper_server/storage_path: /var/lib/clickhouse-keeper
      keeper_server/tcp_port: "2181"
      listen_host: 0.0.0.0
      logger/console: "true"
      logger/level: trace
      prometheus/asynchronous_metrics: "true"
      prometheus/endpoint: /metrics
      prometheus/events: "true"
      prometheus/metrics: "true"
      prometheus/port: "7000"
      prometheus/status_info: "false"
  defaults:
    templates:
      dataVolumeClaimTemplate: data-volume
  templates:
    podTemplates:
    - name: default
      spec:
        containers:
        - image: mypvtrepo/clickhouse-keeper:24.3.13.40-alpine
          imagePullPolicy: IfNotPresent
          name: clickhouse-keeper
          resources:
            limits:
              cpu: 500m
              memory: 1Gi
            requests:
              cpu: 200m
              memory: 501Mi
        imagePullSecrets:
        - name: dockerhub
    volumeClaimTemplates:
    - name: data-volume
      reclaimPolicy: Retain
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        storageClassName: gp3

Slach commented

try explicitly linking your pod template in defaults.templates:

spec:
  defaults:
    templates:
      podTemplate: default
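
i.e., for the chk.yaml above, a sketch of the defaults section that references both the existing pod template and the data volume template by name (names taken from the manifest above):

  defaults:
    templates:
      podTemplate: default
      dataVolumeClaimTemplate: data-volume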