Altinity/clickhouse-operator

ClickHouseKeeper example does not work

Closed this issue · 15 comments

I'm unable to use the https://github.com/Altinity/clickhouse-operator/blob/master/docs/chk-examples/02-extended-3-nodes.yaml example to produce a working ClickHouse Keeper cluster.

The issues I've bumped into:

  • no dataVolumeClaimTemplate specified (fixed here: 0374837)
  • when clickhouse keeper starts on each of the pods, it expects to find an SSL cert
    Poco::Exception. Code: 1000, e.code() = 0, SSL context exception: Error loading private key from file /etc/clickhouse-keeper/server.key: error:80000002:system library::No such file or directory (version 24.10.1.2812 (official build))
    
  • When the ClickhouseKeeperInstallation resource is created, two (of the three) stateful sets are created at the same time. Once they are both "ready", the 3rd is created. When the 3rd is created, the PVC bound / attached to the 1st stateful set of the keeper goes into a "Terminating" state (and stays that way indefinitely)
    • I can see a Delete PVC: ... log (from worker-deleter.go:114 of operator v0.24.0) indicating that the operator is trying to delete the PVC
    • my volumeClaimTemplate has reclaimPolicy: Retain
  • If the ClickhouseKeeperInstallation is deleted, the PVC that was in a "Terminating" state is also deleted; however, the other two remain in place (and stay "Bound")

Also note:

  • podAntiAffinity is set up with app label, but the pod template doesn't include such a label (doesn't prevent the keeper from working, but it's incorrect)

BTW I found that the issue with the PVC getting set to Terminating can be fixed by using provisioner: Operator.
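For anyone else hitting this, here is a minimal sketch of that workaround. It assumes `provisioner` is set on the volume claim template next to `reclaimPolicy`, as in my setup; the resource name and storage size are placeholders, and the cluster layout and pod template from the example are omitted, so check the fragment against the CRD of your operator version before applying it.

```yaml
# Illustrative fragment only: name and storage size are placeholders,
# and the cluster configuration / pod templates are left out.
apiVersion: "clickhouse-keeper.altinity.com/v1"
kind: "ClickHouseKeeperInstallation"
metadata:
  name: keeper-example            # placeholder name
spec:
  defaults:
    templates:
      dataVolumeClaimTemplate: default
  templates:
    volumeClaimTemplates:
      - name: default
        provisioner: Operator     # assumed field placement; default is StatefulSet
        reclaimPolicy: Retain
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi       # placeholder size
```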

Do you have any documentation around why the StatefulSet is the default provisioner, and when it should / should not be used?

Slach commented

when clickhouse keeper starts on each of the pods, it expects to find an SSL cert

@alex-zaitsev the root cause is

/etc/clickhouse-operator/chk/keeper_config.d/01-keeper-01-default-config.xml in the clickhouse-operator config, which contains an <openSSL> section.
Will fix it.

Slach commented

When the ClickhouseKeeperInstallation resource is created, two (of the three) stateful sets are created at the same time.

@janeklb could you check this behavior in your environment in 0.24.1 ?

Slach commented

podAntiAffinity is set up with app label, but the pod template doesn't include such a label (doesn't prevent the keeper from working, but it's incorrect)

@janeklb fixed in 0a4ebb1

thanks for reporting

When the ClickhouseKeeperInstallation resource is created, two (of the three) stateful sets are created at the same time.

@janeklb could you check this behavior in your environment in 0.24.1 ?

I will give this a try tomorrow and let you know.
For my own curiosity, could you please point at the commit(s) that you think might have addressed this problem?

Have you released 0.24.1? I don't see a tag for it, and so I'm not able to install the helm chart.

Slach commented
git clone https://github.com/Altinity/clickhouse-operator.git
cd clickhouse-operator
git fetch
git checkout 0.24.1
OPERATOR_NAMESPACE=your-namespace ./deploy/operator/clickhouse-operator-install.sh

We use flux and a helm release, so this is not something I want to do in our system at the moment. Can you create a temporary tag, at least? Maybe 0.24.1rc-1 or something?

Or I guess it would be release-0.24.1rc-1

Slach commented

@janeklb

git clone https://github.com/Altinity/clickhouse-operator.git
cd clickhouse-operator
git fetch
git checkout 0.24.1
helm install -n <your-namespace> <your-release-name> ./deploy/helm/clickhouse-operator/

Ok -- I can confirm that using 0.24.1 allows you to re-use PVCs without specifying provisioner: Operator on the CHK installation (whereas before, if you made a change to the CHK installation, new PVCs would be created for each pod and the operator would try to delete the old ones)

Can you please describe why one might want to use Operator as a provisioner instead of the default StatefulSet?

@janeklb thanks for reporting, @Slach thanks for fixing. I'm new to ClickHouse/Altinity and have been learning for the past few days. Today I finally started on my K8s dev cluster and ran into this issue.

Perfect timing ;), tried 0.24.1 and it's working fine.

Regards,
Victor

Slach commented

Can you please describe why one might want to use Operator as a provisioner instead of the default StatefulSet?

It's just more flexible.
For example, it allows PVC resizing without re-creating pods in EKS.

Thanks @Slach. Are the additional features provided by using the operator as the provisioner documented anywhere?