[Guide] RedPanda
Paultagoras opened this issue · 3 comments
We should have a detailed guide for setting up the connector with a RedPanda environment.
I'm currently trying to use this connector in a Self-Hosted Redpanda Kubernetes Setup.
I'm installing the connector by extracting the released ZIP file in an init container that I added to the native Deployment/redpanda-connectors.
Using this approach, Redpanda's Connectors workload recognizes the ClickHouse Connector.
But there are problems with missing classes when trying to configure it using Avro:
Class io.confluent.connect.avro.AvroConverter could not be found
Once I provided the necessary JARs, I was able to instantiate the connector, but at runtime I got:
org.apache.kafka.connect.errors.ConnectException: Instantiation error
at org.apache.kafka.connect.runtime.isolation.Plugins.newPlugin(Plugins.java:85)
at org.apache.kafka.connect.runtime.isolation.Plugins.newConverter(Plugins.java:327)
at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:622)
at org.apache.kafka.connect.runtime.Worker.startSinkTask(Worker.java:525)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:1800)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.lambda$getTaskStartingCallable$32(DistributedHerder.java:1850)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.lang.NoClassDefFoundError: io/confluent/kafka/schemaregistry/client/SchemaRegistryClient
at java.base/java.lang.Class.getDeclaredConstructors0(Native Method)
at java.base/java.lang.Class.privateGetDeclaredConstructors(Class.java:3373)
at java.base/java.lang.Class.getConstructor0(Class.java:3578)
at java.base/java.lang.Class.getDeclaredConstructor(Class.java:2754)
at org.apache.kafka.common.utils.Utils.newInstance(Utils.java:396)
at org.apache.kafka.connect.runtime.isolation.Plugins.newPlugin(Plugins.java:83)
... 9 more
Caused by: java.lang.ClassNotFoundException: io.confluent.kafka.schemaregistry.client.SchemaRegistryClient
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:445)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:592)
at org.apache.kafka.connect.runtime.isolation.PluginClassLoader.loadClass(PluginClassLoader.java:136)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
... 15 more
Would you consider also publishing a fat jar into releases, that would contain all the needed classes for this to work without users needing to fetch multiple modules separately?
We'll certainly consider it. We're not using anything specific to Confluent in the connector itself, though; the referenced JAR file is meant to work with the Confluent Schema Registry.
What I've seen work before is to include something more like https://mvnrepository.com/artifact/io.confluent/kafka-avro-serializer/7.6.0 (though I forget the specific dependency) - either way, we'll discuss it internally!
Just to let you know, the JAR package that I linked previously is all you need apart from the pre-built release.
The problem behind the exception I shared was that Confluent's JARs need to be added to the classpath. Putting them only on the CONNECT_PLUGIN_PATH wasn't enough.
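When chasing a ClassNotFoundException like the one above, it can help to check which JAR, if any, actually ships the missing class. The following is a rough sketch rather than anything from the connector docs: `class_entry` and `find_class_jar` are hypothetical helper names, and it assumes `unzip` is installed and that all JARs sit in one flat directory such as /custom-connectors.

```shell
# Convert a dotted class name into the path it has inside a JAR.
class_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Print every JAR in directory $1 that contains class $2.
# Hypothetical helper; assumes `unzip` is installed.
find_class_jar() {
  dir="$1"
  entry="$(class_entry "$2")"
  for jar in "$dir"/*.jar; do
    [ -f "$jar" ] || continue   # skip the literal pattern when no JARs match
    if unzip -l "$jar" 2>/dev/null | grep -qF "$entry"; then
      printf '%s\n' "$jar"
    fi
  done
}
```

For example, `find_class_jar /custom-connectors io.confluent.kafka.schemaregistry.client.SchemaRegistryClient` lists the JARs that contain the class from the stack trace. If it prints nothing, the class is not in that directory at all; if it prints a JAR, the remaining question is whether that JAR is on the JVM classpath and not only on the plugin path.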
If you want to experiment with setting up Redpanda Connectors in Kubernetes, here's my approach. It uses post renderers on Flux's HelmRelease to install the connector through an initContainer, so I don't have to backport my changes to new Redpanda releases:
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: redpanda
spec:
  interval: 1h
  url: https://charts.redpanda.com
---
apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: redpanda
  namespace: redpanda
spec:
  chart:
    spec:
      chart: redpanda
      interval: 12h
      reconcileStrategy: ChartVersion
      sourceRef:
        kind: HelmRepository
        name: redpanda
        namespace: redpanda
      version: '*'
  interval: 30s
  timeout: 10m0s
  values:
    # Values ref: https://github.com/redpanda-data/helm-charts/blob/main/charts/redpanda/values.yaml
    connectors:
      enabled: true
      deployment:
        replicas: 2
  postRenderers:
    - kustomize:
        patches:
          - patch: |
              - op: add
                path: /spec/template/spec/volumes/-
                value:
                  - name: custom-connectors
                    emptyDir: {}
              - op: add
                path: /spec/template/spec/containers/0/volumeMounts/-
                value:
                  - name: custom-connectors
                    mountPath: /custom-connectors
              - op: add
                path: /spec/template/spec/containers/0/env/-
                value:
                  - name: CONNECT_PLUGIN_PATH
                    value: /custom-connectors
              - op: add
                path: /spec/template/spec/containers/0/env/-
                value:
                  - name: CLASSPATH
                    value: /custom-connectors/*
              - op: add
                path: /spec/template/spec/initContainers
                value:
                  - name: custom-connectors
                    image: alpine
                    env:
                      - name: CH_VERSION
                        value: v1.0.16
                      - name: AVRO_CONVERTER_VERSION
                        value: 7.6.0
                    command:
                      - sh
                      - '-c'
                      - |
                        cd /tmp
                        wget https://github.com/ClickHouse/clickhouse-kafka-connect/releases/download/${CH_VERSION}/clickhouse-kafka-connect-${CH_VERSION}.zip
                        unzip -j clickhouse-kafka-connect-${CH_VERSION}.zip '*.jar' -d /custom-connectors
                        wget https://d1i4a15mxbxib1.cloudfront.net/api/plugins/confluentinc/kafka-connect-avro-converter/versions/${AVRO_CONVERTER_VERSION}/confluentinc-kafka-connect-avro-converter-${AVRO_CONVERTER_VERSION}.zip
                        unzip -j confluentinc-kafka-connect-avro-converter-${AVRO_CONVERTER_VERSION}.zip '*.jar' -d /custom-connectors
                    volumeMounts:
                      - name: custom-connectors
                        mountPath: /custom-connectors
            target:
              kind: Deployment
              name: redpanda-connectors
An alternative approach would be to build a custom Docker image that includes all the necessary JARs.
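I haven't built such an image myself, so treat the following as a sketch under assumptions: `render_dockerfile` is a hypothetical helper, the base image is a placeholder you would replace with whatever image Deployment/redpanda-connectors currently runs, and the JARs are assumed to be already extracted into a local directory inside the build context. It reuses the same /custom-connectors path and the same two environment variables as the init container approach.

```shell
# Print a Dockerfile that layers pre-downloaded JARs onto a Connect image.
# Hypothetical helper: $1 is the base image (placeholder, not a verified name),
# $2 is a directory in the build context that holds the extracted JARs.
render_dockerfile() {
  base_image="$1"
  jar_dir="$2"
  cat <<EOF
FROM ${base_image}
COPY ${jar_dir}/ /custom-connectors/
ENV CONNECT_PLUGIN_PATH=/custom-connectors
ENV CLASSPATH=/custom-connectors/*
EOF
}
```

Something like `render_dockerfile "$BASE_IMAGE" jars > Dockerfile` followed by `docker build -t my-connectors .` would then produce the image, where `BASE_IMAGE` and the `my-connectors` tag are again placeholders. The trade-off compared to the init container is that you have to rebuild the image for every Redpanda Connectors release.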