tuplejump/kafka-connect-cassandra

DataConverter should be configurable so as to allow conversion of any type of ValueSchema to CQL query

aaruna opened this issue · 4 comments

Right now, DataConverter.sinkRecordToQuery(r, configProperties) assumes a one-to-one mapping between the fields in the ValueSchema of a SinkRecord and the columns of the CQL table.

I ran into this while trying to use kafka-connect-twitter as a source: it emits records with its own schema, and the sink table might have a different schema.

@helena Is this a valid use-case?

Hi @aaruna, I believe the PR I'm working on in another branch actually resolves this.
My changes read the table schema from Cassandra on startup and parse records based on that, not on the fields from Kafka. Great question and insight :) I will add you to my PR so you can check it too.
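
As a rough illustration of one reading of this idea (not the code from that PR): the table's column list drives the conversion, and only record fields that match a column are written. The class `TableDrivenConverter`, the method `columnsToWrite`, and the plain `List`/`Map` inputs are hypothetical stand-ins; in the connector, the column list would come from Cassandra metadata and the fields from the SinkRecord.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not part of kafka-connect-cassandra.
final class TableDrivenConverter {

    // tableColumns stands in for the schema read from Cassandra at startup.
    // Returns the table columns, in table order, that the record has a value for.
    static List<String> columnsToWrite(List<String> tableColumns,
                                       Map<String, Object> recordFields) {
        List<String> result = new ArrayList<>();
        for (String column : tableColumns) {
            if (recordFields.containsKey(column)) {
                result.add(column);
            }
        }
        return result;
    }
}
```

Under this sketch, record fields with no same-named column are simply ignored, so a renamed field would still need some explicit mapping.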

Shiti commented

@helena I discussed this with @aaruna yesterday. We agreed that an optional map, with the component name as key and the corresponding Cassandra column name as value, would be useful.
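
A minimal sketch of what such an optional map could look like, assuming record fields arrive as an ordered map and that unmapped fields keep their own name as the column name. `FieldColumnMapping`, `toInsert`, and the `ks.tweets` table are hypothetical names for illustration only, not the connector's API.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch, not part of kafka-connect-cassandra.
final class FieldColumnMapping {

    // Builds a parameterized INSERT statement. A field listed in fieldToColumn
    // is written to the mapped column; any other field keeps its own name.
    static String toInsert(String table,
                           Map<String, Object> fields,
                           Map<String, String> fieldToColumn) {
        StringJoiner columns = new StringJoiner(", ");
        StringJoiner markers = new StringJoiner(", ");
        for (String field : fields.keySet()) {
            columns.add(fieldToColumn.getOrDefault(field, field));
            markers.add("?");
        }
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + markers + ")";
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("id", 42L);
        fields.put("text", "hello");
        // Map the record field "text" to the table column "body".
        Map<String, String> mapping = Collections.singletonMap("text", "body");
        System.out.println(toInsert("ks.tweets", fields, mapping));
    }
}
```

With an empty map the sketch falls back to the current one-to-one behaviour, which is what keeps the mapping optional.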

@helena I have started working on that, taking the approach @Shiti mentioned in the previous comment, as I need it in order to use kafka-connect-twitter as a source.
Is there a branch I can look at to see whether your approach solves this problem?

I have raised PR #18 to fix this issue.
It is a WIP. I still need to add a few more tests and fix a scalastyle issue.