Aiven-Open/transforms-for-apache-kafka-connect

Add Hash SMT

Closed this issue · 10 comments

It's often useful to replace a string value with its hash, for example to anonymize a record. We need to add a Hash SMT for this.

Here's a draft of the documentation that specifies it:

Hash

This transformation replaces a string value with its hash.

The transformation can hash either the whole key or value (in which case it must have STRING type) or a field in them (in which case the key or value must have STRUCT type and the field's value must be STRING).

It exists in two variants:

  • io.aiven.kafka.connect.transforms.Hash$Key - works on keys;
  • io.aiven.kafka.connect.transforms.Hash$Value - works on values.

The transformation defines the following configurations:

  • field.name - The name of the field whose value should be hashed. If null or empty, the entire key or value is used (and assumed to be a string). Defaults to null.
  • function - The name of the hash function to use. The supported values are: md5, sha1.

Here is an example of this transformation configuration:

transforms=HashEmail
transforms.HashEmail.type=io.aiven.kafka.connect.transforms.Hash$Value
transforms.HashEmail.field.name=email
transforms.HashEmail.function=sha1
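
To make the `function` setting concrete, here is a minimal Java sketch of the hashing step only, not the SMT itself. It assumes things the draft does not specify: that the string is encoded as UTF-8 before hashing, that the result is rendered as lowercase hex, and that `md5` and `sha1` map to the JCA algorithm names `MD5` and `SHA-1`. The class and method names (`HashSketch`, `algorithmFor`, `hash`) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the proposed SMT's API.
final class HashSketch {

    // Maps the draft's "function" values to JCA algorithm names (assumed mapping).
    static String algorithmFor(final String function) {
        switch (function) {
            case "md5":
                return "MD5";
            case "sha1":
                return "SHA-1";
            default:
                throw new IllegalArgumentException("Unsupported hash function: " + function);
        }
    }

    // Hashes the UTF-8 bytes of the value and returns lowercase hex
    // (both the encoding and the output format are assumptions).
    static String hash(final String function, final String value) {
        try {
            final MessageDigest digest = MessageDigest.getInstance(algorithmFor(function));
            final byte[] bytes = digest.digest(value.getBytes(StandardCharsets.UTF_8));
            final StringBuilder hex = new StringBuilder(bytes.length * 2);
            for (final byte b : bytes) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (final NoSuchAlgorithmException e) {
            // MD5 and SHA-1 are required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(final String[] args) {
        // With function=sha1, the "email" field value would be replaced by something like this.
        System.out.println(hash("sha1", "user@example.com"));
    }
}
```

In an actual transformation, this step would be applied either to the whole key or value, or to the field named by `field.name`, depending on the configuration.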

I'd be interested in contributing this if no one is working on it.

@brbrown25 Hi Brandon
Your contribution would be most welcome! Please go for it.

Awesome, I'll start taking a crack at this today!

@brbrown25 any progress on this? Let me know if you don't have time right now, so maybe I could help out as well in the coming weeks.

@nick-zh I just have to write the unit tests and hope to have some time later today to knock that out!

@brbrown25 during an online session with Confluent I mentioned this SMT (since it would be nice if Confluent Cloud offered SMTs in the future as well), and their input was that this would be a great SMT for the Kafka project itself.
If you find time, I think it would be nice to create a PR to add it here:
https://github.com/apache/kafka/tree/trunk/connect/transforms/src/main/java/org/apache/kafka/connect/transforms

@nick-zh I will definitely submit that. Should have some free time this morning to do so!

@brbrown25 nice 🎉