laserdisc-io/fs2-aws

Remove DynamoDB streaming support

barryoneill opened this issue · 5 comments

In short, the current DynamoDB streaming support in this project is based on the V1 SDK. Unless a replacement is written, the existence of this support makes upgrading dependencies in this project difficult.

In particular, the function parseDynamoRecord exists to support parsing the records produced by DynamoDB.

Problem

  • this client is based on the AWS SDK v1 library
  • parseDynamoRecord function accepts a payload which contains com.amazonaws.services.dynamodbv2.model.AttributeValue attributes (sdk V1)
  • it then uses Scanamo's DynamoObject code to map these to Json
  • Scanamo (via scanamo-circe) cannot be upgraded, as it is now using SDK v2 classes (software.amazon.awssdk.services.dynamodb.model.AttributeValue)

Proposal 1

We shouldn't be bringing in scanamo or scanamo-circe into fs2-aws in the first place, so I propose that parseDynamoRecord and the example that references it be dropped.

Proposal 2

All code relating to event streaming in DynamoDB should be dropped so that com.amazonaws (v1 SDK) is no longer a dependency of this project.

  • Event streaming applications should use the recently added Kinesis streaming functionality to capture change events from DynamoDB, which would use the more robust fs2-aws-kinesis module
  • Applications which use native DynamoDB streaming (e.g. Lambda consumers) should look at the aws-lambda-java-events-sdk-transformer for code which converts the payload.

I prefer something like option 1:
use com.amazonaws.services.dynamodbv2.model.AttributeValue as the output of the stream and let the user choose how to parse it.
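
To make the "let the user choose how to parse" idea concrete, here is a minimal sketch. It is not fs2-aws code: `Attr`, `RawRecords.parseAll` and `UserParser` are hypothetical names, and `Attr` is only a simplified stand-in for the SDK's AttributeValue so the snippet compiles without AWS dependencies.

```scala
// Simplified stand-in for the SDK's AttributeValue; NOT the real AWS class.
sealed trait Attr
final case class S(value: String) extends Attr
final case class N(value: String) extends Attr // DynamoDB transports numbers as strings
final case class BOOL(value: Boolean) extends Attr

object RawRecords {
  type Item = Map[String, Attr]

  // Hypothetical helper: the library hands over raw items untouched,
  // and the caller supplies whatever parser they like.
  def parseAll[A](items: List[Item])(parse: Item => Either[String, A]): List[Either[String, A]] =
    items.map(parse)
}

// Example of a user-owned model and parser, living outside the library.
final case class User(id: String, age: Int)

object UserParser {
  def parse(item: RawRecords.Item): Either[String, User] =
    for {
      id <- item.get("id").collect { case S(v) => v }
              .toRight("missing string attribute: id")
      age <- item.get("age").collect { case N(v) => v }.flatMap(_.toIntOption)
               .toRight("missing or non-integer numeric attribute: age")
    } yield User(id, age)
}
```

With this shape the library needs no JSON or Scanamo dependency at all; a user who wants circe, Scanamo or something hand-rolled plugs it in through the `parse` argument.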

What about removing the streaming support and adding support for parsing DynamoDB events to the fs2-aws-kinesis module? We have implemented this functionality at work; it uses magnolia for automatic typeclass derivation. If you like this solution, I think I can contribute it to this project.
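
For readers unfamiliar with the typeclass approach mentioned above, a rough sketch of the idea follows. This is not the implementation referred to in the comment: `AttrValue`, `AttrDecoder` and `OrderDecoder` are hypothetical names, the attribute type is a simplified stand-in for the SDK class, and the instances are written by hand here, whereas a library such as magnolia would derive the case-class decoder automatically.

```scala
// Simplified stand-in for a DynamoDB attribute value; NOT the real SDK class.
sealed trait AttrValue
final case class StrAttr(value: String) extends AttrValue
final case class NumAttr(value: String) extends AttrValue // numbers arrive as strings

// Typeclass: how to decode one attribute into a Scala value.
trait AttrDecoder[A] {
  def decode(attr: AttrValue): Either[String, A]
}

object AttrDecoder {
  implicit val stringDecoder: AttrDecoder[String] = new AttrDecoder[String] {
    def decode(attr: AttrValue): Either[String, String] = attr match {
      case StrAttr(v) => Right(v)
      case other      => Left(s"expected a string attribute, got $other")
    }
  }

  implicit val intDecoder: AttrDecoder[Int] = new AttrDecoder[Int] {
    def decode(attr: AttrValue): Either[String, Int] = attr match {
      case NumAttr(v) => v.toIntOption.toRight(s"not an integer: $v")
      case other      => Left(s"expected a numeric attribute, got $other")
    }
  }

  // Look up a named attribute and decode it with the instance in scope.
  def field[A](item: Map[String, AttrValue], name: String)(
      implicit d: AttrDecoder[A]): Either[String, A] =
    item.get(name).toRight(s"missing attribute: $name").flatMap(d.decode)
}

final case class Order(id: String, quantity: Int)

// Hand-written record decoder; this is the part derivation would generate.
object OrderDecoder {
  def fromItem(item: Map[String, AttrValue]): Either[String, Order] =
    for {
      id       <- AttrDecoder.field[String](item, "id")
      quantity <- AttrDecoder.field[Int](item, "quantity")
    } yield Order(id, quantity)
}
```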

Streaming from DDB streams and streaming through Kinesis (which DDB also supports) are two different ways of streaming DDB data, and they do not behave the same way under load.
With Kinesis, the user is responsible for scaling; in other words, the volume of DDB change events is not tied to the number of shards in your stream. Under heavy load you have to think carefully about how to scale your Kinesis stream. You might not run into this in your use case, but some users may.
DDB streams, on the other hand, are scaled according to the DDB table partitions.
I would love to drop DDB streams support because of this V1-V2 AWS SDK saga, but I prefer to keep it for this niche of users.
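
As a rough illustration of the scaling point, the sketch below estimates a minimum shard count from the write rate. It assumes the documented per-shard ingest limits of a provisioned Kinesis data stream (1,000 records per second and 1 MB per second, treated here as 1 MiB); `ShardEstimate` and `minShards` are hypothetical names, and real sizing should also leave headroom for hot partition keys and bursts.

```scala
object ShardEstimate {
  // Assumed per-shard write limits for a provisioned Kinesis data stream.
  val MaxRecordsPerSecond: Long = 1000L
  val MaxBytesPerSecond: Long   = 1024L * 1024L // 1 MB/s, treated as 1 MiB

  // Integer division rounding up, for non-negative a and positive b.
  private def ceilDiv(a: Long, b: Long): Long = (a + b - 1) / b

  // Smallest shard count that satisfies both the record-count and byte limits.
  def minShards(recordsPerSecond: Long, avgRecordBytes: Long): Long = {
    val byCount = ceilDiv(recordsPerSecond, MaxRecordsPerSecond)
    val byBytes = ceilDiv(recordsPerSecond * avgRecordBytes, MaxBytesPerSecond)
    math.max(1L, math.max(byCount, byBytes))
  }
}
```

Nothing in the table's own partitioning drives this number, which is why it has to be watched separately when change events are routed through Kinesis.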

@barryoneill I removed scanamo from the DDB streaming. Can we close this one?

Nice one @semenodm - closing!