airbnb/streamalert

Classifier Lambda not able to recognize kinesis aggregated logs

vynu opened this issue · 0 comments

vynu commented

Background

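Kinesis aggregation format: https://github.com/awslabs/amazon-kinesis-producer/blob/master/aggregation-format.md

The Kinesis Producer Library (KPL) uses aggregation to pack multiple user records into a single Kinesis data stream record, which makes puts more efficient and helps avoid throttling.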

Description

The classifier Lambda exits with errors such as invalid JSON. After careful observation, the problem turned out to be aggregated records coming from Kinesis.

KPL uses Google protocol buffers (protobuf) to create a binary file format for this.
NOTE: The Amazon Kinesis Client Library (KCL) implements deaggregation based on this format on the consumer side.

Base64 decoding in the classifier does not yield the expected JSON, because the decoded payload is protobuf-encoded.
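To illustrate why decoding fails, here is a minimal sketch of how an aggregated record could be recognized. Per the aggregation format linked above, an aggregated payload starts with the 4-byte magic number 0xF3 0x89 0x9A 0xC2 and ends with a 16-byte MD5 checksum of the protobuf message in between. The function name `is_kpl_aggregated` is hypothetical and is not part of StreamAlert.

```python
import base64
import hashlib

# Magic prefix and checksum length from the KPL aggregation format document.
KPL_MAGIC = b"\xf3\x89\x9a\xc2"
MD5_LEN = 16


def is_kpl_aggregated(b64_data):
    """Return True if a base64-encoded Kinesis payload looks KPL-aggregated.

    Hypothetical helper for illustration only.
    """
    raw = base64.b64decode(b64_data)
    # Need the magic number, a non-empty protobuf body, and the checksum.
    if len(raw) <= len(KPL_MAGIC) + MD5_LEN or not raw.startswith(KPL_MAGIC):
        return False
    body = raw[len(KPL_MAGIC):-MD5_LEN]
    # The trailing 16 bytes are the MD5 digest of the protobuf body.
    return hashlib.md5(body).digest() == raw[-MD5_LEN:]
```

A plain JSON payload fails the magic number check, while a KPL payload passes it and cannot be parsed as JSON until it is de-aggregated.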

Steps to Reproduce

Feed the classifier with KPL-generated (aggregated) logs.

Desired Change

De-aggregation solution:
https://github.com/awslabs/kinesis-aggregation/tree/master/python

The aws_kinesis_agg package (pip install aws_kinesis_agg) can be used to de-aggregate the records.
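A rough sketch of what this could look like, assuming the records arrive in the usual Lambda Kinesis event shape and the user records are JSON. `deaggregate_records` is the function provided by `aws_kinesis_agg.deaggregator`; the wrapper `load_json_records` and its `deaggregate` parameter are hypothetical and only there so the de-aggregation step can be swapped out. This is not how the classifier is actually structured.

```python
import base64
import json


def load_json_records(event, deaggregate=None):
    """Expand any KPL-aggregated records, then parse each payload as JSON.

    Hypothetical wrapper for illustration; `deaggregate` defaults to
    aws_kinesis_agg's deaggregate_records.
    """
    if deaggregate is None:
        # Requires: pip install aws_kinesis_agg
        from aws_kinesis_agg.deaggregator import deaggregate_records as deaggregate

    # Returns one Lambda-style record per user record; records that were
    # never aggregated are passed through.
    user_records = deaggregate(event["Records"])

    parsed = []
    for record in user_records:
        # Each user record's data is still base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        parsed.append(json.loads(payload))
    return parsed
```

Running the de-aggregation step before the existing base64 decode and JSON parse would let the rest of the classifier logic stay unchanged.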
