aws/aws-xray-sdk-node

S3 event notifications not propagating trace headers to SQS destination

Opened this issue · 8 comments

I am working on an application where we want to integrate X-Ray traces.

The flow of services looks like:
API Gateway -> Lambda -> S3 -> SQS -> Lambda -> API Gateway

X-Ray tracing from API Gateway to Lambda works fine.

To integrate S3, I instrumented the Node.js code as follows:

s3 = AWSXRay.captureAWSClient(new S3({ apiVersion: '2006-03-01' }));

S3 now appears in the CloudWatch ServiceLens trace details.

The same S3 bucket is configured to send S3 event notifications to SQS. According to the X-Ray docs, S3 passes tracing headers down to downstream services such as SQS. However, this does not appear to be working: SQS does not show up in the trace details, and there is nothing beyond S3. I also could not find any tracing data in the S3 object metadata or the SQS message attributes, which is where I would expect it to be propagated, although that is my assumption and is not mentioned in the docs.

Hey @afayes ,

Based on the documentation, S3 propagates the X-Ray trace header through an HTTP header. Can you try parsing the X-Amzn-Trace-Id header on the SQS side and manually propagating it to downstream services to see if that works?
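To illustrate what parsing the header could involve, here is a minimal sketch. It assumes the header value uses the `Root=...;Parent=...;Sampled=...` format shown in the X-Ray documentation. `parseTraceHeader` is a hypothetical helper written for this example and is not part of the X-Ray SDK.

```javascript
// Hypothetical helper (not part of the X-Ray SDK): split a trace header of the
// form "Root=1-...;Parent=...;Sampled=1" into its key/value fields.
function parseTraceHeader(header) {
  const fields = {};
  if (typeof header !== 'string') return fields;
  for (const part of header.split(';')) {
    const idx = part.indexOf('=');
    if (idx === -1) continue; // skip fragments without a key=value shape
    const key = part.slice(0, idx).trim();
    const value = part.slice(idx + 1).trim();
    if (key) fields[key] = value;
  }
  return fields;
}

// Example value in the format used by the X-Ray documentation.
const parsed = parseTraceHeader(
  'Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1'
);
console.log(parsed.Root, parsed.Parent, parsed.Sampled);
```

The parsed `Root` and `Parent` values are what a downstream service would need in order to continue the same trace.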

@lupengamzn I agree that based on the doc S3 will propagate the X-Ray trace header to SQS.

I cannot manually parse the X-Amzn-Trace-Id on SQS because the SQS queue is configured as an S3 event notification destination. As far as I know there is no way to intercept that flow unless I put a Lambda in between, which is not an option in this project.

Could it be that the SQS queues were created before the functionality was available for S3 to SQS X-Ray integration?

Another question: where can I see the trace headers in the S3 object? Are they supposed to be stored as message attributes? If so, that is not happening.

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs in next 7 days. Thank you for your contributions.

Has there been any investigation on this?

Hi @afayes,

You couldn't see the downstream SQS queue because SQS and S3 are only integrated with X-Ray to propagate trace context through their systems; neither service actually emits segment data. You can see the S3 node because the upstream Lambda function emits it as a subsegment, but since neither S3 nor SQS emits any data to represent the SQS queue, the queue cannot show up on your service map.

To add on to what @willarmiros said, X-Ray is able to infer the S3 node because Lambda (which is actively sending traces to X-Ray) knows that it called S3 as the next downstream service. It does not know that S3 then calls SQS, so it cannot infer the SQS node.

Only once the request is back at API Gateway are traces again actively sent to X-Ray, and the segment sent at that point can be connected to the previously inferred S3 node.

According to the X-Ray docs for S3, S3 integrates with SQS downstream. To quote the docs:

If a service traces requests by using the X-Ray SDK, Amazon S3 can send the tracing headers to downstream event subscribers such as AWS Lambda, Amazon SQS

To quote the docs again:

With the Amazon S3 notification feature, you receive notifications when certain events happen in your bucket. These notifications can then be propagated to the following destinations within your application:

Amazon Simple Notification Service (Amazon SNS)

Amazon Simple Queue Service (Amazon SQS)

Based on this, one would expect tracing from S3 event notifications to be integrated with SQS. It is also not possible to intercept the flow between S3 event notifications and SQS with manual logic, so we can't add the integration ourselves. X-Ray does not look fit for production use, as it does not support common use cases. Tracing is one of those things that has to work across all services to be useful; partial support for a few services here and there is not. Using third-party tracing tools like Zipkin isn't possible either, since we can't intercept the flow between certain service interactions.

That is a fair callout @afayes, I've updated the documentation to be more clear. The context is propagated from S3 -> SQS to be read by a consumer on the other end, but it is disappointing that SQS cannot yet be seen on the trace.
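As a rough sketch of what reading the context on the consumer side might look like in an SQS-triggered Lambda: this assumes the propagated context arrives as the `AWSTraceHeader` message system attribute, which Lambda exposes under `attributes` on each record. Whether S3 event notifications populate it in a given setup is not confirmed in this thread. `extractTraceHeaders` and `handler` are illustrative names.

```javascript
// Collect the trace header (if any) from each record of an SQS Lambda event.
// Assumption: trace context, when present, is in the AWSTraceHeader system attribute.
function extractTraceHeaders(event) {
  const records = (event && event.Records) || [];
  return records.map((record) => ({
    messageId: record.messageId,
    // undefined when no trace context was attached to the message
    traceHeader: record.attributes ? record.attributes.AWSTraceHeader : undefined,
  }));
}

// Illustrative handler: log what arrived so it can be checked or passed downstream.
// In a real Lambda this function would be exported as the handler.
async function handler(event) {
  for (const { messageId, traceHeader } of extractTraceHeaders(event)) {
    console.log(`message ${messageId} trace header: ${traceHeader || 'none'}`);
  }
}
```

Logging the attribute this way is a quick check of whether any context reaches the queue, even though SQS itself does not appear as a node on the trace.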