Spotlight's 100% tracer sampler interferes with health checks

Question

Closed this issue 2 months ago · 6 comments

rassie commented 3 months ago

How do you use Sentry?

Self-hosted/on-premise

Version

2.35.0

Steps to Reproduce

I'm writing this in full awareness of XKCD 1172...

In my environment, we develop our services against a local Kubernetes cluster. The deployment configuration is as close to the production deployment as it can be, so we also use Kubernetes health checks on our services. We use logging filters and Sentry samplers to filter requests going to /health out of logs and traces.

Now we want to expand our Sentry experience with Spotlight. Apparently, #4207 introduced a change to sample at 100% for everything when Spotlight is used in development. Maybe our configuration needs adjusting, but in our case this leads to /health traces overwhelming the Spotlight overlay, and also the logs if set to DEBUG.

Add a primitive endpoint to your application
Add a tracing sampler to your Sentry SDK configuration filtering that endpoint
Enable debug on your Sentry SDK
Add Spotlight to the configuration
Call the endpoint multiple times and observe both the logs and the overlay filling up more and more
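
A minimal sketch of the configuration these steps describe, assuming the Python SDK (`sentry_sdk`) and a WSGI or ASGI application. The `/health` path comes from the report; the helper names `request_path` and `configure_sentry` are hypothetical, and the `wsgi_environ` / `asgi_scope` keys are only present when the corresponding integration fills in the sampling context.

```python
HEALTH_PATH = "/health"


def request_path(sampling_context):
    """Best-effort extraction of the request path from the sampling context."""
    wsgi_environ = sampling_context.get("wsgi_environ") or {}
    asgi_scope = sampling_context.get("asgi_scope") or {}
    return wsgi_environ.get("PATH_INFO") or asgi_scope.get("path") or ""


def traces_sampler(sampling_context):
    """Use sampling as an on/off switch: off for health checks, on otherwise."""
    if request_path(sampling_context) == HEALTH_PATH:
        return 0.0
    return 1.0


def configure_sentry():
    # Imported lazily so the sampler above can be exercised without the SDK.
    import sentry_sdk

    sentry_sdk.init(
        # No DSN is set, matching the development setup in the report.
        traces_sampler=traces_sampler,
        debug=True,
        spotlight=True,
    )
```

According to the report, with this setup the /health transactions still reach the Spotlight overlay and the debug logs, even though the sampler returns 0.0 for them.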

Expected Result

I expect Spotlight to work exactly like Sentry would, i.e. respecting the sampler configuration. It's perfectly possible my expectation is wrong, and I'm ready to add code to my services to make it work the way I expect. From what I can see, that's currently not possible. For example, setting a DSN results in the SDK trying to parse it or connect to it, so a placeholder DSN won't work, and I don't want to use a real DSN in development. Leaving the DSN out results in the described behaviour.

Actual Result

Logs and overlay overflow with useless data.

Answer 1 · 2025-08-18T16:02:54.000Z

PY-1813 Spotlight's 100% tracer sampler interferes with health checks

Answer 2 · 2025-08-25T07:16:39.000Z

Hey @rassie, thanks for raising. @BYK Can you take a look? 🙏🏻

Answer 3 · 2025-08-26T13:08:56.000Z

Hi @rassie, thanks a lot for the detailed issue and also using Spotlight :)

I think Spotlight getting overwhelmed should be a separate issue in its own repo: getsentry/spotlight#912 -- feel free to add more details and follow along there.

Regarding the sampling rate override, your assumption about Spotlight behaving exactly like Sentry unfortunately does not hold :) A sampling rate only makes sense when you deploy your app in a distributed fashion (in terms of users, not necessarily multiple nodes). Locally, if you set a 1% sample rate you'd only get a random 1% of your local transactions, which is very unlikely to be helpful. I can offer you 2 workarounds in the meantime:

Set your DSN to http://spotlight@localhost:8969/0 and don't use spotlight=true. This is an undocumented hack, ref getsentry/spotlight#475
Add a before_send_transaction hook and filter out the healthcheck: https://docs.sentry.io/platforms/python/configuration/filtering/#using-before-send-transaction (the example there is also about excluding health checks)
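
Both workarounds can be sketched roughly as follows, assuming the Python SDK and the `/health` path from the original report. The function names `drop_health_checks`, `configure_with_filter`, and `configure_with_spotlight_dsn` are hypothetical; the DSN value is the one quoted above and, as noted, is an undocumented hack that may stop working.

```python
from urllib.parse import urlparse

HEALTH_PATH = "/health"


def drop_health_checks(event, hint):
    """before_send_transaction hook: returning None discards the transaction."""
    url = (event.get("request") or {}).get("url") or ""
    if urlparse(url).path == HEALTH_PATH:
        return None
    return event


def configure_with_filter():
    # Workaround 2: keep spotlight=True and filter in the hook instead of the sampler.
    import sentry_sdk

    sentry_sdk.init(
        spotlight=True,
        before_send_transaction=drop_health_checks,
    )


def configure_with_spotlight_dsn(traces_sampler):
    # Workaround 1: point the DSN at the local Spotlight sidecar and do NOT
    # pass spotlight=True, so an existing traces_sampler stays in charge.
    import sentry_sdk

    sentry_sdk.init(
        dsn="http://spotlight@localhost:8969/0",
        traces_sampler=traces_sampler,
    )
```

The hook only inspects the request URL recorded on the event, so transactions without request data pass through unchanged.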

Not closing the issue yet as I'm still open to hearing about arguments against turning up the traces sampling rate automatically when Spotlight is turned on and no DSN exists.

One argument we can make is to not override traces_sampler if it is already set as this can be prod/debug aware and remove the need of before_send_transaction. I can look into that but it would "leak" some logic in our own Sentry setup for instance.

Answer 4 · 2025-08-26T13:15:27.000Z

> Sampling rate only makes sense when you deploy your app in a distributed fashion (in terms of users not necessarily multiple nodes). Locally, if you set to 1% sample rate you'd only get a random 1% of your local transactions which is very unlikely to be helpful.

I think we can mostly agree on this -- I'm not really using sample rates between 0.0 and 1.0; it's more of an ON/OFF switch for me: OFF for healthchecks, ON for everything else. I'll look into using before_send_transaction, which could be a good solution, since I don't really care where I filter. In general, there might be a middle-ground solution, like making traces_sampler behave identically in every environment and just clamping the returned values in DEV to 0.0 or 1.0.
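
The clamping idea could look roughly like this in user code. This is a sketch of the proposal only, not something the SDK does; `clamp_for_dev` and `production_sampler` are hypothetical names, and reading the transaction name from `transaction_context` is an assumption about what the sampling context contains.

```python
def clamp_for_dev(sampler):
    """Wrap a traces_sampler so it only ever returns 0.0 or 1.0."""

    def clamped(sampling_context):
        rate = sampler(sampling_context)
        # Any positive rate (or True) becomes 1.0; zero (or False) stays 0.0.
        return 1.0 if float(rate) > 0.0 else 0.0

    return clamped


def production_sampler(sampling_context):
    """Example sampler: never trace health checks, trace 1% of the rest."""
    name = (sampling_context.get("transaction_context") or {}).get("name", "")
    return 0.0 if name.endswith("/health") else 0.01


# In development, the same sampler would be wrapped before being passed
# to the SDK as traces_sampler.
dev_sampler = clamp_for_dev(production_sampler)
```

With this wrapper, the ON/OFF decisions stay the same across environments, while fractional rates are rounded up to full sampling locally.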

Answer 5 · 2025-09-03T21:52:52.000Z

@rassie just following up on this to make sure you are not waiting on us to do anything (or if you are, to clarify the next step 🙂 )

> I'll look into using before_send_transaction, could be a good solution, I don't really care where I filter. In general there might be a middle ground solution like making traces_sampler behave identically in every environment and just clamp the returned values in DEV to 0.0 or 1.0.

Were you able to use before_send_transaction and if yes, was that a good experience?

Answer 6 · 2025-09-15T07:37:35.000Z

I'll close this since there's been no response for some time and there's a dedicated Spotlight issue now -- please follow up there.