kiwicom/structlog-sentry

Duplicate logs in Sentry

Closed this issue · 10 comments

I followed the installation instructions, and now I'm getting Sentry errors duplicated:

  • One by structlog-sentry with proper message
  • One by sentry with the whole event dict as message

I'm probably missing something in the configuration.

Sentry SDK config:

sentry_sdk.init(
        dsn=env.str('SENTRY_DSN'),
        integrations=[DjangoIntegration(), CeleryIntegration()],
        auto_session_tracking=False,
        traces_sample_rate=0.01,
        send_default_pii=True,
        attach_stacktrace=True,
        request_bodies='medium',
        release='1.0.2',
        environment='dev',
    )

1st one (correct):
[image]

2nd one (seems to be captured by Sentry's own logging integration):
[image]

Hi, I think you need to turn off the LoggingIntegration to get rid of duplications by adding LoggingIntegration(event_level=None, level=None) to the list of integrations.

Please read: https://github.com/kiwicom/structlog-sentry#logging-as-json

So this could fix it:

sentry_sdk.init(
        dsn=env.str('SENTRY_DSN'),
        integrations=[
            DjangoIntegration(),
            CeleryIntegration(),
            LoggingIntegration(event_level=None, level=None),
        ],
        auto_session_tracking=False,
        traces_sample_rate=0.01,
        send_default_pii=True,
        attach_stacktrace=True,
        request_bodies='medium',
        release='1.0.2',
        environment='dev',
    )

Thanks for the reply @paveldedik

Do you think this will disable error logs coming from third-party libraries?
Logs emitted through the standard logging module will not go through the structlog processors and could be lost.

If you use the following setup for structlog, it should work even for logs coming from third party libraries: https://www.structlog.org/en/stable/standard-library.html#rendering-using-structlog-based-formatters-within-logging (this is the setup that I use)

Otherwise, you will have to either capture them manually with Sentry or log them with structlog.

I haven't done much testing, though, and I am open to suggestions on how to resolve the duplication of events by both the LoggingIntegration and structlog-sentry. AFAIK Sentry does some kind of event deduplication, but it doesn't work here because the output from JSONRenderer differs from the message captured by structlog-sentry's SentryProcessor. With LoggingIntegration enabled, both are captured.
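To illustrate why the two events do not look alike, here is a minimal stdlib-only sketch. The event dict and its keys are made up for illustration, and json.dumps merely stands in for structlog's JSONRenderer; it makes no claim about how Sentry's deduplication works internally.

```python
import json

# Hypothetical event dict, as structlog would pass it along the processor chain.
event_dict = {"event": "payment failed", "level": "error", "order_id": 42}

# What structlog-sentry reports as the message: the event text only.
processor_message = event_dict["event"]

# What the stdlib logging handler sees after rendering: the whole dict as one
# string. json.dumps is a stand-in for JSONRenderer here.
rendered_message = json.dumps(event_dict)

print(processor_message)
print(rendered_message)
```

Since the two strings differ, any comparison based on the message text would treat them as separate events.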

I will play with it over the next few days and let you know. Thanks again for your support and time :)

I can confirm that the proposed approach of using LoggingIntegration(event_level=None, level=None) fixes the issue for logs generated using structlog, but it also mutes all error/critical logs from the standard library and third-party packages.

A workaround I came up with is to call ignore_logger for the loggers that use structlog, since those go through the processors and structlog-sentry handles them properly.

Example for a Django app:

for app in OWN_APPS:  # a bare generator expression would never run the calls
    ignore_logger(app)

I'm not sure if this is worth mentioning in the README; maybe it depends on the way I set up structlog.

That wasn't a proper fix, because structlog-sentry uses the same ignored-loggers mechanism too, so it no longer triggers any event at all.
The workaround I came up with is a custom LoggingIntegration with its own list of ignored loggers, which forces the Sentry SDK/client to ignore them.

from logging import LogRecord

from sentry_sdk.integrations.logging import LoggingIntegration

_IGNORED_LOGGERS = set()


def ignore_logger(logger_name: str) -> None:
    _IGNORED_LOGGERS.add(logger_name)


class CustomLoggingIntegration(LoggingIntegration):
    def _handle_record(self, record: LogRecord) -> None:
        # This also matches on the top-level logger name, e.g. ignoring "celery"
        # covers "celery.worker" and "celery.worker.job"
        if record.name in _IGNORED_LOGGERS or record.name.split(".")[0] in _IGNORED_LOGGERS:
            return
        super()._handle_record(record)
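Note that the check above only matches a logger's exact name or its first dotted component, so ignoring "celery.worker" would not cover "celery.worker.job". If that matters, a sketch like the following walks every ancestor instead. is_ignored is a hypothetical helper, not part of sentry_sdk or structlog-sentry.

```python
from typing import Set


def is_ignored(logger_name: str, ignored: Set[str]) -> bool:
    """Return True if logger_name or any of its dotted ancestors is in ignored."""
    parts = logger_name.split(".")
    # Build "a", "a.b", "a.b.c", ... and test each prefix against the set.
    return any(".".join(parts[:i]) in ignored for i in range(1, len(parts) + 1))
```

Inside _handle_record, the condition would then become `if is_ignored(record.name, _IGNORED_LOGGERS): return`.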

Sorry for the late reply.

> I'm not sure if this is worth mentioning in the README; maybe it depends on the way I set up structlog.

I think it is. If you want you can send a README update for the workaround you found. I still think it wouldn't be necessary if you used this approach https://www.structlog.org/en/stable/standard-library.html#rendering-using-structlog-based-formatters-within-logging

I'm not sure what to write in the README, though. We can mention this workaround, or we can even add the CustomLoggingIntegration you implemented to the library's code. I have to test it too, but you did a good job on your part, thank you :-)

jws commented

@paveldedik @mounirmesselmeni What about the following to:

  • prevent duplicate sentry errors.
  • maintain breadcrumbs.
  • allow for sentry to pick up logging errors from non-structlog loggers.
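
import logging

import sentry_sdk
import sentry_sdk.integrations.logging
import structlog_sentry

# This processor belongs in the structlog processors list.
# use verbose=True to tag records with whether sentry events were emitted by SentryProcessor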
structlog_sentry.SentryProcessor(level=logging.CRITICAL, event_level=logging.ERROR, verbose=True)

class SentryProcessorFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        if isinstance(record.msg, dict):
            # if SentryProcessor tagged as already sent to sentry, filter this record
            return record.msg.get("sentry") != "sent"
        return True

sentry_logging_integration = sentry_sdk.integrations.logging.LoggingIntegration(level=logging.INFO, event_level=logging.ERROR)
sentry_logging_integration._handler.addFilter(SentryProcessorFilter())

sentry_sdk.init(integrations=[sentry_logging_integration])
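The filter can be checked in isolation with the standard library alone. The sketch below repeats the filter so that it runs on its own and builds records by hand with logging.makeLogRecord. It assumes, as the snippet above does, that structlog hands the event dict to logging as record.msg and that SentryProcessor with verbose=True sets the "sentry" key to "sent"; the sample event dicts are made up.

```python
import logging


class SentryProcessorFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        if isinstance(record.msg, dict):
            # Drop records that the structlog processor already reported.
            return record.msg.get("sentry") != "sent"
        return True


flt = SentryProcessorFilter()

# Hypothetical records: one already reported, one structlog record that was
# not reported, and one plain stdlib record from a third-party library.
already_sent = logging.makeLogRecord({"msg": {"event": "boom", "sentry": "sent"}})
not_sent = logging.makeLogRecord({"msg": {"event": "boom"}})
plain = logging.makeLogRecord({"msg": "third-party error"})

for rec in (already_sent, not_sent, plain):
    print(rec.msg, "->", "kept" if flt.filter(rec) else "filtered")
```

Only the first record is filtered out, so errors from non-structlog loggers would still reach the Sentry handler.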