serilog-contrib/Serilog.Enrichers.Sensitive

Exceptions are not masked

Opened this issue · 16 comments

This is not something that can be easily fixed in an enricher, but would require changes to the sinks being used.
Exceptions are immutable but might contain sensitive data anywhere in the message or stack trace.
It might be possible to recursively mask exceptions using reflection, but it doesn't sound like a good or safe solution.
What I ended up doing was to simply rewrite the sinks I use to have them mask the serialized LogEvents.

As an example, a database constraint might contain a value, e.g. an email address, which, when violated, isn't handled by this enricher.

Any ideas?

I’ve not implemented masking on exceptions pretty much by design.

Exceptions should never contain sensitive data, purely because they can be serialised into output or logs.

Additionally exceptions are very free form (within the boundaries of an exception) as they can contain custom properties, the Data collection can contain all sorts of things, there can be inner exceptions, lots of things.

Apart from the immutability of exceptions you can imagine that there are a ton of edge cases to deal with and it’s impossible to say they can ever be covered at all.

There is an option to say: “always mask exceptions” and have that operate as “remove the exception value from the LogEvent“, possibly replacing it with an exception with only the stack trace.

Let me know if that would be something.

Exceptions should never contain sensitive data, purely because they can be serialised into output or logs.

As I explained, that is just not always the case.
Anyone using this enricher might end up in the same situation as I did, where a SQL constraint violation contained personal information in the exception message. In my case, that was an email address, but it could also be something else.

I'm not really suggesting anything with this issue, I'm just raising it as a potential problem users should be aware of.

What I did was to implement custom sinks that would perform the masking instead of an enricher.
Serilog has no way of supporting this, unless exceptions are transformed into a different format before being passed on to the sink.

I did consider whether the enricher could replace the exception with something that did the masking, but that would lose a rather significant part of the log message.

That’s why I wrote should. The whole purpose of this enricher is to deal with situations where something sensitive is logged when it really shouldn’t have.

Still, the fact that the exception isn’t masked currently can be unexpected to users of this enricher and ideally should be addressed in some way.
Documentation is not that way because it’s non-obvious if you pull in this a package and use it. Therefore the enricher must provide a better way that deals with the assumption a user may have that anything sensitive is masked by default (I know, how would the enricher know what a password looks like? 🤷‍♂️)

Yeah, I understand. My case was related to emails only, as passwords and other potential sensitive information is, as you say, impossible to identify.

Ideally, Serilog should provide an exception abstraction which enrichers would have access to and transform, just like the other properties. I'm not aware of any work in that area and I doubt the scenario is important enough to implement.

It is also possible to use reflection, but I wouldn't go down that path as it's simply too risky modifying exceptions, especially from other libraries where the behavior can be unpredictable.

In my case it is about ensuring GDPR and not log potential PII, so I need to mask everything.
It's part of a larger framework where I'm not in control of the data being logged, so I need to ensure nothing obvious leaks to the destination.

I don't have any ideas on how to solve this, but if it's not documented as a minimum, it might bite some users of the library.

I might take a look at this at some point when I’ve got a bit of time to blow on it.

This has given me some thoughts about the approach of the enricher. Perhaps the way it works now, while it does work, is actually wrong and this should be either in a sink that acts like AggregateSink but in reverse.

I think it might be time to reach out to @nblumhardt and get his views on this

Interestingly, this was my conclusion as well after working on it for a while. Enrichment just doesn't seem to be the correct solution to this rather substantial problem.
Unfortunately there just isn't any abstraction of the LogEvent so there is no way to do this in a generic fashion. For instance, if you look at how the console sink works, it writes directly to Console.Out without any interception options.

For reference, we also use the Application Insights sink, but because of the same fundamental problems, we had to resort to reflection and intercept and mask the actual JSON payload sent to Application Insights.
This is far from ideal, and it would of course be preferred with a supported solution, but it works.

Thank you for investing time in identifying this problem, which I think only gets more important in the future.

Hi! Currently on limited time so please forgive the brevity of my reply. A wrapper sink can be used to achieve what you want, here, although there are a few moving parts.

The result should look like:

    .WriteTo.Masked(wt => wt.Console())

Internally, the wrapper sink would call ToString() on logEvent.Exception, if present, and in those cases construct a new LogEvent with a new exception type that carries the masked text and presents this on ToString().

There are some examples out there of both aspects; searching for TextException and LoggerSinkConfiguration.Wrap should get you close!

HTH,
Nick

Thanks 👍
It’ll take some experimenting, hopefully I can get around to doing that soon.

I'm a bit skeptical whether this will work properly for all sinks.
For the console sink, sure, as it's plain text, but the Application Insights sink constructs ExceptionTelemetry from the actual exception instance provided by the LogEvent.
If we start modifying the exception type, the telemetry could potentially be incorrect, inaccurate, have lost data, and in the worst case even rejected by Application Insights (which is very picky about what it accepts).

I could imagine other sinks might behave in a similar manner.

@siewers I've done some experimenting with masking exceptions and I think I've got something that should work for the majority of cases.
Have a look here and the related test cases here, here and here

It's by no means complete but it shows how it could be done.

@nblumhardt I've made some headway with changing from an enricher to a sink based on your suggestion of using Wrap(). Initialisation now looks like this:

var logger = new LoggerConfiguration()
    .WriteTo.Masked(options =>
        {
            options.Mode = MaskingMode.Globally;
            options.MaskingOperators = new List<IMaskingOperator> { new EmailAddressMaskingOperator() };
        },
        writeTo =>
        {
            // Add sinks here instead of directly on LoggerConfiguration.WriteTo
            writeTo.Console();
            writeTo.Debug();
            writeTo.Sink(inMemorySink);
        })
    .Enrich.FromLogContext()
    .CreateLogger();

and that seems to work like expected.
My initial experiment was using reflection to replace the root aggregate sink on the constructed logger but I found that a bit too dirty and it would be very brittle.

Something that I did notice is that when I do this:

var logger = new LoggerConfiguration()
    .WriteTo.Masked(options =>
        {
            options.Mode = MaskingMode.Globally;
            options.MaskingOperators = new List<IMaskingOperator> { new EmailAddressMaskingOperator() };
        },
        writeTo =>
        {
            // Add sinks here instead of directly on LoggerConfiguration.WriteTo
            writeTo.Sink(inMemorySink);
        })
    .WriteTo.Console();
    .WriteTo.Debug();
    .Enrich.FromLogContext()
    .CreateLogger();

then the log entries going to Console and Debug are also masked. I mean, that's great and what I hoped for but I found it a bit unexpected. Does sequence of registering sinks matter here?

It looks nice, but unfortunately, it makes the assumption that all exceptions are serializable.
This is not always the case, and not even Microsoft is consistent in implementing serialization on exceptions in .NET, probably because of the deserialization vulnerability in the binary formatter.
It would be quite unfortunate to lose exception details at "random", because the exception didn't have a proper deserialization constructor.

One option could be to detect whether the exception does have any sensitive information and in that case, determine what to do. It might be sufficient to simply omit the exception if it doesn't have a deserialization constructor?

Ultimately all exceptions are serializable through the base implementation on Exception so even if the implementor did not choose to add any additional properties of the exception to the serialization logic then at least the basic information such as stack trace etc are preserved.

any updates on the approach this this?

I've been using this library and happened to stumble across some exceptions from asp.net core identity which includes email address in the exception message :) Not ideal

@alhardy to be honest I was waiting on some more feedback and haven’t circled back to this because I kinda forgot about it 🤷‍♂️

I might merge this in and add a configuration option to enable it so behaviour can be tweaked

Also, I do need to look at exception types that aren’t serializable because they might not be handled correctly currently

A follow-up from my side:

Because we track exceptions to Application Insights and are under GDRP regulations, we ended up creating a DelegatingHandler that intercepts all traffic being sent to Application Insights. Because the data being sent is JSON, we then created a custom JsonTextWriter that masks all values in the JSON.
It's a bit of a hack because the Application Insights library doesn't allow this natively, so we replaced the internal handler by wrapping it in our own DelegatingHandler through reflection.

It works although we are at the mercy of correctly detecting potentially sensitive data (which in our case is "just" email addresses).

We also created a custom Slack sink that uses a similar approach when writing the JSON to the HttpClient.

This is of course not scalable in all scenarios, but it is possible to adapt the logic to create different message types.

I hope this helps.