Extract client identifier from authorization token

Question

whiskeysierra opened this issue 6 years ago · 24 comments

whiskeysierra commented 6 years ago

Detailed Description

Logbook should enrich requests with the subject from the JWT token in the Authorization header, if present.

Context

Having the id as part of the requests would make it much easier to identify clients, which in turn helps with:

- identifying unauthorized access issues
- usage analysis

Possible Implementation

- introduce the concept of an attribute
- attributes are simple key-value pairs (tbd type of value)
- a request/response can have multiple attributes
- attributes should be derived/created from requests/responses before any filtering (e.g. obfuscation)
- built-in attribute extractor for sub from JWT token
  - detect JWT tokens: Bearer prefix + 3x base64 data separated by dots
  - remove Bearer prefix
  - split at .
  - base64 decode payload, i.e. the second element
  - parse JSON
  - read properties in order and return the first one that is present
    - https://identity.zalando.com/managed-id (Zalando employee tokens)
    - sub
  - don't hard code priorities, but rather allow to configure a list of names, defaults to ["sub"]
- extend JsonHttpLogFormatter to include attributes (tbd, top level? nested? name clashes?)
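
The extraction steps listed above can be sketched in plain Java as follows. This is a minimal sketch under assumptions, not Logbook's implementation: the class name JwtClaimExtractor and its methods are hypothetical, the token is deliberately not validated, and a regular expression stands in for a real JSON parser (such as Jackson) to keep the example free of dependencies, so it only handles simple string claims without escape sequences.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Logbook's API.
final class JwtClaimExtractor {

    // "Bearer " prefix followed by three base64url segments separated by dots.
    private static final Pattern BEARER_JWT = Pattern.compile(
            "Bearer ([A-Za-z0-9_-]+)\\.([A-Za-z0-9_-]+)\\.([A-Za-z0-9_-]+)");

    // Claim names in priority order, e.g. ["sub"].
    private final List<String> claimNames;

    JwtClaimExtractor(List<String> claimNames) {
        this.claimNames = claimNames;
    }

    // Returns the first configured claim found in the payload.
    // The signature is NOT verified, so the result is an unverified hint only.
    Optional<String> extract(String authorizationHeader) {
        if (authorizationHeader == null) {
            return Optional.empty();
        }
        Matcher matcher = BEARER_JWT.matcher(authorizationHeader.trim());
        if (!matcher.matches()) {
            return Optional.empty();
        }
        final String payload;
        try {
            // The payload is the second of the three segments.
            payload = new String(
                    Base64.getUrlDecoder().decode(matcher.group(2)),
                    StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            return Optional.empty(); // not valid base64url
        }
        for (String name : claimNames) {
            Optional<String> value = readStringClaim(payload, name);
            if (value.isPresent()) {
                return value;
            }
        }
        return Optional.empty();
    }

    // Simplification (assumption): looks up a string-valued claim with a regex.
    // It ignores JSON escapes and nesting; a real JSON parser should replace it.
    private static Optional<String> readStringClaim(String json, String name) {
        Pattern claim = Pattern.compile(
                "\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"\\\\]*)\"");
        Matcher matcher = claim.matcher(json);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

Configured with the names https://identity.zalando.com/managed-id and sub, in that order, such an extractor would return wschoenborn for the employee token payload and stups_sales-order-service for the service token payload shown in this issue.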

Employee Token

{
  "sub": "3b66d47c-d886-4c63-a0b9-9ec3cad7e848",
  "https://identity.zalando.com/realm": "users",
  "https://identity.zalando.com/token": "Bearer",
  "https://identity.zalando.com/managed-id": "wschoenborn",
  "azp": "ztoken",
  "https://identity.zalando.com/bp": "810d1d00-4312-43e5-bd31-d8373fdd24c7",
  "auth_time": 1540188140,
  "iss": "https://identity.zalando.com",
  "exp": 1541411248,
  "iat": 1541407638
}

Service Token

{
  "sub": "stups_sales-order-service",
  "https://identity.zalando.com/realm": "services",
  "https://identity.zalando.com/token": "Bearer",
  "azp": "stups_sales-order-service_389e4e16-0695-45df-9afd-d9be0ffab456",
  "https://identity.zalando.com/bp": "810d1d00-4312-43e5-bd31-d8373fdd24c7",
  "iss": "https://identity.zalando.com",
  "exp": 1541411315,
  "iat": 1541407705,
  "https://identity.zalando.com/privileges": [
    "com.zalando::loyalty_point_account.read_all"
  ]
}

Your Environment

Version used: 1.11.1

Answer 1 · 2018-11-05T10:54:34.000Z

This should probably be configurable and disabled by default, because the subject often contains the user's email address, which you do not want to log for data protection reasons.

Answer 2 · 2018-11-05T10:57:17.000Z

Good point!

Answer 3 · 2018-11-06T11:41:05.000Z

See also #373

Answer 4 · 2019-03-04T10:12:56.000Z

An alternative could be to integrate with Spring Security and log SecurityContextHolder.getContext().getAuthentication().getName() to keep it auth-method agnostic.

Answer 5 · 2019-03-04T10:21:55.000Z

@AlexanderYastrebov That requires a Spring dependency.

Answer 6 · 2019-03-06T16:30:03.000Z

We're extracting data from the JWT to the MDC already, but after validation of the token; would not want to log any of the content before that. So some kind of hook for validation would be desirable.

Answer 7 · 2019-03-06T16:30:46.000Z

Which JWT parsers are you considering? See https://github.com/skjolber/java-jwt-benchmark for a few.

Answer 8 · 2019-03-06T16:34:53.000Z

For Spring, my experience is that only limited authorization information is available at request/response logging time, since that depends on the implementation, which is called later in the chain. For example, within the same app some REST methods do not require authorization while others do, and this is enforced by @PreAuthorize and/or even Open Policy Agent rules within the RestControllers.

It is perhaps better to add @ControllerAdvice for AccessDeniedException and friends with extra logging (including dumping contents of JWT) if there is a violation.

Answer 9 · 2019-03-09T20:22:59.000Z

but after validation of the token; would not want to log any of the content before that

Why not?

Which JWT parsers are you considering?

Tbh, my idea was to just do it by hand using the Java standard library (split + base64) and Jackson (json).

Answer 10 · 2019-03-11T12:21:50.000Z

Well, logging details about the JWT is in the same boat as logging request bodies before the authentication/authorization check. But I guess as long as there are no misunderstandings, it is fair to log those details. We're logging JWT details in the MDC; my first impression was that there was potential for mixing up verified and unverified credentials, but technically those should live in separate parts of the log statements.

Answer 11 · 2019-03-11T12:24:24.000Z

I guess 'by hand' parsing is fair if there is no validation involved.

Answer 12 · 2019-03-11T12:51:17.000Z

I guess 'by hand' parsing is fair if there is no validation involved.

I believe it's even beneficial to log client identifiers, especially for unauthorized requests, since that may give you an indicator of who to talk to (assuming no shady intention).

Answer 13 · 2019-11-22T16:25:24.000Z

So what does the desired output from logging look like? I'm not so sure about these attributes, would it not be more simple to just transform the header presentation, i.e. in the HttpLogFormatter?

Answer 14 · 2019-11-22T16:32:42.000Z

I'm not so sure about these attributes, would it not be more simple to just transform the header presentation, i.e. in the HttpLogFormatter?

We obfuscate the Authorization header, so in the formatter there wouldn't be any way to do that.

So what does the desired output from logging look like?

{
  "origin": "remote",
  "type": "request",
  "correlation": "2d66e4bc-9a0d-11e5-a84c-1f39510f0d6b",
  "protocol": "HTTP/1.1",
  "sender": "127.0.0.1",
  "method": "GET",
  "path": "http://example.org/test",
  "headers": {
    "Accept": ["application/json"],
    "Content-Type": ["text/plain"]
  },
  "attributes": {
    "subject": "willi.schoenborn@zalando.de"
  },
  "body": "Hello world!"
}

Answer 15 · 2019-11-22T17:01:26.000Z

Looking at some of the headers we're using, a lot of them actually contain structured data, like

X-Shopify-Shop-Api-Call-Limit: 1/80
Strict-Transport-Security: max-age=7889238
Set-Cookie: BIGipServerpool_posten_api.x.com_7460=1622359.9345.0300; path=/; Httponly; Secure

i.e.

{
  "origin": "remote",
  "type": "request",
  "correlation": "2d66e4bc-9a0d-11e5-a84c-1f39510f0d6b",
  "protocol": "HTTP/1.1",
  "sender": "127.0.0.1",
  "method": "GET",
  "path": "http://example.org/test",
  "headers": {
    "Accept": ["application/json"],
    "Content-Type": ["text/plain"],
    "X-Shopify-Shop-Api-Call-Limit": {
       "value": 1,
       "limit": 80
    }
  },
  "attributes": {
    "subject": "willi.schoenborn@zalando.de"
  },
  "body": "Hello world!"
}

Would it be possible to 'capture' some of that, possibly also transforming authorization to

{
  "origin": "remote",
  "type": "request",
  "correlation": "2d66e4bc-9a0d-11e5-a84c-1f39510f0d6b",
  "protocol": "HTTP/1.1",
  "sender": "127.0.0.1",
  "method": "GET",
  "path": "http://example.org/test",
  "headers": {
    "Accept": ["application/json"],
    "Content-Type": ["text/plain"],
    "Authorization": {
       "sub": "stups_sales-order-service", 
       "iss": "https://identity.zalando.com"
    }
  },
  "body": "Hello world!"
}

if so desired?
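
Capturing a structured header value could look roughly like this. It is a minimal sketch, assuming the value/limit format of the X-Shopify-Shop-Api-Call-Limit example above; the class CallLimitHeader is hypothetical and not part of Logbook.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical transformer for illustration; not part of Logbook's API.
final class CallLimitHeader {

    // Turns a raw value such as "1/80" into {value=1, limit=80}.
    static Map<String, Integer> parse(String raw) {
        String[] parts = raw.trim().split("/", -1);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected <value>/<limit> but got: " + raw);
        }
        // LinkedHashMap keeps the keys in insertion order for stable output.
        Map<String, Integer> structured = new LinkedHashMap<>();
        structured.put("value", Integer.parseInt(parts[0].trim()));
        structured.put("limit", Integer.parseInt(parts[1].trim()));
        return structured;
    }
}
```

A formatter could then emit the returned map in place of the raw header string, falling back to the raw value if parsing fails.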

Answer 16 · 2019-12-03T09:34:03.000Z

I guess transforming the Authorization header, rather than filtering it, would not reveal confidential information. Also, was it ever considered to just remove (filter) the token signature instead of the whole value? At least that would prevent someone taking a token from the logs.
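
Removing only the signature could be sketched as follows. This is a minimal sketch under assumptions: the class and method names are hypothetical, the XXX placeholder is chosen for illustration, and any value that does not look like a Bearer JWT is still obfuscated completely. The header and payload segments, including the subject, stay readable in the log.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical filter for illustration; not part of Logbook's API.
final class TokenSignatureFilter {

    // Group 1 captures "Bearer <header>.<payload>"; the signature segment is left out of the group.
    private static final Pattern BEARER_JWT = Pattern.compile(
            "(Bearer [A-Za-z0-9_-]+\\.[A-Za-z0-9_-]+)\\.[A-Za-z0-9_-]+");

    // Replaces the signature so the logged token can no longer be replayed.
    static String stripSignature(String authorization) {
        Matcher matcher = BEARER_JWT.matcher(authorization);
        if (matcher.matches()) {
            return matcher.group(1) + ".XXX";
        }
        // Anything else (Basic auth, opaque tokens) is hidden entirely.
        return "XXX";
    }
}
```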

Answer 17 · 2019-12-03T09:39:09.000Z

The subject is already confidential, see #381 (comment)

Answer 18 · 2019-12-03T10:21:48.000Z

But that is (application-specific-) misuse of the Subject. Logging 'who did what' becomes impossible without the subject?

Answer 19 · 2019-12-03T13:01:04.000Z

Also, was it ever considered to just remove (filter) the token signature instead of the whole value?

That sounded like a proposal to change the current behavior of obfuscating the Authorization header completely. It would, again by default, expose subjects which is not ideal.

Logging 'who did what' becomes impossible without the subject?

That's totally desired, but it should be opt-in, so users need to make a conscious decision whether to use it or not. I want to be secure by default.

Might be that I misinterpreted your second to last comment.

Answer 20 · 2019-12-03T13:04:33.000Z

I agree opt-in and not changing current default behaviour is desired.

Answer 21 · 2019-12-03T13:12:47.000Z

So for an opt-in solution, what do you think about a structured header output / transformation approach (like my comment with JSON example above)?

Answer 22 · 2020-02-17T21:20:14.000Z

So for an opt-in solution, what do you think about a structured header output / transformation approach (like my comment with JSON example above)?

I believe that's an orthogonal concern and would deserve its own issue/discussion.

Answer 23 · 2022-01-04T10:32:59.000Z

Hi everybody,

I am looking for this feature. Is there a way to log the subject only?

Answer 24 · 2023-09-20T15:55:24.000Z

Addressed by #1589.