oauth-wg/oauth-transaction-tokens

Deployment models for the Transaction Token Service

Opened this issue · 4 comments

This issue tracks the different deployment models for the Transaction Token Service (TTS):

  • embedded with the AS
  • embedded with an API GW
  • standalone service
  • ???

Embedded with the Authorization Server

In this model, the token exchange endpoint is one of the standard endpoints exposed by the Authorization Server.
Pros:

  • the iss claim is the same as the authorization server and likely already known by downstream workloads
  • the AS metadata can already be discovered from its iss claim and exposes the JWKS URI
  • centralizes authorization policy management into a single service (policies for issuing tokens [both access and transaction] are in one place)

Cons:

  • potentially exposes the ability to obtain a transaction token outside the bounds of the trust domain (e.g. exposed externally to the internet)
  • difficult to distribute the transaction token issuing capability, as the AS is unlikely to be geographically distributed
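
To illustrate the discovery point in the Pros above, here is a minimal sketch of how a downstream workload could go from the iss claim to the JWKS URI when the TTS is embedded in the AS. It assumes the AS publishes RFC 8414 authorization server metadata; the function names and example hostnames are hypothetical, and fetching the metadata document over HTTPS is left out.

```python
from urllib.parse import urlsplit, urlunsplit

# Well-known path segment defined by RFC 8414.
WELL_KNOWN = "/.well-known/oauth-authorization-server"


def metadata_url(issuer: str) -> str:
    """Build the RFC 8414 metadata URL for an issuer identifier.

    The well-known segment is inserted between the host and any path
    component of the issuer, after dropping a terminating "/".
    """
    parts = urlsplit(issuer)
    if parts.scheme != "https" or parts.query or parts.fragment:
        raise ValueError("issuer must be an https URL without query or fragment")
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN + path, "", ""))


def jwks_uri_from_metadata(issuer: str, metadata: dict) -> str:
    """Return the jwks_uri after checking the metadata belongs to this issuer."""
    # The issuer value in the metadata must match the issuer we started from.
    if metadata.get("issuer") != issuer:
        raise ValueError("metadata issuer does not match the token's iss claim")
    return metadata["jwks_uri"]
```

A workload that already validates access tokens from the same AS would typically have this metadata cached, which is what makes the embedded model cheap for downstream services.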

Notes from call on 06-14-2024

  • In a small deployment, the TTS could be a part of the AS, but for multi-region deployments, it could be distributed
  • Does thinking through this bring any new requirements to the spec?
  • (Justin) We should think through this to make sure we are not over-optimizing
  • (Justin) A TTS is going to be a part of an AS for a lot of systems, but in more workload-focused systems, it could be a part of the "workload bundle" or something. It has less to do with the AS in that case
  • (George) In that context, if my workload needs to initiate 3 different types of transactions by calling 3 different workloads, my service would still go to the TTS to get a TraT for each use case. In the Lambda case, one could provision the Lambda with a pre-injected TraT. A Kubernetes WL would get a workload token at startup, but would need a TraT per request.
  • (Justin) Agree that these are two different things. There isn't going to be just one way for this to show up at runtime. This is different from access tokens
  • (Pieter) We could have at least one TTS per cluster. Having it called out that it can be standalone, or a part of the AS will be interesting. We should call out the security considerations of standalone, so that we don't redo all the work that AS security considerations already cover
  • (Pieter) In WIMSE, a deliverable is local token issuance. There's a model where the TraT is embedded in the workload (the workload is self-issuing)
  • (George) One of the reasons to create this issue is to collect these ideas. We could collect these different ideas, and discuss in Vancouver. People who have deployed something similar already would be useful to get feedback from.
  • (Atul) Can we individually contribute slides that can be discussed in Vancouver?
  • (Justin) Are we collecting these in a Wiki page, or in a temporary section of the document that we know we are going to discard?
  • (George) We should just add it to this issue.
  • (Pieter) We should raise this both in the WIMSE and the OAuth WG in Vancouver

Here is how we deployed the TTS in our environment:

Although the TTS is logically part of the AS system (and provided by the same team), it is a separate microservice from the authorization and token endpoints of the AS used by OAuth clients. The authorization and token endpoints are reachable by clients outside the company network, while the TTS endpoint is reachable only from within the company network. The key used to sign the TraT is different from the key material used to encrypt/sign the ATs issued by the externally reachable token endpoint. Hence the "iss" and JWKS URIs are different. We have multiple data centers, and the TTS is available in each DC. The "iss" used by the TTS is the same in all DCs: a DC-agnostic URI. All TTS deployments share the same key material.
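
A rough sketch of what this separation could mean for a workload validating a TraT: only the internal TTS issuer is accepted, so a token carrying the external AS issuer is rejected before any key lookup. The hostnames, the IssuerConfig type, and the function name are hypothetical and not taken from our deployment; signature verification itself is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IssuerConfig:
    issuer: str                  # value expected in the "iss" claim
    jwks_uri: str                # where the verification keys are published
    reachable_externally: bool   # whether the endpoint is exposed outside the network


# Hypothetical values: the external AS and the internal TTS use different
# issuers and different key sets.
AS_EXTERNAL = IssuerConfig(
    "https://as.example.com", "https://as.example.com/jwks", True
)
TTS_INTERNAL = IssuerConfig(
    "https://tts.internal.example", "https://tts.internal.example/jwks", False
)

# Workloads accept TraTs only from the DC-agnostic internal TTS issuer.
TRUSTED_TRAT_ISSUERS = {TTS_INTERNAL.issuer: TTS_INTERNAL}


def jwks_uri_for_trat(claims: dict) -> str:
    """Pick the JWKS URI for a TraT, rejecting any issuer other than the TTS."""
    config = TRUSTED_TRAT_ISSUERS.get(claims.get("iss"))
    if config is None:
        raise ValueError("TraT was not issued by a trusted TTS issuer")
    return config.jwks_uri
```

Because the issuer is the same in every DC and the key material is shared, a single entry in the trust table is enough regardless of which DC issued the token.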

Our implementation overlaps with what @obfuscoder described. One interesting thing to think about is debugging and tracing a token back to the AS that issued it. We have a token identifier that identifies the region + DC in which the token was issued, which makes it easier for the validating party to know where the request came from. Sometimes a request crosses a region boundary, so we make sure the public keys are shared across all regions: each region + DC uses its own private/public key pair, but we make the public key globally available.
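
As an illustration only, the region + DC identifier and the globally available public keys could be sketched as below. The "region.dc.random" identifier layout, the registry contents, and all names are assumptions made for this example, not our actual format; many deployments would carry the same hint in the JWT kid header instead.

```python
import uuid
from typing import NamedTuple


class Origin(NamedTuple):
    region: str
    dc: str


def new_token_id(region: str, dc: str) -> str:
    """Create an identifier that records where the token was issued.

    Hypothetical layout "<region>.<dc>.<random hex>"; region and DC names
    are assumed not to contain ".".
    """
    return f"{region}.{dc}.{uuid.uuid4().hex}"


def origin_of(token_id: str) -> Origin:
    """Recover the issuing region and DC from a token identifier."""
    parts = token_id.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError("unexpected token identifier layout")
    return Origin(parts[0], parts[1])


# Globally replicated registry of public keys, one per region + DC.
# The values are placeholders standing in for real key material.
PUBLIC_KEYS = {
    Origin("us-east", "dc1"): "<public key for us-east/dc1>",
    Origin("eu-west", "dc2"): "<public key for eu-west/dc2>",
}


def public_key_for(token_id: str) -> str:
    """Select the verification key for the region + DC named in the identifier."""
    origin = origin_of(token_id)
    if origin not in PUBLIC_KEYS:
        raise ValueError(f"no public key published for {origin.region}/{origin.dc}")
    return PUBLIC_KEYS[origin]
```

The identifier is only a hint until the signature has been verified with the selected key; after that it can be logged to trace a request that crossed a region boundary back to the DC that issued its token.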