Race condition when refreshing access tokens in a web app + API scenario
luisviman opened this issue ยท 24 comments
Hi,
We are having some issues when using lua-resty-openidc
for securing a web app and an API. Maybe there is something wrong with the concept we are using, so we will really appreciate your expertise.
We want the users to authenticate via browser. After that, we want the web app to use the retrieved cookie for sending requests to the API.
This concept works fine most of the times. However, we have identified a race condition: Once the access token needs to be refreshed, if the responses of two parallel requests get out of order, the browser will keep an invalid cookie. This will make later requests fail.
Example:
- The access token expires.
- The client app sends 2 parallel requests: request A and request B.
- Request A reaches OpenResty and starts the refresh process.
- Request B reaches OpenResty. The refresh process started by request A has not finished yet, so the stored access token is still the expired one. Because of this, a new refresh process is triggered.
- The process started by request A retrieves a new access token and sends a new cookie (cookie A) to the client.
- The process started by request B retrieves a different access token and sends a different cookie (cookie B) to the client. At this point, cookie A is not valid anymore.
- For some reason, cookie B arrives to the client first. The client's browser will overwrite the old cookie with cookie B.
- Then, cookie A arrives and the browser overwrites cookie B with cookie A.
- The browser will use cookie A in its next request, but since cookie A is not valid, the request will be rejected.
So, is our model incorrect? Has anyone experience this? Any ideas about how to solve it?
Thank you in advance!
Regards,
Luis.
Without completely understanding the 9. steps you listed, I think you bring up a valid point wrt. parallel requests and access token refresh, thanks for that. We should probably keep track of the "busy refreshing" state and temporarily return an error. @bodewig would you agree?
The "current refresh" may not complete at all, so at least we'd have to time out the error and try a new refresh. Also I'm not sure returning an error as response to step 4 would be an improvement from @luisviman's point of view.
One thing I would have expected is that the OIDC Provider rejects the refresh token used for the second refresh process - whichever token request comes second. I know some OPs allow refresh tokens to be used multiple times.
Why is the Cookie set by request A better or worse than the one set by request B? Both contain valid access tokens and valid refresh tokens, don't they?
@bodewig , we are using Redis for session storage, so the cookies doesn't contain any token. I think the cookies just have an identifier for retrieving sessions. The problem is that request B destroys the session previously created by request A, so the session referenced by cookie A doesn't exist anymore.
I think the way of solving this would be to find a way so the refresh process is not triggered more than once. Either having a "busy refreshing" state or configuring the provider to revoke refresh tokens after its first use might solve the issue. I'm looking into the latter. I'll keep you posted.
Anyway, if you want lua-resty-openidc
to support providers that allow refresh tokens to be used multiple times, maybe the "busy refreshing" state is a good idea.
Thank you for your help.
Regards,
Luis.
My bad, of course things are different with server side sessions.
I'm not really sure where to store the "busy refreshing" state. Storing it inside the session won't work with the cookie store and the cache might evict it (and not be distributed in the first place).
Even if we manage to keep this flag, what should happen to your request B - delay the response until A has obtained a new token? I'm not really familiar with the multitasking primitives provided by openresty or nginx but right now I wouldn't know how to do that. And this is even without the possible case of multiple servers and the requests A and B hitting different machines.
@luisviman I don't see why at 6. cookie A is not valid anymore; it is still persisted in Redis, isn't it?
I'm not an expert on lua-resty-session
so, please, correct me if I am wrong.
I think lua-resty-session
stores sessions (containing tokens and such) not cookies. A session can be retrieved using the provided cookie. In the scenario described above, it looks like, once cookie B is created, the session can't be retrieved using cookie A anymore. That's why I'm saying cookie A is not valid at step 6.
Maybe, if the session could be retrieved using cookie A, the issue would be fixed, but I'm not sure if this solution would generate other security concerns.
Thank you.
Regards,
Luis.
I agree with your process description but from what I can see in the lua-resty-session
Redis backend code, it doesn't delete the first key/value pair from Redis storage so it should still be available when referred to by cookie A
As far as I know, lua-resty-session
stores the session using the same key during all the session lifespan. When refreshing the access token the session is not deleted, but it is updated. Because of this, the hash included in cookie A won't match anymore (session.lua - line 288), so you won't be able to retrieve the session using cookie A even if the session is still persisted in Redis.
Thank you.
Regards,
Luis.
if the hash included in a cookie changes, the cookie changes...
Yes. So Request A updates the session and generates cookie A. Then request B updates the session again and generates cookie B. At this point you can't retrieve the session using cookie A because the hash included in cookie A won't match. Does this make sense?
Thank you for your help.
Regards,
Luis.
The hash included in cookie A still refers to a valid entry in Redis, no?
Excuse me, I think I am not clearly explaining the issue.
This is the content of the cookies returned by lua-resty-session
:
fAG-n8bsaYsLVyMJbGTtgg..|1538391895|KL5lwRcITTYT9DZt6W9CvIQenFI.
fAG-n8bsaYsLVyMJbGTtgg..
- Session ID.1538391895
- Expiration timestamp.KL5lwRcITTYT9DZt6W9CvIQenFI.
- Session content hash.
The Session ID included in cookie A refers to a valid entry in Redis, but since the Session content hash doesn't match (the session content was updated by request B) lua-resty-session
won't return the session. You can check this in session.lua line 288. Because of the Session content hash check, using cookie B is the only way to retrieve the session.
@zandbelt , does this make more sense? Sorry for not properly stating the issue before. I appreciate your help.
Regards,
Luis.
@luisviman no problem, we'll get there ;-) let me try and wrap my head around this...
I had the same problem when using redis storage, I ended up modifying lua-resty-session to allow the session data to be modified in redis without invalidating the existing cookie. It only validates the session retrieved contains the expected user ID.
As the hmac is now only looking at the user ID it doesn't change on a token refresh so the only thing that will change in the cookie is the expiration timestamp.
It doesn't include the expiration timestamp in the hash calculation either, instead I have openidc.lua set the session lifetime to the token lifetime so the session will disappear from redis when it expires.
These changes make it insecure to use with the cookie storage driver as it would allow tampering. You can see the changes at bungle/lua-resty-session@master...inomial:master
One downside is if multiple requests come in at the same time when a token refresh is due it can end up refreshing the token more than once, this doesn't cause a problem but wastes resources.
A simpler way that might be enough to solve your problem would be to set a new session.id in openidc.lua when the token is refreshed. This should leave the old session in redis so old cookies will still be accepted. Although if openidc.lua does see an old cookie it would try to refresh the token again so you'd end up with some extra sessions in redis when the race condition occurs.
@bodewig and I discussed this and it looks like we can push this problem further down the road by creating a new session on each token refresh. That will make both sessions valid until the firstly refreshed access token expires and can no longer be refreshed.
We cannot find a way to address this race condition once and for all across all backends: the backend would need to support distributed locking for that. In principle Redis can do this but we'd have to code specifically for Redis which is out-of-scope here but something someone may pickup in a branch.
Therefore I propose to apply the fix as above and file this as a "known issue".
@bodewig and I discussed this and it looks like we can push this problem further down the road by creating a new session on each token refresh. That will make both sessions valid until the firstly refreshed access token expires and can no longer be refreshed.
@zandbelt, @bodewig, I think your solution will fix the issue. Are there plans for implementing it?
I would like to point something out: since it will create a new session on each token refresh, if you reuse the same refresh token multiple times (and the IDP supports it) there will be multiple valid sessions at the same time. I don't see a problem there anyway. Just pointing it out. :)
@luisviman we see indeed no problems with multiple parallel sessions, but there's two remarks to the refresh token behavior of OPs:
- an OP may swap refresh tokens and revoke the older one, that's good security hygiene and I believe lots of OPs do this by default
- an OP may revoke the old access token upon issuing a new one
IOW your mileage may vary across OP implementations
If I'm reading lua-resty-session's API documentation correctly than calling regenerate
rather than save
should do the trick - see #209
@luisviman @ross211 could anybody of you please give the corresponding branch a try and see whether it work as intended? Right now I haven't got a setup with server side session storage to test it properly.
@bodewig, I'm testing your solution and so far so good. Please, give me a little more time to do some extra tests.
Thank you!
sure, take your time. I know it's hard to assert there is no race condition anymore. Thank you.
@bodewig, I have found no issues.
Do you plan releasing a new version of lua-resty-openidc
including this fix anytime soon?
Thank you guys! You are awesome. :)
Many thanks for testing @luisviman - there should be a release soonish IIRC.
yes, everything that we need for 1.7.0 is in place now; give us a few days (and don't be afraid to poll if nothing has happened by then ;-))
Fantastic to see thisโit looks to be at the core of an issue I'm experiencing as well. Thanks to everyone on this thread!