Dan Mercer

Software engineer, security and Web enthusiast, follower of Jesus

Advanced OAuth 2.0: false positives in refresh token theft detection

OAuth 2.0 is a framework for authorization on the web, where a user can give one service (known as the client) access to data stored in another service (known as a resource server). The framework explains how to do a three-way “handshake” of sorts, where the user grants access via an authorization server and the client obtains an access token. The client then can use that access token to make authorized requests to the resource server. (A deeper explainer of OAuth 2.0 is beyond the scope of this article, but here’s a really good explanation by Aaron Parecki.)

This article explores a specific edge case that can happen when authorization servers use rotating refresh tokens to detect refresh token theft. That’s a mouthful, so let’s explore those ideas one by one.

What are refresh tokens?

Because access tokens can be used to access protected resources, they are usually short-lived. (Think lifetimes measured in hours.) Refresh tokens are a way for an app to get a new access token without re-prompting the user to grant access. The client sends the refresh token to the authorization server in exchange for a new access token. This is often referred to as “offline access” because it allows the app to continue acting on the user’s behalf even when the user is not present.

Why rotate refresh tokens?

If refresh tokens never expire, then a malicious actor with a stolen refresh token can easily get persistent access to the token’s resources. But if refresh tokens do expire, then apps that should have persistent access to certain resources will need a way to do that. Enter: token rotation. Each time the app uses a refresh token, the authorization server issues a new access token and a new refresh token (with a new expiration time). The authorization server then invalidates the refresh token that was just used, since it’s not needed anymore.
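To make the rotation step concrete, here is a minimal sketch of what an authorization server might do on each refresh. Everything in it is an assumption for illustration: the `RotatingTokenStore` class, the in-memory dictionary, and the two lifetimes are hypothetical, and a real server would persist tokens and authenticate the client first.

```python
import secrets
import time
from dataclasses import dataclass

ACCESS_TTL_SECONDS = 3600              # assumed access token lifetime (1 hour)
REFRESH_TTL_SECONDS = 30 * 24 * 3600   # assumed refresh token lifetime (30 days)


@dataclass
class TokenPair:
    access_token: str
    refresh_token: str
    access_expires_at: float
    refresh_expires_at: float


class RotatingTokenStore:
    """Hypothetical in-memory store; a real server would use a database."""

    def __init__(self):
        self._valid_refresh_tokens = {}  # refresh token -> expiry timestamp

    def issue(self, now=None):
        now = time.time() if now is None else now
        pair = TokenPair(
            access_token=secrets.token_urlsafe(32),
            refresh_token=secrets.token_urlsafe(32),
            access_expires_at=now + ACCESS_TTL_SECONDS,
            refresh_expires_at=now + REFRESH_TTL_SECONDS,
        )
        self._valid_refresh_tokens[pair.refresh_token] = pair.refresh_expires_at
        return pair

    def refresh(self, refresh_token, now=None):
        now = time.time() if now is None else now
        # pop() removes the presented token, so it can never be exchanged again.
        expires_at = self._valid_refresh_tokens.pop(refresh_token, None)
        if expires_at is None or expires_at <= now:
            raise PermissionError("invalid or expired refresh token")
        # The replacement pair gets a fresh expiration time.
        return self.issue(now)
```

The key step is the `pop`: a refresh token is removed the moment it is exchanged, and the client walks away with a replacement whose expiration clock has restarted.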

This has a few benefits:

Detecting stolen refresh tokens 🕵

As described above, when refresh token rotation is used, each refresh token should only be used once. Because of that, if the authorization server receives more than one “refresh” request with the same token, it can assume that one of those requests came from a malicious actor with a stolen token. The server can’t tell which request was the illegitimate one, so it invalidates both requesters’ tokens - in other words, it invalidates the whole “grant”, requiring the user to reauthorize the client. This stops the attack by invalidating any tokens the malicious actor had stolen from that grant. (See Section 4.14.2 of the current “OAuth 2.0 Security Best Current Practice” document for more about this technique.)
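The detection logic described above could be sketched roughly as follows. The `TheftDetectingServer` class, its method names, and the in-memory dictionaries are all hypothetical; the sketch leaves out access tokens, expiry, and client authentication so that only the reuse check remains.

```python
import secrets


class GrantRevoked(Exception):
    """Raised when a token is unknown or its grant has been revoked."""


class TheftDetectingServer:
    """Hypothetical authorization server that tracks one grant per authorization."""

    def __init__(self):
        self._grant_of = {}  # every refresh token ever issued -> grant id
        self._current = {}   # grant id -> the single refresh token still valid

    def authorize(self):
        # Stands in for the user granting access; returns the first refresh token.
        return self._issue(secrets.token_hex(8))

    def _issue(self, grant_id):
        token = secrets.token_urlsafe(32)
        self._grant_of[token] = grant_id
        self._current[grant_id] = token
        return token

    def refresh(self, token):
        grant_id = self._grant_of.get(token)
        if grant_id is None or grant_id not in self._current:
            raise GrantRevoked("unknown token or revoked grant")
        if self._current[grant_id] != token:
            # A token that was already rotated is being presented again.
            # The server cannot tell who is legitimate, so the grant goes.
            del self._current[grant_id]
            raise GrantRevoked("refresh token reuse detected")
        return self._issue(grant_id)
```

Note that after a reuse is detected, even the newest refresh token stops working, because the check is made against the grant rather than the individual token.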

Side note: how clients do refreshing

In practice, there are at least two ways for a client to use refresh tokens.

Many clients choose the lazy approach because (1) it’s easier to implement and (2) it can be less resource-heavy, because if the access token isn’t used for a while, it won’t be refreshed until it’s needed again.
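As a rough illustration, a lazy client might look like the sketch below. It assumes that “lazy” means the client only refreshes when it needs an access token and finds the current one expired. The `LazyClient` class and the `refresh_fn` callable (which stands in for the HTTP call to the token endpoint) are hypothetical names.

```python
import time


class LazyClient:
    """Hypothetical client that refreshes only when a token is actually needed."""

    def __init__(self, refresh_fn, access_token, refresh_token, expires_at,
                 clock=time.time):
        # refresh_fn is assumed to take a refresh token and return
        # (new_access_token, new_refresh_token, new_expires_at).
        self._refresh_fn = refresh_fn
        self._access_token = access_token
        self._refresh_token = refresh_token
        self._expires_at = expires_at
        self._clock = clock

    def get_access_token(self):
        # No background timer: the expiry check happens on demand.
        if self._clock() >= self._expires_at:
            (self._access_token,
             self._refresh_token,
             self._expires_at) = self._refresh_fn(self._refresh_token)
        return self._access_token
```

Nothing here stops two callers from seeing the expired token at the same moment and each calling `refresh_fn` with the same refresh token.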

Where things go wrong: false positives 💥

Maybe you can already see where I’m going with this. The theft detection strategy described above causes a false positive if a legitimate client uses the same refresh token more than once. This can easily happen when (1) a client uses the lazy refreshing described above and (2) the token is sometimes needed for multiple things at the same time, such as when it’s used by multiple end users at once, or when an end user does two actions concurrently.

For example, imagine a web app that uses OAuth2 to load some data. If the user has multiple tabs open at the same time, each tab tries to request the data, each request invokes a refresh (because the current access token is expired), and whichever refresh happens second then triggers the theft detection, revoking the app’s access.

Here’s a step by step walkthrough of how the problem happens:

Oddly, the current OAuth2 Security BCP doc doesn’t mention this risk, but the older OAuth2 threat model RFC mentions it offhandedly: “This [theft detection] measure may cause problems in clustered environments, since usage of the currently valid refresh token must be ensured.”

In addition, refresh token rotation can cause problems even without the theft detection technique. If a refresh token is used, but the response never makes it to the client (e.g. the network fails to deliver the response), then the client is left with an invalid refresh token and no recourse except asking the user to re-authorize.

Mitigation for clients

To mitigate this problem from the client’s side, you’ll need some kind of locking or mutex around the refresh token. Whenever you refresh a token, you set a lock that tells other threads/flows/etc to wait until it’s refreshed. Like database locking in general, this can be finicky. You have lots of edge cases to consider. What happens if the process crashes? Or the refresh request fails? Or times out? Or…? If you’re building a generic OAuth2 client system, you probably ought to handle this. But if you’re building an OAuth2 provider (i.e. the authorization server), you should think twice before asking clients to do this. There’s a better way!
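For the simplest case, threads inside a single process, the locking idea might be sketched like this. The `LockedLazyClient` class and the `refresh_fn` callable are hypothetical, and the sketch deliberately ignores the harder cases mentioned above: a client spread over several processes or machines would need a shared lock (for example in a database) plus a timeout for crashed lock holders.

```python
import threading
import time


class LockedLazyClient:
    """Hypothetical lazy client that serializes refreshes with a mutex."""

    def __init__(self, refresh_fn, access_token, refresh_token, expires_at,
                 clock=time.time):
        # refresh_fn is assumed to take a refresh token and return
        # (new_access_token, new_refresh_token, new_expires_at).
        self._refresh_fn = refresh_fn
        self._access_token = access_token
        self._refresh_token = refresh_token
        self._expires_at = expires_at
        self._clock = clock
        self._lock = threading.Lock()

    def get_access_token(self):
        with self._lock:
            # The expiry check runs while holding the lock, so a thread that
            # waited here sees the tokens stored by whoever refreshed first.
            if self._clock() >= self._expires_at:
                (self._access_token,
                 self._refresh_token,
                 self._expires_at) = self._refresh_fn(self._refresh_token)
            # If refresh_fn raised, the lock is released and state is unchanged.
            return self._access_token
```

With this in place, eight threads asking for a token at once result in one refresh request instead of eight, at the cost of every caller briefly queuing behind the lock.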

Building better authorization servers 💡

One way to solve the false-positive problem on the authorization server’s side is to add a small grace period to refresh token rotation. After a refresh token is used, for a short window of time, allow it to be used again (and return the same new tokens as the first time it was used). In other words, make the request “idempotent” during that window of time. The window should be no more than 60 seconds or so. Because the “refresh-and-store-new-tokens” process hopefully only takes a few seconds, a minute is plenty of time to allow concurrent requests to settle.
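A minimal sketch of such a grace period, under stated assumptions, might look like the following. The `GracePeriodServer` class, its in-memory dictionaries, and the response shape are hypothetical; token expiry and client authentication are left out, and the 60-second constant simply mirrors the upper bound suggested above.

```python
import secrets
import time

GRACE_PERIOD_SECONDS = 60  # assumed window, per the suggestion above


class GrantRevoked(Exception):
    """Raised when a token is unknown or its grant has been revoked."""


class GracePeriodServer:
    """Hypothetical server: rotation plus theft detection plus a replay window."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._grant_of = {}  # every refresh token ever issued -> grant id
        self._current = {}   # grant id -> the refresh token currently valid
        self._rotated = {}   # used refresh token -> (time used, tokens issued)

    def authorize(self):
        # Stands in for the user granting access; returns the first token pair.
        return self._issue(secrets.token_hex(8))

    def _issue(self, grant_id):
        tokens = {
            "access_token": secrets.token_urlsafe(32),
            "refresh_token": secrets.token_urlsafe(32),
        }
        self._grant_of[tokens["refresh_token"]] = grant_id
        self._current[grant_id] = tokens["refresh_token"]
        return tokens

    def refresh(self, refresh_token):
        grant_id = self._grant_of.get(refresh_token)
        if grant_id is None or grant_id not in self._current:
            raise GrantRevoked("unknown token or revoked grant")
        if self._current[grant_id] == refresh_token:
            # First use: rotate, and remember what was handed out.
            tokens = self._issue(grant_id)
            self._rotated[refresh_token] = (self._clock(), tokens)
            return tokens
        used_at, tokens = self._rotated[refresh_token]
        if self._clock() - used_at <= GRACE_PERIOD_SECONDS:
            # Repeat inside the window: return the same tokens again.
            return tokens
        # Repeat outside the window: treat as theft and revoke the grant.
        del self._current[grant_id]
        raise GrantRevoked("refresh token reuse detected")
```

Returning the stored response, rather than minting a second pair, is what keeps the request idempotent: both concurrent callers end up holding the same new refresh token, so neither invalidates the other.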

This does weaken the theft detection, but only slightly. A malicious actor would have to guess when the true client is going to refresh the token. And even if they guess successfully, as long as you give both of them (the malicious actor and the true client) the same new tokens, then the malicious actor will have to make another lucky guess the next time the client refreshes its tokens.

(Note: due to the tighter security requirements of public clients, you might decide to shorten or disable the grace period for those clients.)

Although this idea isn’t formalized in the spec yet, several authorization servers support some kind of grace period on refresh tokens, including Auth0, Okta, Fitbit, Slack, and Lucid. I’ve also found it in Fauna’s auth “blueprints” code and the Django OAuth toolkit library.

Conclusion

Security is a constant balancing act between risk and usability. Industry standards like OAuth 2.0 are constantly evolving as risks emerge, usability requirements change, and the industry learns as a whole. The best solutions are ones that give benefits on one side of the balance with no harm to the other. If you’re careful about it, implementing theft detection for refresh tokens can be that kind of idea - it can increase the security of your OAuth 2.0 system, and it can be completely transparent to the clients consuming your APIs, no matter how they’re architected.