Why Does OAuth v2 Have Both Access and Refresh Tokens?

22

836

Section 4.2 of the draft OAuth 2.0 protocol indicates that an authorization server can return both an access_token (which is used to authenticate oneself with a resource) as well as a refresh_token, which is used purely to create a new access_token:

https://www.rfc-editor.org/rfc/rfc6749#section-4.2

Why have both? Why not just make the access_token last as long as the refresh_token and not have a refresh_token?

Rubie answered 15/8, 2010 at 15:25 Comment(0)
S
551

The idea of refresh tokens is that if an access token is compromised, because it is short-lived, the attacker has a limited window in which to abuse it.

Refresh tokens, if compromised, are useless because the attacker requires the client id and secret in addition to the refresh token in order to gain an access token.

Having said that, because every call to both the authorization server and the resource server is done over SSL - including the original client id and secret when they request the access/refresh tokens - I am unsure as to how the access token is any more "compromisable" than the long-lived refresh token and clientid/secret combination.

This of course is different to implementations where you don't control both the authorization and resource servers.

Here is a good thread talking about uses of refresh tokens: OAuth Archives.

A quote from the above, talking about the security purposes of the refresh token:

Refresh tokens... mitigates the risk of a long-lived access_token leaking (query param in a log file on an insecure resource server, beta or poorly coded resource server app, JS SDK client on a non https site that puts the access_token in a cookie, etc)

Stickup answered 26/8, 2011 at 18:52 Comment(14)
Catchdave is right, but I thought I would add that things have evolved since his initial reply. The use of SSL is now optional (this was probably still being debated when catchdave answered). For example, MAC tokens (currently under development) provide the ability to sign the request with a private key so that SSL is not required. Refresh tokens thus become very important, since you want to have short-lived MAC tokens.Glisson
"Refresh tokens, if compromised, are useless because the attacker requires the client id and secret in addition to the refresh token in order to gain an access token." But the client id and secret are also stored on the device, aren't they? So an attacker with access to the device can get them. Then why? Here, github.com/auth0/lock/wiki/Using-a-Refresh-Token , it is written that losing a refresh token means the attacker can request as many auth tokens as he wants. Maybe not in Google's scenario, but what if I am implementing my own OAuth2 server?Sabin
"The attacker requires the client id and secret in addition to the refresh token in order to gain an access token": then what's the difference between using a refresh token and simply signing in again?Bagasse
A refresh token can be used by a third party that can renew the access token without any knowledge of the user's credentials.Octastyle
@MarekDec I thought the client id+secret and the user's credentials were the same thing? Do you need the id+secret/user credentials to get a new access token or not?Fleeman
@KevinWheeler No, the client ID and secret are credentials for the OAuth client, not the user. When talking about OAuth the "client" is usually a server (for example the stackoverflow web server) which interfaces with an authorization or resource API server (for example the facebook auth provider). The user's credentials are only passed between the user and the OAuth API server, and never known to the client. The client secret is only passed from the client to the OAuth API server, and is never known to the user.Candlemas
@machineyearning If my understanding is right, isn't knowing the shared secret all that's needed for the auth server to sign and give out a new, valid access token? If the client is already sending its shared secret along with the refresh token, then what role does the refresh token play in the process? Is there any data inside a refresh token that becomes beneficial in the refresh step, other than, I guess, the auth server knowing whether it's an expired token or not?Burkes
@Burkes You need to pass 2 challenges to be issued an access token. First you need to prove you're a registered client, which always happens by specifying your client ID and secret. Second you need to prove you have permission to access the user's resource on his behalf. This second part happens in 2 ways in the auth code flow. At first you get the user to log in directly to the auth provider, and you get back a code to your redirect URI. Subsequently you use the refresh token instead of the auth code, so the user doesn't have to login again once the original access token expires.Candlemas
@machineyearning I kind of get the initial auth part that involves logins/credentials to retrieve access and refresh tokens. So the refresh token just doubles as your credentials, without actually requiring your credentials again but client id/secrets are still required? Does the refresh step usually involve dealing with state? Like checking if the refresh token is still valid/not revoked if you have such an option implemented?Burkes
@Burkes yeah you can kinda think of it as a reference to the previous user access grant, which would imply not only the user's credentials have been entered but also specifically which resources they granted you access to. I've seen the refresh token be issued along with some metadata before, such as TTL, if that's what you're wondering. But usually the auth server MUST give you a relevant error response when you try to use an expired, revoked, or otherwise invalid refresh token. Look at the oauth2 spec section 4.1 "authorization code grant". It's pretty readable as far as specs goCandlemas
If the access token was compromised, we can most likely assume the refresh token and the rest of the credentials were too, this whole overly unnecessarily complex authentication flow is absolutely pointless.Cyrstalcyrus
I have a question: do we expire the old refresh token on every login by the user? Let's suppose I have an expired auth token, so I will use the refresh token to get a new auth token. But if the user has logged in again and the refresh token that I have is expired, won't this whole concept of refresh tokens fail? Should I not mark the refresh token as expired on every login by the user?Ungrudging
Can a Public Client refresh token? Since it has no secret, what does it include in refresh request for authentication?Kohl
"I am unsure as to how the access token is any more "compromisable" than the long-lived refresh token and clientid/secret combination" - I can imagine this difference being real in cases where the "resource server" (which never sees refresh tokens) is less secure than the authorisation server. For instance, one can imagine the "resource server" at a tech giant really being many different resource servers implemented with different tech stacks, any one of which could be leaking access tokens to a weakly-defended log server or something. The authorisation server is likely easier to harden.Auliffe
L
673

The link to the discussion provided by Catchdave contains another valid point (original, dead link) made by Dick Hardt, which I believe is worth mentioning here in addition to what's been written above:

My recollection of refresh tokens was for security and revocation. <...>

revocation: if the access token is self contained, authorization can be revoked by not issuing new access tokens. A resource does not need to query the authorization server to see if the access token is valid. This simplifies access token validation and makes it easier to scale and support multiple authorization servers. There is a window of time when an access token is valid, but authorization is revoked.

Indeed, in the situation where the Resource Server and the Authorization Server are the same entity, and where the connection between the user and either of them is (usually) equally secure, there is not much sense in keeping the refresh token separate from the access token.

Although, as mentioned in the quote, another role of refresh tokens is to ensure the access token can be revoked at any time by the User (via the web-interface in their profiles, for example) while keeping the system scalable at the same time.

Generally, tokens can either be random identifiers pointing to a specific record in the Server's database, or they can contain all the information in themselves (certainly, this information has to be signed, with a MAC, for example).

How the system with long-lived access tokens should work

The server allows the Client to get access to the User's data within a pre-defined set of scopes by issuing a token. As we want to keep the token revocable, we must store the token in the database along with a "revoked" flag that can be set or unset (otherwise, how would you do that with a self-contained token?). The database can contain as many as len(users) x len(registered clients) x len(scope combinations) records. Every API request must then hit the database. Although it's quite trivial to make queries to such a database run in O(1), the single point of failure itself can have a negative impact on the scalability and performance of the system.

How the system with long-lived refresh token and short-lived access token should work

Here we issue two keys: a random refresh token with a corresponding record in the database, and a signed, self-contained access token that contains, among other fields, the expiration timestamp.

As the access token is self-contained, we don't have to hit the database at all to check its validity. All we have to do is to decode the token and to validate the signature and the timestamp.
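The decode-and-validate step described above can be sketched roughly as follows. This is only an illustration using Python's standard library, not the format any particular provider uses (real systems typically use JWTs). The key `SECRET_KEY`, the token layout, and the function names `issue_access_token` and `verify_access_token` are all assumptions made up for this example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET_KEY = b"demo-signing-key"  # hypothetical key, for illustration only


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_access_token(user_id: str, scopes: list, ttl_seconds: int,
                       now: Optional[float] = None) -> str:
    """Build a self-contained token of the form '<payload>.<signature>'."""
    now = time.time() if now is None else now
    payload = json.dumps(
        {"sub": user_id, "scopes": scopes, "exp": now + ttl_seconds}
    ).encode("utf-8")
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(signature)


def verify_access_token(token: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if signature and timestamp are valid, else None.

    No database lookup happens here: everything needed is in the token.
    """
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # tampered with, or signed with another key
    claims = json.loads(payload)
    if claims["exp"] <= now:
        return None  # expired
    return claims
```

Note that the resource server only needs the signing key (or, with asymmetric signatures, the public key) to run `verify_access_token`; it never talks to the token database.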

Nonetheless, we still have to keep the database of refresh tokens, but the number of requests to this database is generally defined by the lifespan of the access token (the longer the lifespan, the lower the access rate).

In order to revoke the access of Client from a particular User, we should mark the corresponding refresh token as "revoked" (or remove it completely) and stop issuing new access tokens. It's obvious though that there is a window during which the refresh token has been revoked, but its access token may still be valid.
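To make the revocation "window" concrete, here is a small sketch under simplifying assumptions: the refresh-token table is an in-memory dict, and an access token is represented only by its expiry time. The names `RefreshTokenStore`, `refresh`, and `revocation_window` are hypothetical and exist only for this example.

```python
import secrets


class RefreshTokenStore:
    """In-memory stand-in for the refresh-token table in the database."""

    def __init__(self):
        self._grants = {}   # refresh token -> grant record
        self.lookups = 0    # how often the "database" was hit

    def issue(self, user_id, client_id):
        refresh_token = secrets.token_urlsafe(32)
        self._grants[refresh_token] = {
            "user_id": user_id,
            "client_id": client_id,
            "revoked": False,
        }
        return refresh_token

    def revoke(self, refresh_token):
        if refresh_token in self._grants:
            self._grants[refresh_token]["revoked"] = True

    def redeem(self, refresh_token):
        """Return the grant if it exists and is not revoked, else None."""
        self.lookups += 1
        grant = self._grants.get(refresh_token)
        if grant is None or grant["revoked"]:
            return None
        return grant


def refresh(store, refresh_token, now, ttl_seconds):
    """Return the expiry time of a fresh access token, or None if refused."""
    if store.redeem(refresh_token) is None:
        return None
    return now + ttl_seconds


def revocation_window(access_token_expiry, revoked_at):
    """Seconds during which an already issued access token still works."""
    return max(0, access_token_expiry - revoked_at)
```

For example, with a 300-second access token issued at t=0 and a revocation at t=60, the refresh at any later time is refused, but the access token issued earlier remains usable for another 240 seconds. The store is only consulted on refresh, not on every API call.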

Tradeoffs

Refresh tokens partially eliminate the SPoF (Single Point of Failure) of Access Token database, yet they have some obvious drawbacks.

  1. The "window". A timeframe between events "user revokes the access" and "access is guaranteed to be revoked".

  2. The complication of the Client logic.

    without refresh token

    • send API request with access token
    • if access token is invalid, fail and ask user to re-authenticate

    with refresh token

    • send API request with access token
    • if access token is invalid, try to update it using refresh token
    • if refresh request passes, update the access token and re-send the initial API request
    • if refresh request fails, ask user to re-authenticate
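The "with refresh token" client logic listed above might look something like this. It is a sketch only: `send_request` and `refresh_access_token` are hypothetical callables supplied by the caller, and treating HTTP status 401 as "access token is invalid" is an assumption of this example.

```python
class AuthRequired(Exception):
    """Raised when the user has to re-authenticate."""


def call_api(send_request, refresh_access_token, tokens):
    """Send an API request, refreshing the access token once if needed.

    send_request(access_token) -> (status_code, body)
    refresh_access_token(refresh_token) -> new access token, or None on failure
    tokens is a dict with "access_token" and "refresh_token" keys; it is
    updated in place when the refresh succeeds.
    """
    status, body = send_request(tokens["access_token"])
    if status != 401:
        return body

    # The access token was rejected: try to update it using the refresh token.
    new_access_token = refresh_access_token(tokens["refresh_token"])
    if new_access_token is None:
        raise AuthRequired("refresh failed; ask the user to sign in again")

    # Refresh passed: store the new token and re-send the initial request.
    tokens["access_token"] = new_access_token
    status, body = send_request(new_access_token)
    if status == 401:
        raise AuthRequired("new access token rejected; ask the user to sign in again")
    return body
```

Without a refresh token, the function would collapse to the first three lines plus raising `AuthRequired` on a 401, which is the extra client complexity this tradeoff refers to.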

I hope this answer makes sense and helps somebody make a more thoughtful decision. I'd also like to note that some well-known OAuth2 providers, including GitHub and Foursquare, adopt the protocol without refresh tokens and seem happy with that.

Labannah answered 14/10, 2012 at 19:38 Comment(5)
@RomannImankulov If I understand it correctly, we can save refresh tokens in the db and delete them any time we want to revoke access, so why not save the access tokens themselves?Afra
@Afra the short version of my post is, if you save the access token in the database, you hit the database on every request to your API (which may or may not be a problem in your particular case). If you save refresh tokens and keep access tokens "self-contained", you hit the database only when the client decides to refresh the access token.Labannah
Personally I don't like this approach of not hitting the database to gain performance if it is going to compromise security (even if only for the timespan of the window). One should be able to revoke an access_token immediately if necessary as almost always we are dealing with sensitive user information (otherwise we would likely not be using OAuth in the first place). I wonder which approach bigger companies like Facebook and Google use.Antitoxin
"Nonetheless, we still have to keep the database of refresh tokens" -> No? We can just keep a database of access tokens, but only hit it once we receive an access token that is expired! Right? Or I'm missing something.Spindrift
@Spindrift What would the purpose of storing expired access tokens be? The reason we store refresh tokens in a database is for a blacklist - to invalidate future access tokens from being created using a specific refresh token. Storing refresh tokens is specifically for invalidation, not validation. If you're implying that access tokens can make other access tokens - they can't, only refresh tokens are able to generate new access tokens without requiring the user to manually reauthenticate.Almagest
M
277

Despite all the great answers above, as a security master's student and programmer who previously worked at eBay, where I looked into buyer protection and fraud, I can say that separating the access token from the refresh token strikes the best balance between harassing the user with frequent username/password input and keeping the authority to revoke access in case of potential abuse of your service.

Think of a scenario like this. You issue the user an access token valid for 3600 seconds and a refresh token that lives much longer, say one day.

  1. The user is a good user. He is at home, going on and off your website, shopping and searching on his iPhone. His IP address doesn't change, and he puts a very low load on your server: about 3-5 page requests every minute. When the 3600 seconds on his access token are over, he requests a new one with the refresh token. We, on the server side, check his activity history and IP address and conclude that he is a human and is behaving himself. We grant him a new access token to continue using our service. The user won't need to enter the username/password again until he has reached the one-day lifespan of the refresh token itself.

  2. The user is a careless user. He lives in New York, USA, his antivirus program was shut down, and he was hacked by a hacker in Poland. Once the hacker has the access token and refresh token, he tries to impersonate the user and use our service. But after the short-lived access token expires, when the hacker tries to refresh the access token, we, on the server, notice a dramatic IP change in the user's behavior history (hey, this guy logs in from the USA and now refreshes access from Poland after just 3600s ???). We terminate the refresh process, invalidate the refresh token itself, and prompt for the username/password again.

  3. The user is a malicious user. He intends to abuse our service by calling our API 1000 times each minute using a robot. He may well do so until, 3600 seconds later, he tries to refresh the access token; then we notice his behavior and conclude he might not be a human. We reject and terminate the refresh process and ask him to enter the username/password again. This might break his robot's automatic flow, or at least make him uncomfortable.

You can see that the refresh token works well when we try to balance our workload, the user experience, and the potential risk of a stolen token. Your watchdog on the server side can check more than IP changes, for example the frequency of API calls, to determine whether the user is a good user or not.
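A server-side watchdog for the three scenarios above could be sketched like this. The fields of `RefreshAttempt`, the threshold `MAX_REQUESTS_PER_HOUR`, and the use of a country code instead of a raw IP address are all assumptions chosen for illustration; a real policy would be considerably more nuanced.

```python
from dataclasses import dataclass

MAX_REQUESTS_PER_HOUR = 600  # hypothetical threshold, tune for your service


@dataclass
class RefreshAttempt:
    country: str              # where the refresh request comes from
    last_seen_country: str    # where the user was when the grant was issued
    requests_last_hour: int   # API calls made with the expiring access token


def should_refresh(attempt: RefreshAttempt) -> bool:
    """Decide at refresh time whether to hand out a new access token."""
    if attempt.country != attempt.last_seen_country:
        return False  # scenario 2: sudden location change, likely stolen token
    if attempt.requests_last_hour > MAX_REQUESTS_PER_HOUR:
        return False  # scenario 3: request rate no human would produce
    return True       # scenario 1: nothing suspicious, keep the session going
```

The check runs once per access-token lifetime rather than on every API call, which is the cost argument made in this answer.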

In other words, you could also try to limit the damage of a stolen token or abuse of the service by running a basic IP watchdog or other measures on each API call. But this is expensive, as you have to read and write records about the user on every call, which will slow down your server response.

Megasporophyll answered 18/6, 2016 at 19:58 Comment(8)
These are some great policies and ideas, but I don't see anything in your answer that inherently requires the use of refresh tokens. All of these features can be implemented with just the access token.Leopoldoleor
@Evert, one of the benefits of using both access and refresh tokens is that access tokens can be short-lived and therefore it is not too much of a security compromise to trust them unconditionally without checking with the server that originally issued them. This can allow you to scale your infrastructure so that non-critical parts of it can trust the information stored in the (signed) token without direct access to the user's account information.Consignor
@Avi Cherry - Yes an access token can be short-lived, and it can also be refreshed if the user is still considered valid. Doesn't require a refresh token to do that.Galumph
I believe this answer assumes that we never want resource servers to do advanced access control themselves (e.g. check IP activity against various databases, etc), and instead that they can only rely on verifying the access token in complete isolation. While this might be obvious at scale (for performance reasons) it clearly isn't obvious for everybody here given the confusion in other posts and comments. It's a good post with nice information but I feel it misses the point of the original question greatly. I recommend at least making the aforementioned assumption explicit.Scottie
@RickJolly Regarding "access token can be short-lived, and it can also be refreshed if the user is still considered valid" - I think that what the answerer was getting at is that a customer may be on ebay browsing - lets say shoes - 3 or 4 times in a 24 hour period. The login must stay 'valid' for the whole day or the user will get highly frustrated having to keep logging in. If they're on a mobile device then there's no 'refreshing' anything if the user isn't active in that browser tab. But his access token which is more expensive to obtain only needs refreshing 3 or 4 times.Rhondarhondda
@Leopoldoleor If the scenarios are implemented with just the access token then you will need to prolong the access token to 1day. In that interval you will not be able to do additional checks with authorization server. If you do those checks on every request this means that you actually use a refresh token. If you do that you will load the authorization server (think on 1000 calls/s and on each ask the auth server if the refresh token is ok and multiply that with number of users).Tracery
@Tracery yes, my comment was not against the use of refresh token. That said, the scaling a simple primary-key based lookup is not a concern for the vast majority of people and probably shouldn't be optimized.Leopoldoleor
This answer is 7 years old, but I still feel I have to point out the flaws here, because the general assumptions were already wrong 7 years ago, and people might still think they are true today. Usage of refresh tokens does not help at all in a scenario where the access token gets compromised through a "hacked" user. If the hacker has access to the access token, he most likely has access to the refresh token as well, or even to the keyboard input, and thus might record the actual user credentials. Additionally, deeming someone careless for not using an antivirus leaves a bad impression as well.Colewort
R
90

None of these answers gets to the core reason refresh tokens exist. Obviously, you can always get a new access-token/refresh-token pair by sending your client credentials to the auth server - that's how you get them in the first place.

So the sole purpose of the refresh token is to limit the use of the client credentials being sent over the wire to the auth service. The shorter the TTL of the access-token, the more often the client credentials will have to be used to obtain a new access-token, and therefore the more opportunities attackers have to compromise the client credentials (although this may be super difficult anyway if asymmetric encryption is being used to send them). So if you have a single-use refresh-token, you can make the TTL of access-tokens arbitrarily small without compromising the client credentials.
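The "single-use refresh-token" idea mentioned above is often called rotation: every successful refresh consumes the old refresh token and hands back a new one. A minimal in-memory sketch, with the class name `RotatingRefreshTokens` and its methods invented for this example:

```python
import secrets


class RotatingRefreshTokens:
    """Each refresh token can be redeemed exactly once."""

    def __init__(self):
        self._active = {}  # refresh token -> user id

    def issue(self, user_id):
        refresh_token = secrets.token_urlsafe(32)
        self._active[refresh_token] = user_id
        return refresh_token

    def redeem(self, refresh_token):
        """Return (user_id, replacement_token), or None if unknown or already used."""
        user_id = self._active.pop(refresh_token, None)
        if user_id is None:
            return None
        return user_id, self.issue(user_id)
```

A replayed refresh token is simply refused, because redeeming it the first time removed it from the store.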

Revolt answered 2/8, 2012 at 19:11 Comment(11)
This is interesting as in Google's case when you ask for a refresh token, you also send over the client id and client secret. So you're compromising every hour anyway.Divertimento
"sole purpose" - doesn't wash. Making the TTL of the access-token as long as that of the imagined refresh-token will achieve just the same.Lepus
@Lepus You're right that what you suggest limits the use of the client credentials; however, it also has the problem that a longer TTL on the access-token makes it more likely that the token can be stolen and then used as credentials. You don't have this problem with a long-lived refresh-token and a short-lived access-token.Revolt
Refresh tokens also improve scalability and partially mitigate against a single point of failure, so "sole purpose" is not correct.Wandawander
If you have a mobile application (iOS/Android), wouldn't it be pointless to have a refresh_token? I mean, both access_token and refresh_token would have to be stored in that app and then used. If one was compromised, the other would also be. So aren't refresh_tokens kind of pointless then in mobile?Bluish
If you are able to steal the users' credentials in transit, you're also able to steal the refresh token + client secret, which is arguably even worse. Getting hold of a refresh token by e.g. gaining access to the server DB on the other hand means nothing if you don't have the client secret. So that's why the refresh tokens are a secure way of allowing refreshes. The alternative is storing user passwords in that DB, which is obviously a big nono.Germinative
Since the standard requires that the client credentials be sent along with the refresh token, the premise of this answer is simply false. "Because refresh tokens are typically long-lasting credentials used to request additional access tokens... the client MUST authenticate with the authorization server." Also see the comment by @Rots.Mcconnell
A) I think you are mixing up client secrets and user secrets. The client secret is never sent from the user device, only from the accessing backend application to the data-providing backend application. B) An OAuth server that allows the password grant for a Public Client (a client that cannot keep a client secret, such as a native or JavaScript app) will also provide a refresh-token grant for that public client, so you do not need to send a client secret when refreshing your token. C) The refresh token provides the backend with a "heartbeat" for when to check the validity of the user!Kamal
this answer is wrong for the reason that Andreas Lundgren statesBrody
The original question can be put in some context then. For example, why use refresh token to get a new access token, when you use client credentials and the client can be used as a service account? So not involving any human interaction, to me it seems it makes the refresh token pointless.Sillabub
@Sillabub because then you would be sending client credentials with every request, reintroducing some issues that oAuth was designed to address (e.g. compromised password)Hopehopeful
G
86

To clear up some confusion you have to understand the roles of the client secret and the user password, which are very different.

The client is an app/website/program/..., backed by a server, that wants to authenticate a user by using a third-party authentication service. The client secret is a (random) string that is known to both this client and the authentication server. Using this secret the client can identify itself with the authentication server, receiving authorization to request access tokens.

To get the initial access token and refresh token (for example with the password grant, where the client handles the user's credentials directly), what is required is:

  • The user ID
  • The user password
  • The client ID
  • The client secret

To get a refreshed access token however the client uses the following information:

  • The client ID
  • The client secret
  • The refresh token

This clearly shows the difference: when refreshing, the client receives authorization to refresh access tokens by using its client secret, and can thus re-authenticate the user using the refresh token instead of the user ID + password. This effectively prevents the user from having to re-enter his/her password.
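As a rough sketch of what the refresh request looks like on the wire, the following builds (but does not send) a token request in the shape described by RFC 6749, section 6, with the client authenticating via HTTP Basic. The function name `build_refresh_request` and the endpoint URL in the usage note are assumptions for this example, and the client id and secret are assumed to contain only plain ASCII characters that need no extra encoding.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_refresh_request(token_url, client_id, client_secret, refresh_token):
    """Build a POST request that exchanges a refresh token for a new access token."""
    # The refresh token goes in the form-encoded body...
    body = urlencode(
        {"grant_type": "refresh_token", "refresh_token": refresh_token}
    ).encode("ascii")
    # ...while the client id and secret authenticate the client itself.
    credentials = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

Calling `build_refresh_request("https://auth.example.com/token", "my-client", "my-secret", "some-refresh-token")` yields a request that carries all three pieces of information from the list above and no user password.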

This also shows that losing a refresh token is no problem because the client ID and secret are not known. It also shows that keeping the client ID and client secret secret is vital.

Germinative answered 4/3, 2016 at 9:36 Comment(3)
I don't understand the last sentence. Client id and client secret can also be lost perhaps not so easily due to TLS. Most usually they are lost by some careless public applications like mobile apps and websites exposing these to the world. With this in mind the only thing refresh token does is to reduce the exposure of username:password chain in network communication between other potentially careless services.Ancheta
a website is unlikely to lose these because they're handled solely in the backend, not in the frontend. Mobile apps are a different story indeed.Germinative
It is uncommon yet I had several of these webapp frontends to fix. Maybe because password grant type (single-step) is easier to implement than the authorization code (2-step).Ancheta
P
60

This answer has been put together with the help of two senior devs (John Brayton and David Jennes).

The main reason to use a refresh token is to reduce the attack surface.

Let's assume there is no refresh key and go through this example:

A building has 80 doors. All doors are opened with the same key. The key changes every 30 minutes. At the end of the 30 minutes I have to give the old key to the keymaker and get a new key.

If I’m the hacker and get your key, then at the end of the 30 minutes, I’ll courier that to the keymaker and get a new key. I’ll be able to continuously open all doors regardless of the key changing.

Question: During the 30 minutes, how many hacking opportunities did I have against the key? I had 80 hacking opportunities, each time you used the key (think of this as making a network request and passing the access token to identify yourself). So that’s 80X attack surface.

Now let’s go through the same example but this time let’s assume there’s a refresh key.

A building has 80 doors. All doors are opened with the same key. The key changes every 30 minutes. To get a new key, I can’t pass the old key (access token). I must only pass what's considered a refresh key (refresh token).

If I’m the hacker and get your key, I can use it for 30 minutes, but at the end of the 30 minutes sending it to the keymaker has no value. If I did, then the keymaker would just say "This token is expired. You need to send me a refresh token instead." To be able to extend my hack I would have to hack the courier to the keymaker. The courier has a distinct key (think of this as a refresh token).

Question: During the 30 minutes, how many hacking opportunities did I have against the refresh key? 80? No. I only had 1 hacking opportunity: the time during which the courier communicates with the keymaker. So that’s 1X attack surface. I did have 80 hacking opportunities against the key, but they are no good after 30 minutes.


A server would verify an access token based on credentials and signing of (typically) a JWT.

An access token leaking is bad, but once it expires it is no longer useful to an attacker. A refresh token leaking is far worse, but presumably it is less likely. (I think there is room to question whether the likelihood of a refresh token leaking is much lower than that of an access token leaking, but that’s the idea.)

The point is that the access token is added to every request you make, whereas a refresh token is only used during the refresh flow, so there is less chance of a MITM seeing the token.

Frequency helps an attacker by making a leak slightly more likely, for example through:

  • Heartbleed-like potential security flaws in SSL
  • potential security flaws in the client
  • potential security flaws in the server

In addition, if the authorization server is separate from the application server processing other client requests then that application server will never see refresh tokens. It will only see access tokens that will not live for much longer.

Compartmentalization is good for security.

Last but not least, refresh tokens can be rotated, meaning 'a new refresh token is returned each time the client makes a request to exchange a refresh token for a new access token.' As refresh tokens are continually exchanged and invalidated, the threat is reduced. To give you an example: tokens usually expire after a TTL, typically an hour.

Refresh tokens are often, though not always, revoked upon use and a new one is issued. This means that if a network failure occurs while you are retrieving the new refresh token, then the next time you send the old refresh token it is considered revoked and you have to sign in again.

For more on rotation see here and here

Summary

  • Reducing Frequency
  • Compartmentalization
  • Rotation (quicker invalidation) and more granular management (expiration time or number of requests made) of tokens.

All of these help to mitigate threats.

For another take on this see this awesome answer


What are refresh tokens NOT about?

The ability to update/revoke access level through refresh tokens is a byproduct of choosing to use refresh tokens; otherwise, a standalone access token could be revoked or have its access level modified when it expires and the user gets a new token.

Pergrim answered 15/8, 2019 at 0:19 Comment(11)
Also refresh tokens can be invalidated in which case the person needs to identify themself to the courier before getting a new refresh key. And to keep this refresh key even more secure, you can implement so called "refresh token rotation" where each time the access token is asked, also a new refresh key is given. If you or the hacker goes to the courier with old refresh key the courier invalidates also the latest new refresh key and no one gets new access tokens anymore.Spondee
@JaniSiivola great note on the rotation. I just mentioned that in the answerPergrim
Why is it harder for an attacker to get the refresh token than the access token? Using HTTPS gives protection in transit, but I need to store both of them in the browser in the case of a SPA. So the attacker can steal both of them. Am I missing something?Scorn
@Scorn from that context I don't think they differ. But from the context of the transit layer between the browser, router, ISP, VPN etc, access token can be passed 1000 times an hour while refresh token gets passed only once.Pergrim
In your example, you use your old token (old key) to get your new token (new key). That's not how you would get an access token without the refresh token mechanism: you'd have to send your password again each time you need a new access token.Trichina
Having both Access and Refresh tokens means the attacker has more chance to access the user's account by being able to guess one of them. If that's not the case, why an attacker wouldn't be able to hack your refresh token if he/she is able to hack your access token from client side.Edgebone
@Edgebone good question. I'm not 100% sure. Here are some of my thoughts: On mobile clients, often the access-token and refresh-token are stored in disk. And later pulled into memory. Hacking the memory is different from hacking the disk. I'm honestly not sure which one is easier. Beyond that, it's not necessarily that the client is compromised. The network layer can be compromised as well. Obviously lesser frequency on sending the refresh-token exposes you less — in the long run.Pergrim
Both access and refresh tokens can be accessed by an attacker. The key is - with long-lived access tokens, both victim and attacker continue to operate. However, with short-lived access tokens and refresh token rotation, the second a refresh token is used twice, the refresh token ceases to operate and both parties lose access.Unimpeachable
@Unimpeachable correct. That's the third point I made.Pergrim
@Edgebone it depends on how the client manages them. Like you could store your refresh token in the iOS/macOS Keychain while you store the access token in a less secure location. You might obfuscate how you store the refresh token. But assuming that you stored them in identical ways, then frequency is another factor.Pergrim
@Edgebone Also there are server side attacks. The access token can be passed to a resource server (e.g. getting the latest images), while the refresh token will never (or at least shouldn't ever) get passed to the resource server. The refresh token should only get passed to the authorization server. Now imagine if these two servers had different levels of security or were written by two different companies. Because of that the threat model is different between the two. (I tried mentioning this in the answer, but I went into more detail here)Pergrim
Q
44

This answer is from Justin Richer via the OAuth 2 standard body email list. This is posted with his permission.


The lifetime of a refresh token is up to the (AS) authorization server — they can expire, be revoked, etc. The difference between a refresh token and an access token is the audience: the refresh token only goes back to the authorization server, the access token goes to the (RS) resource server.

Also, just getting an access token doesn’t mean the user’s logged in. In fact, the user might not even be there anymore, which is actually the intended use case of the refresh token. Refreshing the access token will give you access to an API on the user’s behalf, it will not tell you if the user’s there.

OpenID Connect doesn’t just give you user information from an access token, it also gives you an ID token. This is a separate piece of data that’s directed at the client itself, not the AS or the RS. In OIDC, you should only consider someone actually “logged in” by the protocol if you can get a fresh ID token. Refreshing it is not likely to be enough.

For more information please read http://oauth.net/articles/authentication/

Quintile answered 30/8, 2015 at 23:4 Comment(1)
This seems to be about OpenID Connect and authentication, so I don't see how this answers the question, which is about the motivation for having token refresh.Banky
G
20

Clients can be compromised in many ways. For example a cell phone can be cloned. Having an access token expire means that the client is forced to re-authenticate to the authorization server. During the re-authentication, the authorization server can check other characteristics (IOW perform adaptive access management).

Refresh tokens allow for a client only re-authentication, whereas re-authorization forces a dialog with the user, which many have indicated they would rather not do.

Refresh tokens fit in essentially in the same place where normal web sites might choose to periodically re-authenticate users after an hour or so (e.g. banking site). It isn't highly used at present since most social web sites don't re-authenticate web users, so why would they re-authenticate a client?

Gorge answered 18/8, 2012 at 18:40 Comment(1)
"Refresh tokens allow for a client only re-authentication..." is an important aspect here.Costanzia
S
16

To further simplify B T's answer: Use refresh tokens when you don't typically want the user to have to type in credentials again, but still want the power to be able to revoke the permissions (by revoking the refresh token)

You cannot revoke an access token, only a refresh token.

Seriatim answered 27/4, 2015 at 18:2 Comment(3)
You can revoke an access token, which will require either logging in again for another access token or using the refresh token to obtain another access token. If the refresh token was invalid, the user will have to re-authenticate to get a new access token along with a new refresh token.Fornix
I disagree. An access token is issued by the auth server, signed with an expiry date, and sent to the client. When the client sends that token to the resource server, the resource server does not contact the auth server to verify the token; it just looks at the expiry date in the (signed and un-tampered) token. So no matter what you do at the auth server to try to 'revoke', the resource server doesn't care. Some people refer to the client logout as a revoke (ie client deletes its token) but imho this is misleading terminology - we want to 'revoke' a token at the server, not the clientSeriatim
Not saying that you couldn't write custom code to ignore certain tokens (like here #22708546) but doing that probably involves some network trips from the resource server to the oauth server/db each time the client makes a call. You avoid those calls by using refresh tokens instead, and I think is more in line with what the oauth authors intended.Seriatim
S
14

Why not just make the access_token last as long as the refresh_token and not have a refresh_token?

In addition to great answers other people have provided, there is another reason why we would use refresh tokens and it's to do with claims.

Each token contains claims which can include anything from the user's name, their roles, or the provider which created the claim. As a token is refreshed, these claims are updated.

If we refresh the tokens more often, we are obviously putting more strain on our identity services; however, we are getting more accurate and up-to-date claims.

Superpatriot answered 19/1, 2016 at 15:36 Comment(3)
It would be an unusual bad practice to put such "claims" in the access token. As described in the specification, the access token "is usually opaque to the client". Do you have examples of OAuth providers that do this?Mcconnell
@Superpatriot When user role is downgraded from ADMIN to REGULAR_USER expectation is that user role needs to be revoked immediately and not when access_token expires. So, it looks like hitting the database on each request is inevitable.Anthurium
@Anthurium I imagine that would be a case where the application downgrading an entity from ADMIN to REGULAR_USER would (in the same process) also need to revoke the appropriate token. i.e. if we know the claims are going to change, we don't wait for expiry, we revoke immediatelyKingery
L
6

Assume you make the access_token last a very long time and don't have a refresh_token. If a hacker gets hold of this access_token one day, he can access all protected resources!

But if you have a refresh_token, the access_token's lifetime is short, so a stolen access_token is of limited use to the hacker because it becomes invalid after a short period of time. A new access_token can only be retrieved by using not only refresh_token but also client_id and client_secret, which the hacker doesn't have.

Leanora answered 3/10, 2018 at 4:59 Comment(1)
"by using not only refresh_token but also by client_id and client_secret, which hacker doesn't have." 1. assume it's only access token, then doesn't hacker still need client_id and client_secret? 2. if a hacker is a good hacker then he can hack the client_id and client_secret as well. Regardless that part, hacking additional things shouldn't matter to the comparison, because if it's difficult to hack then it's also difficult to hack for the case of only using access token...long story short, you're not comparing identical situations. You're mixing themPergrim
R
4

The refresh token is retained by the authorization server, whereas access tokens are self-contained, so the resource server can verify them without storing them, which saves the effort of a lookup at validation time. Another point missing in the discussion is from rfc6749#page-55

"For example, the authorization server could employ refresh token rotation in which a new refresh token is issued with every access token refresh response. The previous refresh token is invalidated but retained by the authorization server. If a refresh token is compromised and subsequently used by both the attacker and the legitimate client, one of them will present an invalidated refresh token, which will inform the authorization server of the breach."

I think the whole point of using refresh tokens is that even if an attacker somehow manages to get the refresh token, client ID and secret combination, subsequent calls from the attacker to get a new access token can be tracked, provided every refresh request results in a new access token and a new refresh token.
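The rotation described in the RFC quote can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions: the class name `RotatingTokenStore`, the notion of a token "family", and the decision to revoke the whole family when an invalidated token is presented again are not taken from the RFC, which only says the reuse "will inform the authorization server of the breach".

```python
import secrets


class RotatingTokenStore:
    """In-memory stand-in for the authorization server's refresh-token table."""

    def __init__(self):
        self._tokens = {}               # token -> {"family": str, "used": bool}
        self._revoked_families = set()  # families in which reuse was detected

    def issue(self, family=None):
        """Create a refresh token, starting a new family unless one is given."""
        token = secrets.token_urlsafe(16)
        family = family if family is not None else secrets.token_urlsafe(8)
        self._tokens[token] = {"family": family, "used": False}
        return token

    def rotate(self, token):
        """Exchange a refresh token for a fresh one, or raise PermissionError."""
        record = self._tokens.get(token)
        if record is None or record["family"] in self._revoked_families:
            raise PermissionError("unknown or revoked refresh token")
        if record["used"]:
            # An invalidated token came back: either the attacker or the
            # legitimate client is replaying it, so shut the family down.
            self._revoked_families.add(record["family"])
            raise PermissionError("refresh token reuse detected")
        record["used"] = True  # invalidated but retained, as in the RFC quote
        return self.issue(record["family"])
```

With this sketch, if the legitimate client rotates a token and an attacker later presents the same (now invalidated) token, the second call fails and every token descended from the original login stops working, so both parties have to sign in again.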

Responsiveness answered 17/11, 2017 at 15:14 Comment(1)
I think this is a very important point :-) It also - to some degree - kind of invalidates the argument here auth0.com/docs/tokens/refresh-token/current#restrictions that A Single-Page Application (normally implementing Single-Page Login Flow) should not under any circumstances get a Refresh Token. The reason for that is the sensitivity of this piece of information. You can think of it as user credentials, since a Refresh Token allows a user to remain authenticated essentially forever. Therefore you cannot have this information in a browser, it must be stored securely.Rhondarhondda
T
4

It's all about scaling and keeping your resource server stateless.

  • Your server / resource server
    • Server is stateless, meaning it does not check any storage, so it can respond very fast. It does this by using a public key to verify the signature of the token.

    • Checks access_token on every single request.

    • By only checking the signature and expiration date of access_token, response is very fast and allows scaling.

    • access_token should have a short expiration time (a few minutes): since there is no way to revoke it, a short lifetime limits the damage if it gets leaked.

  • Authentication server / OAuth server
    • Server is not stateless, but that's OK because requests are much fewer.
    • Checks refresh_token only when access_token is expired. (every 2 minutes for example)
    • Request rate is much lower than resource server.
    • Stores the refresh token in a DB and can revoke it.
    • refresh_token can have a long expiration time (a few weeks/months), because if it gets leaked there is a way to revoke it.

There is an important note though: the authentication server receives far fewer requests and so can handle the load; however, there can be a storage issue, since it has to store all refresh_tokens, and if the number of users increases dramatically this could become a problem.
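Here is a minimal sketch of the stateless check on the resource server, using only the Python standard library. Note the swap: the answer describes verifying a signature with a public key, whereas this sketch uses an HMAC with a shared key as a stand-in, because the standard library has no asymmetric signatures. The token layout, the `SIGNING_KEY` value, and the function names are all hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time

SIGNING_KEY = b"demo-key-not-for-production"  # hypothetical shared key


def _b64(data):
    return base64.urlsafe_b64encode(data).decode()


def issue_access_token(user, ttl_seconds, now=None):
    """Authorization-server side: sign a payload that carries its own expiry."""
    now = time.time() if now is None else now
    payload = json.dumps({"sub": user, "exp": now + ttl_seconds}).encode()
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(signature)


def verify_access_token(token, now=None):
    """Resource-server side: return the claims, or None. No storage lookup."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # tampered or signed with a different key
    claims = json.loads(payload)
    return claims if claims["exp"] > now else None
```

The verification path only does a signature comparison and a timestamp comparison, which is why it scales, and also why a leaked token cannot be revoked before its `exp` passes.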

Tion answered 20/4, 2022 at 0:43 Comment(2)
But still access token can be used for both? On the resource server, it only suffices to reject user's access token when it's reached its expiry (evil user may not tamper this expiration time b/c resource server has public key and can check the signature and "honesty" of data in the access token), and once the access token is rejected, the user must give it to the auth server to extend the expiry time of the same access token if it is not revoked yet, or the user has not had any suspicious IP / location change, etc.Spindrift
@Spindrift yeah its used on both the resource and authentication server as you described very well.Tion
A
3

Let's consider a system where each user is linked to one or more roles and each role is linked to one or more access privileges. This information can be cached for better API performance. But then, there may be changes in the user and role configurations (for e.g. new access may be granted or current access may be revoked) and these should be reflected in the cache.

We can use access and refresh tokens for such a purpose. When an API is invoked with an access token, the resource server checks the cache for access rights. If there are any new access grants, they are not reflected immediately. Once the access token expires (say in 30 minutes) and the client uses the refresh token to generate a new access token, the cache can be updated with the updated user access right information from the DB.

In other words, we can move the expensive operations from every API call using access tokens to the event of access token generation using refresh token.

Asymmetric answered 14/1, 2017 at 13:8 Comment(0)
B
2

From what I understand, refresh tokens are there just for performance and cost savings if you need to be able to revoke access.

Eg 1: Do not implement refresh tokens; implement just long-lived access tokens: You need to be able to revoke access tokens if the user is abusing the service (eg: not paying the subscription) => You would need to check the validity of the access token on every API call that requires an access token and this will be slow because it needs a DB look-up (caching can help, but that's more complexity).

Eg 2: Implement refresh tokens and short-lived access tokens: You need to be able to revoke access tokens if the user is abusing the service (eg: not paying the subscription) => The short-lived access tokens will expire after a short while (eg. 1hr) and the user will need to get a new access token, so we don't need validation on every API call that requires an access token. You just need to validate the user when generating the access token from the refresh token. For a bad user, you can log out the user if an access token cannot be generated. When the user tries to log back in, the validation will run again and return an error.
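A rough Python sketch of Eg 2, where the user is validated only at refresh time. The two dictionaries stand in for database tables, and the names `ACTIVE_SUBSCRIPTIONS`, `REFRESH_TOKENS` and `refresh_access_token` are made up for this illustration.

```python
import secrets

# Hypothetical in-memory tables standing in for the database.
ACTIVE_SUBSCRIPTIONS = {"alice": True, "bob": False}
REFRESH_TOKENS = {"rt-alice": "alice", "rt-bob": "bob"}

ACCESS_TOKEN_TTL = 3600  # seconds, the "1hr" from the example above


def refresh_access_token(refresh_token):
    """Runs once per access-token lifetime; the only step that hits the DB."""
    user = REFRESH_TOKENS.get(refresh_token)
    if user is None or not ACTIVE_SUBSCRIPTIONS.get(user, False):
        return None  # no new access token: the caller should log the user out
    return {
        "access_token": secrets.token_urlsafe(16),
        "user": user,
        "expires_in": ACCESS_TOKEN_TTL,
    }
```

A user whose subscription lapses keeps working until the current access token expires, which is at most one TTL, and is then refused at the next refresh.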

Burrows answered 16/6, 2021 at 4:46 Comment(0)
D
2

Refresh tokens and Access tokens are mere terminologies.

This little analogy can help solidify the rationale behind using Access Tokens and Refresh Tokens:

Suppose Alice sends a cheque to Bob via post, which can be encashed within 1 hour (hypothetical) from the time of issue, else the bank will not honor it. But Alice has also included a note in the post meant for the bank, asking the bank to accept and encash the cheque in case it gets a bit delayed (within a stipulated range)

When Bob receives this cheque, he will discard it himself if he sees that it has been tampered with (token tampering). If not, he can take it to the bank to encash it. Here, the bank notices that the time of issue has surpassed the 1-hour time limit, but sees a signed note from Alice asking the bank to encash the cheque in case of a delay within an acceptable range.

Upon seeing this note, the bank tries to verify the signed message and checks if Alice still has the right permissions. If yes, the bank encashes the cheque. Bob can now acknowledge this back to Alice.

Although not terribly accurate, this analogy can help you notice the different parts involved in processing the transaction:

  • Alice (Sender - Client)
  • Bob (Receiver - Resource Server)
  • Bank (Authorization Server)
  • Verification Process (Database Access)
  • Cheque (Access Token)
  • Note (Refresh Token)

Mainly, we want to reduce the number of API calls to the Auth Server, and eventually to the Database, in order to optimize scalability. And we need to do this with the right balance between convenience and security.

Note: It's certainly more common to have the Auth server responding to the requests earlier than the resource server in the chain.

Decca answered 22/10, 2021 at 8:3 Comment(0)
T
2

Since the refresh and access tokens are terms loaded with a lot of semantics a terminology shift could help?

  • Revokable Tokens - tokens that must be checked with authorization server
    • could be chained (see RTR - refresh token rotation)
    • can be used to create NonRevokable Tokens, but can also be used directly (when volumes are small and the check doesn't become a burden)
    • can be long lived but that depends on how often the user must be bothered with credentials (username/password) to get a new one
    • can be invalidated on RTR or any other suspect behavior
  • NonRevokable Tokens - tokens that are self contained and do not need to be checked with authorization server
    • are useful for big data, distributed servers/api calls to scale horizontally
    • should be short lived (since are non revokable)

In 2020 it became accepted that refresh tokens can also exist in the browser (initially they were offered for backend systems) - see https://pragmaticwebsecurity.com/articles/oauthoidc/refresh-token-protection-implications. Because of this the focus switched from the "refreshability" (how would a backend in absence of a user prolong the access to an api) to "revokability".

So, to me it looks safer to read the refresh tokens as Revokable Tokens and access tokens as Non-Revokable Tokens (maybe Fast Expiring Non Revokable Tokens) .

As a side note about good practice in 2021 a system can always start with revokable tokens and move to non-revokable when the pressure on authorization server increases.

Tracery answered 24/1, 2022 at 14:42 Comment(0)
D
2

I got some additional resources here which clarify certain things on why we need refresh_token. Some of the key points of these resources are following:

  • In real world, it's better to have separate servers called authServer and resourceServer/s
    • authServer - used only for authentication and authorization. This server's responsibility is to issue refresh token, access token and also login & logout the users
    • resourceServer - this server (can also be multiple servers load balanced) provides protected data. This data can be like products, reviews and so on in an e-commerce-project for example
  • One of the uses of refresh_token is that we don't have to send the username and password (credentials) over the wire (from front-end to authServer) every time we need a new access_token. This should only be done the first time (when you don't have a refresh_token yet); from then on, the refresh_token will get you a new access_token from the authServer so you can continue to make requests to your protected resourceServer. The advantage here is that users don't have to provide credentials every time, so their username and password are not easily compromised.
  • The other main use of refresh_token is this: let's say your authServer is much better protected than your resourceServer in the real world (third party services like auth0, okta, azure and so on, or your own implementation). You will only send your access_token to the resourceServer (to get data) and you will never have to send the refresh_token to the resourceServer. So there is a real chance that, when your access_token is sent to the resourceServer, a hacker intercepting your resourceServer (since it's not as secure as the authServer) gets access to your short-lived access_token.
    • For this reason, the access_token has a short life span (like 30 minutes). Remember, when this access_token expires, you will send the refresh_token to the authServer (which is much more secure than the resourceServer) to get a new access_token. Since you are not sending the refresh_token to the resourceServer at any time, the hacker who is intercepting the resourceServer has no way to get your refresh_token. If you, as a developer, still suspect that your users' refresh_tokens might have been stolen, you can log out all users (invalidate the refresh_token for every user); the users will then log in again (providing username and password) to get a new refresh_token + access_token, and things will get back on track.

Some useful resources

Workflow with and without refresh token - Youtube

JWT authentication code example - Node JS - Youtube

Drop answered 10/10, 2022 at 23:39 Comment(0)
H
2

I am not happy with any of the highly ranked answers so I'll toss in my own.

Access tokens are meant to be passed around. To one or more backends (e.g. StackOverflow), who in turn may forward it to other services etc (because micro services are all the rage), and if any of those is careless and logs or otherwise exposes your token, or is actively malicious - you're in trouble. So you make access tokens short lived to limit the blast radius.

But, you have a separate (longer lived) token that you send to no one but the single service you already trust - the same service that issued you your access token to begin with (say Google, or whatever you used to log into StackOverflow). And you use that token to get new short lived tokens to again pass around to third parties that you trust less.

This all means that if your own device or app or you yourself are compromised, the refresh token serves no purpose at all (as is generally true for any other authentication mechanism as well). But if it's StackOverflow that is compromised, only your short lived access token is exposed, because StackOverflow never gets to see your long lived refresh token. And the whole drama about the frequency of use of each token is a red herring at best. It's all about who gets to see which token.

Harass answered 1/9, 2023 at 21:16 Comment(0)
U
1

First, the client authenticates with the authorization server by giving the authorization grant.

Then, the client requests the resource server for the protected resource by giving the access token.

The resource server validates the access token and provides the protected resource.

The client makes the protected resource request to the resource server by presenting the access token; the resource server validates it and serves the request, if valid. This step keeps on repeating until the access token expires.

If the access token expires, the client authenticates with the authorization server and requests a new access token by providing the refresh token. If the access token is invalid, the resource server sends back an invalid token error response to the client.

The client authenticates with the authorization server by presenting the refresh token.

The authorization server then authenticates the client, validates the refresh token, and issues a new access token if the refresh token is valid.
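The loop described in these steps can be sketched from the client's point of view. Everything here is hypothetical scaffolding: `FakeAuthServer` and `FakeResourceServer` are toy stand-ins (the fake resource server even peeks at the fake authorization server's state, which a real one would not do), and the `TokenExpired` exception stands in for an invalid token error response.

```python
class TokenExpired(Exception):
    """Stands in for the resource server's invalid-token error response."""


class FakeAuthServer:
    def __init__(self):
        self.refresh_count = 0
        self.valid_access_tokens = set()

    def refresh(self, refresh_token):
        if refresh_token != "refresh-1":
            raise PermissionError("invalid refresh token")
        self.refresh_count += 1
        token = f"access-{self.refresh_count}"
        self.valid_access_tokens = {token}  # older access tokens count as expired
        return token


class FakeResourceServer:
    def __init__(self, auth_server):
        self._auth = auth_server  # shortcut for the demo only

    def fetch(self, path, access_token):
        if access_token not in self._auth.valid_access_tokens:
            raise TokenExpired()
        return f"contents of {path}"


class Client:
    def __init__(self, auth_server, resource_server, access_token, refresh_token):
        self.auth_server = auth_server
        self.resource_server = resource_server
        self.access_token = access_token
        self.refresh_token = refresh_token

    def get(self, path):
        try:
            return self.resource_server.fetch(path, self.access_token)
        except TokenExpired:
            # The refresh token is sent to the authorization server only,
            # never to the resource server.
            self.access_token = self.auth_server.refresh(self.refresh_token)
            return self.resource_server.fetch(path, self.access_token)
```

Calling `Client.get` with a stale access token triggers exactly one refresh and one retry; later calls reuse the new access token until it expires in turn.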

Unsnarl answered 7/9, 2017 at 15:33 Comment(1)
This doesn't actually mention where the refresh token originates from. I'm assuming the second paragraph should say access token + refresh token ?Rhondarhondda
S
0

Sometimes a user's access token might get stolen without the user knowing anything about it. Since user is unaware of the attack, they will not be able to inform us manually. Then, there will be a huge difference between e.g. 15 minutes and a whole day, with regards to the amount of time(opportunity) we have given the attacker to accomplish its attacks. So this is the reason we need to "refresh" access tokens ourselves every "short period of time" (e.g. every 15 minutes), we don't want to put off doing this for a long time (e.g. a whole day). So what the OP has said in the question is obviously not an option (stretching access token's expiry time as long as refresh token's).

We have at least these three options (that I can think of right now):

  1. Asking each user to re-enter their credentials every short period of time in order to give them fresh access tokens. But obviously, this is not a popular option as it would be bothering to the users.

  2. Instead of using a separate token for refreshing, update the expiry time of the same access token, and once a duplicate request for expiry extension is made, count it as a potential token theft, invalidate the token, and make the user re-prove themselves. This way however, has also a big drawback, in case of an access token's theft, we will be unaware of the theft as long as the real owner has not re-visited our service (which could be even "days"), and the attacker can extend the stolen access token as long as we allow it. Now compare this to the next option, which is using a "refresh token".

  3. Using a second token, called a "refresh token", is our next option. Adding this token specifically for the purpose of extending a user's login has at least these three benefits: 1. No need to send this more sensitive token as frequently as the access token is sent (which is "every" request to protected endpoints). 2. We can request more information "in addition to the refresh token" for issuing new access tokens, e.g., a client secret of the API server. So even if the refresh token is stolen, the attacker will not be able to request new access tokens with it. 3. We're implicitly telling developers that refresh token is "the more sensitive token" that they must pay more attention to its security, e.g., by keeping it in a way that remains more inaccessible to attackers' client side code, for instance, in cookies with httpOnly tags.

An HttpOnly Cookie is a tag added to a browser cookie that prevents client-side scripts from accessing data. source

Using the HttpOnly flag when generating a cookie helps mitigate the risk of client side script accessing the protected cookie. HttpOnly cookies were first implemented in 2002 by Microsoft Internet Explorer developers for Internet Explorer 6 SP1. source (Thank you IE!)
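As a small illustration of the HttpOnly point, this is how a server written in Python could build such a cookie header with the standard library's `http.cookies` module. The cookie name `refresh_token`, the `/token` path, and the helper name are assumptions for the example; adding `Secure` is my own addition and is not something the quotes above require.

```python
from http.cookies import SimpleCookie


def refresh_cookie_header(refresh_token):
    """Build a Set-Cookie header that page scripts cannot read back."""
    cookie = SimpleCookie()
    cookie["refresh_token"] = refresh_token
    cookie["refresh_token"]["httponly"] = True  # hidden from document.cookie
    cookie["refresh_token"]["secure"] = True    # only sent over HTTPS
    cookie["refresh_token"]["path"] = "/token"  # hypothetical refresh endpoint
    return cookie.output(header="Set-Cookie:")


header = refresh_cookie_header("rt-123")
```

Restricting the cookie's path to the refresh endpoint also means the browser does not attach the refresh token to ordinary API requests, which fits benefit 1 above.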

So while attackers are still able to steal both tokens, we've reduced the risks to some extent, compared to the other options.

Spindrift answered 30/3, 2022 at 12:33 Comment(0)
N
0

Let's say we use access tokens with expiration dates, but no refresh tokens.

When an expired token arrives at the server, the server looks for that user in its database and finds the most up-to-date authorizations for that user. It puts them in a new token and sends it to the client.

So database lookups are limited to only once in a while when the token expires. But the price of this is there's a window where the token isn't necessarily up-to-date.

This is what refresh tokens claim to do, but you can see we don't even need them for this. The purpose of refresh tokens is different. It's this:

Suppose the user wants to log-out. He can just delete his access token from his device. Now this device can no longer make requests or get a new token without signing in. Great.

But, what if a malicious agent managed to steal his access token before it was deleted. He could keep making requests in his name, forever! Since the access tokens would keep getting refreshed.

The purpose of refresh tokens is to prevent this scenario.

Instead of auto-refreshing the token when it's expired, the server asks the user to send the refresh token, which is also stored in the database (maybe with a "still valid" flag). The server checks it and if it's valid it refreshes the access token. The user can then log out, which sends a command to the database to delete the refresh token from it, or mark it as revoked or whatever. Now the malicious agent can't keep doing requests forever (only until his current access token expires). Even if he has the refresh token, since it's revoked it's useless.
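A tiny sketch of that "still valid" flag, with a dictionary standing in for the database table. The class and method names are invented for this example.

```python
import secrets


class RefreshTokenTable:
    """Dictionary stand-in for the table of refresh tokens with a validity flag."""

    def __init__(self):
        self._still_valid = {}  # refresh token -> bool

    def login(self):
        """Issue and store a refresh token after the user signs in."""
        token = secrets.token_urlsafe(16)
        self._still_valid[token] = True
        return token

    def logout(self, refresh_token):
        """Mark the token as revoked instead of deleting the row."""
        if refresh_token in self._still_valid:
            self._still_valid[refresh_token] = False

    def refresh(self, refresh_token):
        """Return a new access token, or None if the refresh token is revoked."""
        if not self._still_valid.get(refresh_token, False):
            return None
        return secrets.token_urlsafe(16)
```

After `logout`, anyone holding a copy of the refresh token, including the malicious agent, gets `None` from `refresh` and is limited to whatever time is left on the current access token.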

But, this is essentially the same as asking the user to re-enter his credentials! Yes. There's only a minor difference: he doesn't need to actually send them, he sends the refresh token instead. It's basically the same load on the database (storing a refresh token per logged user, and looking them up), but you're not sending the actual password, which means you're not having to annoy the user to input it. That's it. That's all there is to refresh tokens. An alternative credential that the user doesn't have to remember or manually input.

Neutralize answered 15/12, 2023 at 13:32 Comment(0)
