HATEOAS: absolute or relative URLs?

10

93

In designing a RESTful Web Service using HATEOAS, what are the pros and cons of showing a link as a complete URL ("http://server:port/application/customers/1234") vs. just the path ("/application/customers/1234")?

Cloak answered 10/2, 2010 at 18:39 Comment(0)
92

There is a subtle conceptual ambiguity when people say "relative URI".

By RFC3986's definition, a generic URI contains:

  URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

  hier-part   = "//" authority path-abempty
              / path-absolute
              / path-rootless
              / path-empty

     foo://example.com:8042/over/there?name=ferret#nose
     \_/   \______________/\_________/ \_________/ \__/
      |           |            |            |        |
   scheme     authority       path        query   fragment

The tricky thing is, when scheme and authority are omitted, the "path" part itself can be either an absolute path (starts with /) or a "rootless" relative path. Examples:

  1. An absolute URI, or full URI: "http://example.com:8042/over/there?name=ferret"
  2. A relative URI with an absolute path: /over/there
  3. A relative URI with a relative path: here, ./here, ../here, etc.
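
As a rough illustration of the three forms above, java.net.URI can tell them apart: isAbsolute() is true only when a scheme is present, and for a relative URI the path either starts with / or does not. This is only a sketch; the class name UriForms and the label strings are made up for the example.

```java
import java.net.URI;

final class UriForms {

    // Returns an illustrative label for the three forms listed above.
    static String classify(String reference) {
        URI uri = URI.create(reference);
        if (uri.isAbsolute()) {
            // Form 1: a scheme is present.
            return "absolute URI";
        }
        // A relative URI is always hierarchical, so its path is never null.
        String path = uri.getRawPath();
        return path.startsWith("/")
                ? "relative URI, absolute path"   // form 2
                : "relative URI, relative path";  // form 3
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://example.com:8042/over/there?name=ferret",
            "/over/there",
            "here",
            "../here"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + classify(sample));
        }
    }
}
```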

So, if the question were "should a server produce relative paths in a RESTful response", the answer is "No", and the detailed reasoning is available here. I think most people (including me) who are against "relative URIs" are actually against "relative paths".

In practice, most server-side MVC frameworks can easily generate a relative URI with an absolute path, such as /absolute/path/to/the/controller, so the question becomes "should the server implementation prefix scheme://hostname:port to the absolute path?", which is the OP's question. I am not quite sure about this one.

On the one hand, I still think it is recommended that the server return a full URI. However, the server should never hardcode the hostname:port part in source code (otherwise I would rather fall back to a relative URI with an absolute path). The solution is for the server to always obtain that prefix from the HTTP request's "Host" header. I am not sure whether this works in every situation, though.
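
A minimal sketch of that idea, assuming the server already has the raw Host header value and the absolute path in hand. The class and method names are hypothetical, and because the Host header carries no scheme, the scheme is passed in separately here; how it is obtained (for example behind a TLS-terminating proxy) is outside this sketch.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class HostHeaderLinks {

    // Builds a full URI from the request's Host header instead of a hardcoded host:port.
    // scheme:       assumed to be known to the caller; it is not part of the Host header
    // hostHeader:   the Host header value, e.g. "example.com:8042"
    // absolutePath: the path the framework generated; must start with "/"
    static URI fullLink(String scheme, String hostHeader, String absolutePath) {
        if (!absolutePath.startsWith("/")) {
            throw new IllegalArgumentException("expected an absolute path: " + absolutePath);
        }
        try {
            return new URI(scheme, hostHeader, absolutePath, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("cannot build link", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fullLink("http", "example.com:8042", "/application/customers/1234"));
    }
}
```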

On the other hand, it does not seem very troublesome for the client to concatenate http://example.com:8042 and the absolute path. After all, the client already knows the scheme and domain name when it sends the request to the server, right?
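
On the client side, this does not even need manual string concatenation. As a sketch (the class name and sample URLs are assumptions), java.net.URI.resolve combines the URI the client requested with whatever href the server returned, and leaves an href that is already a full URI untouched:

```java
import java.net.URI;

final class ClientLinkResolver {

    // Resolves a link from a response against the URI that was requested.
    static URI resolve(String requestUri, String href) {
        return URI.create(requestUri).resolve(href);
    }

    public static void main(String[] args) {
        String requested = "http://example.com:8042/application/customers?page=2";

        // Relative URI with an absolute path: scheme and authority come from the request URI.
        System.out.println(resolve(requested, "/application/customers/1234"));

        // Full URI: returned as-is.
        System.out.println(resolve(requested, "http://other.example.org/customers/1234"));
    }
}
```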

All in all, I would say: prefer an absolute URI, possibly fall back to a relative URI with an absolute path, and never use a relative path.

Mcclendon answered 29/8, 2013 at 8:2 Comment(5)
This is a good answer (+1) which I agree with except the final conclusion. However in my answer I argue that the HTTP spec defines, by example, "absolute" to refer to an absolute path, not a fully qualified URI. So I disagree with your (2) - it is an absolute URI, but one for which the client must infer the network protocol and host, so it's not a fully qualified URI. And, therefore, I also disagree with your definition of (1), which is both a full URI and an absolute URI. - Devaluate
Thanks for the comment. I just borrowed the absolute path and relative path concepts from the file system. Different terms aside, I don't see a substantial difference between your opinion and mine. You also recommend forms 1 and 2, and you are against form 3, aren't you? - Mcclendon
Practically speaking, I am for (2); I think (1) requires the backend to have too much HTTP-specific knowledge (meaning the details of the specific HTTP environment, not HTTP in general), and (3) seems to require too much of the client. But my reasoning was based on the original draft spec, and the examples were changed in a later version in a way that invalidates my reasoning. - Devaluate
Personally, I am not (yet) at all convinced that HATEOAS, and therefore the demand to return URIs, makes all that much sense for an API. I am just not seeing my APIs being driven on the client in a manner akin to browsing a web site; the use cases seem very much driven by ad hoc function. - Devaluate
@LawrenceDol I had the same confusion about HATEOAS at the beginning. Now I consider it a matter of choice. Your clients can certainly use ad hoc functions to consume your API, but if they or you want, you can still develop a pattern for them to follow, so that the client won't need to hardcode each exact URL. That is HATEOAS. - Mcclendon
13

It depends on who is writing the client code. If you are writing the client and server then it doesn't make much difference. You will either suffer the pain of building the URLs on the client or on the server.

However, if you are building the server and you expect other people to write client code, then they will love you much more if you provide complete URIs. Resolving relative URIs can be a bit tricky. First, how you resolve them depends on the media type returned. HTML has the base tag, XML can have xml:base attributes on every nested element, and Atom feeds could have a base in the feed and a different base in the content. If you don't provide your client with explicit information about the base URI, then they have to get the base URI from the request URI, or maybe from the Content-Location header! And watch out for that trailing slash. The base URI is determined by ignoring all characters to the right of the last slash. This means that the trailing slash is now very significant when resolving relative URIs.
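
The trailing-slash effect is easy to see with java.net.URI. In this small sketch (the URLs and the class name are invented for illustration), the same relative path "1234" lands in two different places depending on whether the base URI ends with a slash:

```java
import java.net.URI;

final class TrailingSlashDemo {

    // Resolves a relative path against a base URI, as a client would have to.
    static String resolve(String base, String relativePath) {
        return URI.create(base).resolve(relativePath).toString();
    }

    public static void main(String[] args) {
        // No trailing slash: "customers" is to the right of the last slash, so it is dropped.
        System.out.println(resolve("http://example.com/application/customers", "1234"));

        // Trailing slash: "customers/" is kept and the relative path is appended to it.
        System.out.println(resolve("http://example.com/application/customers/", "1234"));
    }
}
```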

The only other issue that deserves a brief mention is document size. If you are returning a large list of items where each item may have multiple links, using absolute URLs can add a significant number of bytes to your entity if you do not compress it. This is a performance issue, and you need to decide whether it is significant on a case-by-case basis.
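
One way to decide is simply to measure. The sketch below builds a made-up JSON list with one self link per item, once with a full prefix and once with paths only, and prints the raw and gzipped sizes. The payload shape, the "http://server:8080" prefix, and the item count are assumptions for illustration, not measurements from a real API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

final class LinkPayloadSize {

    // Builds a JSON array of items, each with one self link starting with the given prefix.
    static String payload(String prefix, int items) {
        StringBuilder json = new StringBuilder("[");
        for (int i = 1; i <= items; i++) {
            if (i > 1) {
                json.append(',');
            }
            json.append("{\"id\":").append(i)
                .append(",\"links\":[{\"rel\":\"self\",\"href\":\"")
                .append(prefix).append("/application/customers/").append(i)
                .append("\"}]}");
        }
        return json.append(']').toString();
    }

    // Returns the number of bytes the text occupies after gzip compression.
    static int gzippedSize(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size(); // read after the stream is closed, so the gzip trailer is included
    }

    public static void main(String[] args) {
        String absolute = payload("http://server:8080", 500);
        String pathOnly = payload("", 500);
        System.out.println("absolute:  raw=" + absolute.length() + " gzip=" + gzippedSize(absolute));
        System.out.println("path only: raw=" + pathOnly.length() + " gzip=" + gzippedSize(pathOnly));
    }
}
```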

Memorable answered 10/2, 2010 at 21:11 Comment(0)
11

The only real difference would seem to be that it's easier for clients if they are consuming absolute URIs instead of having to construct them from the relative version. Of course, that difference would be enough to sway me to do the absolute version.

Ritualist answered 10/2, 2010 at 18:47 Comment(0)
7

As your application scales, you may wish to do load balancing, fail-over, etc. If you return absolute URIs then your client-side apps will follow your evolving configuration of servers.

Rompers answered 23/11, 2011 at 4:14 Comment(1)
Provided you define "absolute" as absolute path (e.g. /xxx/yyy...) and not as meaning a fully qualified URI (e.g. http://api.example.com/xxx/yyy...). - Devaluate
6

Using RayLou's trichotomy, my organization has opted to favor (2). The primary reason is to avoid XSS (Cross-Site Scripting) attacks. The issue is that if an attacker can inject their own URL root into the response coming back from the server, then subsequent user requests (such as an authentication request with username and password) can be forwarded to the attacker's own server*.

Some have brought up the issue of being able to redirect requests to other servers for load balancing, but (while that is not my area of expertise) I would wager that there are better ways to enable load balancing without having to explicitly redirect clients to different hosts.

*Please let me know if there are any flaws in this line of reasoning. The goal, of course, is not to prevent all attacks, but to close at least one avenue of attack.
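
For this to help, the client has to refuse anything other than form (2) before following a link. A hedged sketch of such a check follows; the class and method names are hypothetical, and it only inspects the shape of the link, not whether the response itself can be trusted. Note that a scheme-less link such as //attacker.example/login still names a host, so it is rejected as well.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class LinkGuard {

    // True only for a scheme-less, host-less link with an absolute path, e.g. "/application/login".
    static boolean isHostRelative(String href) {
        try {
            URI uri = new URI(href);
            return !uri.isAbsolute()                  // no scheme
                    && uri.getRawAuthority() == null  // no host, rejects "//attacker.example/login"
                    && uri.getRawPath().startsWith("/");
        } catch (URISyntaxException e) {
            return false;                             // unparseable links are rejected too
        }
    }

    public static void main(String[] args) {
        String[] links = {
            "/application/login",
            "http://attacker.example/login",
            "//attacker.example/login",
            "login"
        };
        for (String link : links) {
            System.out.println(link + " -> " + isHostRelative(link));
        }
    }
}
```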

Smoke answered 20/5, 2015 at 1:35 Comment(2)
Glad that my previous answer was helpful to your organization. Yes, I personally also prefer (2), a.k.a. the scheme-less absolute path. However, I'm curious about your reasoning. How did you make your client accept only your scheme-less URLs? A generic client, such as a browser, would not reject a URL that has a scheme at all. So I assume you would have to write your own client-side code to validate URLs before actually following them? While that is technically doable (but not necessarily useful), this kind of client-side validation is typically not part of a REST or HATEOAS discussion. - Mcclendon
I know this is an old post, but I just want to point out that "if an attacker can inject their own URL root into the response coming back" is kind of a nonsense reason. If they can "inject their own URL" into the correct places in the response, I bet they could just as easily replace your hostname with their own. So from a security point of view, I don't see it as a valid argument. - Rachmaninoff
5

You should always use the full URL. It acts as the unique identifier for the resource since URLs are all required to be unique.

I would also argue that you should be consistent. The HTTP/1.1 specification (RFC 2616) expects a full URL in the Location header, so the full URL is sent back to the client in that header when a new resource is created. (The later RFC 7231 also allows a relative reference there.) It would be strange to provide a full URL in the Location header and then relative URIs in the links within your response body.

Socage answered 3/10, 2013 at 21:43 Comment(7)
Well, the HTTP spec for the Location header says absolute URI. An absolute URI must contain a scheme (e.g. http). - Socage
But the question is not how to construct contextless opaque identifiers, it asks how to construct links. The latter may rightly infer "at the same network location as this document", and that's exactly what the spec's example of a Location header gives - an absolute URI which doesn't contain the URI scheme or the server's network location. While links and IDs are often conflated they are not the same thing - the former has context, the latter does not. - Devaluate
Can you send a link to the part of the spec you're talking about? - Socage
An absolute URI specifies a scheme; a URI that is not absolute is said to be relative. URIs are also classified according to whether they are opaque or hierarchical. An opaque URI is an absolute URI whose scheme-specific part does not begin with a slash character ('/'). Opaque URIs are not subject to further parsing. Some examples of opaque URIs are: mailto:[email protected] news:comp.lang.java urn:isbn:096139210x - Socage
Ah, see, I think you're looking at a draft spec. Check this one: w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.30 - Socage
Ahh, I see you are correct; removed my downvote and deleting comments no longer salient. - Devaluate
Hey no worries man. One other point about this stuff is that I've seen people using hrefs as IDs. So that the client doesn't need to reconstruct the URL from some config file and an id, it just knows the URL and can cache based on it. - Socage
2

One drawback of using absolute URIs is that the API cannot be proxied.

I take that back... not true. You should go for a full URL including the domain.

Tarkington answered 26/11, 2012 at 0:14 Comment(3)
Why can't the absolute URI use the hostname of the proxy? - Boodle
Working through this exact issue at the moment. We want all requests to go through a sort of "load-balancing" layer first. Absolute URIs to the servers directly will break this model. - Brandiebrandise
I'm using Nginx to proxy a site with absolute URLs. It's perfectly capable of replacing the backend URL with the equivalent proxy URL. Specifically, it's proxying windyroad.artifactoryonline.com (which has fully qualified URLs and fully qualified redirects) to repo.windyroad.com.au - Deyoung
2

An important consideration in large API results is the extra network overhead of including the full URI repeatedly. Believe it or not, gzip does not entirely solve this issue (not sure why). We were shocked at how much space the full URI took up when there were hundreds of links included in a result.

Postern answered 23/1, 2014 at 22:24 Comment(0)
2

Regarding the pros, I see the reduction in bytes to be transmitted, at the expense of the extra handling required by a client for the (absolute) path. If you are desperate to save every byte, even after trying gzip content-encoding, proper use of caching headers, ETags, and conditional requests on the client, then this may be necessary in the end, but I expect much higher returns on your efforts elsewhere.

Regarding the cons, I see a loss of control over how you can direct the flow of clients between resources in the future (load balancing, A/B testing, ...), and I would consider it a bad practice for managing a web API. The URL you provide is no longer basically opaque to the client (see Tim Berners-Lee's Axioms of Web Architecture on URI opacity). In the end, you become responsible for keeping clients happy regarding their creative usage of your API, even if it is only regarding the structure of your URL space. Should you ever need to allow for an explicitly defined URL modification, consider using URI templates as in the Hypertext Application Language.

Vaporescence answered 10/5, 2019 at 7:33 Comment(0)
-1

GET : http://localhost:9090/api/my-service/manage/route/1

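import java.util.function.UnaryOperator;

import org.springframework.hateoas.Link;
import org.springframework.hateoas.server.mvc.WebMvcLinkBuilder;

// Adds a "self" link whose href is only the path of the controller URI (no scheme, host, or port).
// OutputResponse, GroupController, APIException, and DemoRuntimeException are application classes.
UnaryOperator<OutputResponse> addLinkConsumer = group -> {
    try {
        return group.add(
                Link.of(WebMvcLinkBuilder
                        .linkTo(WebMvcLinkBuilder.methodOn(GroupController.class)
                                .getSubById(group.getId()))
                        .withSelfRel()
                        .toUri()      // the generated link as a java.net.URI
                        .getPath())); // keep only the path part
    } catch (Exception e) {
        throw new APIException(DemoRuntimeException.ERROR_IN_LINK_GENERATION,
                "Failed to generate link for group " + group.getId());
    }
};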

HATEOAS output with a path-only link (no scheme or host) in Java Spring Boot:

    {
        "id": 1,
        "name": "some name",
        "links": [
            {
                "rel": "self",
                "href": "/api/my-service/manage/route/1"
            }
        ]
    }
Loculus answered 3/1, 2023 at 18:34 Comment(0)
