HTTP HEAD vs GET performance
R

10

161

I am setting up a REST web service that just needs to answer YES or NO, as fast as possible.

Designing a HEAD service seems the best way to do it, but I would like to know whether I will really gain any time compared with a GET request.

I suppose I save the cost of opening and closing the body stream on my server (about 1 millisecond?). Since the number of bytes to return is very low, do I gain any time in transport, or in the number of IP packets?

Edit:

To explain further the context:

  • I have a set of REST services executing some processes, if they are in an active state.
  • I have another REST service indicating the state of all these first services.

Since that last service will be called very often by a very large set of clients (one call expected every 5 ms), I was wondering whether using the HEAD method would be a valuable optimization. About 250 characters are returned in the response body. The HEAD method at least saves the transport of these 250 characters, but how much does that matter?

I tried to benchmark the difference between the two methods (HEAD vs GET) by running the calls 1000 times, but saw no gain at all (< 1 ms)...

Regenerate answered 14/5, 2013 at 9:14 Comment(1)
It also depends on the approach you use server-side. A GET request and a HEAD request will often take the same server time to process, because the server might need to know the final body to calculate the Content-Length header value, which is important information in the response to a HEAD request. Unless there is some other more optimized server-side approach, the only benefit is that bandwidth is saved and the client doesn't have to parse the response body. So basically the optimization gains depend on both server and client implementations.Anoxia
D
227

A RESTful URI should represent a "resource" at the server. Resources are often stored as a record in a database or a file on the filesystem. Unless the resource is large or is slow to retrieve at the server, you might not see a measurable gain by using HEAD instead of GET. It could be that retrieving the meta data is not any faster than retrieving the entire resource.

You could implement both options and benchmark them to see which is faster, but rather than micro-optimize, I would focus on designing the ideal REST interface. A clean REST API is usually more valuable in the long run than a kludgey API that may or may not be faster. I'm not discouraging the use of HEAD, just suggesting that you only use it if it's the "right" design.

If the information you need really is meta data about a resource that can be represented nicely in the HTTP headers, or to check if the resource exists or not, HEAD might work nicely.

For example, suppose you want to check if resource 123 exists. A 200 means "yes" and a 404 means "no":

HEAD /resources/123 HTTP/1.1
[...]

HTTP/1.1 404 Not Found
[...]

However, if the "yes" or "no" you want from your REST service is a part of the resource itself, rather than meta data, you should use GET.
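
The 200/404 existence check above could be sketched roughly as follows in Python, using only the standard library. The in-memory `RESOURCES` store, the handler class and the `resource_exists` helper are hypothetical names invented for this illustration; a real service would look the resource up in its own storage.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical in-memory store standing in for the real resources.
RESOURCES = {"/resources/123": b"active"}


class ResourceHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        """Send the status line and headers; return the body, or None on 404."""
        body = RESOURCES.get(self.path)
        if body is None:
            self.send_response(404)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return None
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        return body

    def do_HEAD(self):
        # Same status and headers as GET, but no body is written.
        self._send_headers()

    def do_GET(self):
        body = self._send_headers()
        if body is not None:
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def resource_exists(host, port, path):
    """Return True if a HEAD request for `path` answers 200."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        response.read()  # always empty for HEAD; completes the exchange
        return response.status == 200
    finally:
        conn.close()


def demo():
    """Start a throwaway local server and probe one existing and one missing path."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), ResourceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        return (
            resource_exists(host, port, "/resources/123"),
            resource_exists(host, port, "/resources/999"),
        )
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Note that the server runs the same lookup for HEAD and GET; only the body write is skipped, which is why the saving is small for small resources.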

Disenable answered 21/5, 2013 at 0:2 Comment(11)
Wonderful answer! I've got a question: What about using it as a touch command to update a post's view count on the server? The post data has already been retrieved via a normal /posts call, so I just want to update the view count after the user interacts with the post in some way.Reorganize
@Reorganize If you are going to update a view counter for HEAD requests, then you should do so for GET requests also. The decision to use GET or HEAD is ultimately up to the HTTP client. Your server should behave the same way for both request types, except there is no response body when responding to HEAD. As for whether this is a good way to implement something like a view counter, I am unsure.Disenable
-1 Any information that can be named can be a resource. Hence Uniform Resource Locator. The idea that using part of the HTTP protocol, as designed, is "kludgey" or "unclean" is bizarre.Earplug
@Earplug I can't make much sense of your comment, but stuffing data in HTTP headers just so it can be retrieved with a HEAD request is not using the protocol as designed.Disenable
@AndreD - Returning the Headers for a resource via head isn't "kludgey", "unclean" or "stuffing data" - in the OPs question they wanted to know about active state - so for example - returning a "404" for no/not found or a "200" for yes/found could be perfectly acceptable, also using "304" Not Modified along with the Etag request header could be a good solution - allowing the client to simply check if the active state has changed since it was last checked by the client. All of this is clearly how the HTTP protocol is designed.Earplug
@Earplug You appear to have misunderstood my answer. Using 200/404 for yes/no was my own suggestion that I stand behind as a good solution. If you want to discuss further, please start a chat.Disenable
@AndreD - You appear to have misunderstood my objection to it.Earplug
"Unless the resource is large or is slow to retrieve at the server, you might not see a measurable gain by using HEAD instead of GET." I have a question: for a HEAD call, won't the resource have to be fetched, whether large or not, in order to compute Content-Length? So both GET and HEAD would take the same time on the server side, unless the server manages to compute the content length without an expensive resource fetch...Capet
@Siddhartha, that's very often true, but not always. Content-Length can be omitted when using Transfer-Encoding: chunked. Even with Content-Length, it's possible that the server can get the resource size and other metadata used in headers without fetching the actual resource. Maybe that metadata is even cached in memory for very fast access. That's all very implementation specific.Disenable
@AndreD Couldn't have asked for a more elaborate answer, thank you!Capet
perfect and simple answer!!Vanna
T
55

I found this reply when looking for the same question that requester asked. I also found this at http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html:

The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.

It would seem to me that the correct answer to requester's question is that it depends on what is represented by the REST protocol. For example, in my particular case, my REST protocol is used to retrieve fairly large (as in more than 10K) images. If I have a large number of such resources being checked on a constant basis, and given that I make use of the request headers, then it would make sense to use HEAD request, per w3.org's recommendations.

Tetrad answered 3/9, 2014 at 20:31 Comment(0)
L
37

GET fetches head + body, HEAD fetches head only. It should not be a matter of opinion which one is faster. I don't understand the upvoted answers above. If you are looking for META information then go for HEAD, which is meant for this purpose.

Lysozyme answered 26/5, 2017 at 11:38 Comment(4)
Well, when I use HEAD my server sends me a 404, and when I use the GET method I receive a 200. So the difference you described is not right, and it probably also depends on the application.Brook
@Brook Something is definitely wrong with your serverKassi
@Rohlik: If your server is not picking up the HEAD request, as in, it doesn't have an endpoint for HEAD on a particular URL for which you have a GET, then it will give you the 404 for HEAD and the 200 for GET.Reggy
@Viktor, from the simple performance standpoint, you are right. However, it might have escaped you that the discussion went very rapidly into the scope of semantics and "should vs could". Ignoring semantics leads to constant mixups between PUT and POST, or the deeper discussion of GET body. Good semantics get upvotes. Sole focus on performance, not so much.Reggy
C
24

I strongly discourage this kind of approach.

A RESTful service should respect the HTTP verbs semantics. The GET verb is meant to retrieve the content of the resource, while the HEAD verb will not return any content and may be used, for example, to see if a resource has changed, to know its size or its type, to check if it exists, and so on.

And remember: premature optimization is the root of all evil.

Chemotherapy answered 22/5, 2013 at 18:35 Comment(0)
F
14

HEAD requests are just like GET requests, except the body of the response is empty. This kind of request can be used when all you want is metadata about a file but don't need to transport all of the file's data.
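
As an illustration of the file-metadata case, here is a minimal sketch that serves a temporary file with Python's built-in `SimpleHTTPRequestHandler` (which already implements HEAD) and compares the two methods. The file name `big.bin`, its size, and the helper names are assumptions made up for the example.

```python
import functools
import http.client
import tempfile
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path


class QuietHandler(SimpleHTTPRequestHandler):
    def log_message(self, format, *args):
        pass  # suppress per-request logging in the demo


def fetch(method, host, port, path):
    """Issue one request and return (status, headers, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request(method, path)
        response = conn.getresponse()
        body = response.read()
        return response.status, dict(response.getheaders()), body
    finally:
        conn.close()


def compare_head_and_get(file_size=100_000):
    """Serve a temporary file and request it with both HEAD and GET."""
    with tempfile.TemporaryDirectory() as root:
        Path(root, "big.bin").write_bytes(b"x" * file_size)
        handler = functools.partial(QuietHandler, directory=root)
        server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
        threading.Thread(target=server.serve_forever, daemon=True).start()
        host, port = server.server_address
        try:
            head = fetch("HEAD", host, port, "/big.bin")
            get = fetch("GET", host, port, "/big.bin")
        finally:
            server.shutdown()
            server.server_close()
    return head, get


if __name__ == "__main__":
    (h_status, h_headers, h_body), (g_status, g_headers, g_body) = compare_head_and_get()
    print("HEAD:", h_status, h_headers["Content-Length"], len(h_body))
    print("GET: ", g_status, g_headers["Content-Length"], len(g_body))
```

Both responses report the same `Content-Length` and `Last-Modified`, but only the GET response actually carries the file's bytes.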

Flareup answered 10/8, 2019 at 5:17 Comment(2)
Well, when I use HEAD my server sends me a 404, and when I use the GET method I receive a 200. So the difference you described is not right, and it probably also depends on the application.Brook
Maybe your server is not configured to handle HEAD requests and only accepts GET requests.Flareup
U
7

Your performance will hardly change by using a HEAD request instead of a GET request.

Furthermore, when you want it to be RESTful and you want to GET data, you should use a GET request instead of a HEAD request.

Untie answered 16/5, 2013 at 16:23 Comment(0)
J
1

I don't understand your concern about the 'body stream being open/closed'. The response body is sent over the same stream as the HTTP response headers and does NOT create a second connection (which, by the way, would cost more in the range of 3-6 ms).

This seems like a very premature optimization attempt on something that just won't make a significant or even measurable difference. The real difference is the conformity with REST in general, which recommends using GET to get data.

My answer is NO, use GET if it makes sense, there's no performance gain using HEAD.
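
As a rough back-of-the-envelope check of this claim, the sketch below estimates how many TCP segments a response needs. The 1460-byte MSS (a common value for IPv4 over Ethernet without TCP options) and the 200-byte header block are assumptions, not measurements, and the estimate assumes the server sends headers and body back to back and ignores TLS and HTTP/2 framing.

```python
import math

# Assumed maximum TCP payload per segment: 1500-byte Ethernet MTU
# minus 20 bytes of IPv4 header and 20 bytes of TCP header.
MSS = 1460

# Assumed size of the status line plus response headers, in bytes.
HEADER_BYTES = 200


def segments_needed(header_bytes, body_bytes, mss=MSS):
    """Estimate the number of TCP segments needed to carry one response."""
    return max(1, math.ceil((header_bytes + body_bytes) / mss))


if __name__ == "__main__":
    print("HEAD, no body:   ", segments_needed(HEADER_BYTES, 0))
    print("GET, 250 bytes:  ", segments_needed(HEADER_BYTES, 250))
    print("GET, 100 MB body:", segments_needed(HEADER_BYTES, 100_000_000))
```

Under these assumptions, the 250-character body from the question fits in the same single segment as the headers, so HEAD saves no packets. A 100 MB resource is a different matter and needs tens of thousands of segments.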

Jakob answered 22/5, 2013 at 13:17 Comment(3)
Suppose the content is 100MB. Surely the headers will be smaller than the content. Now, when we request that resource with the GET or HEAD method, in your opinion there is no performance difference between them?!Lovemaking
The OP stated 250 chars in the body of the response. Not 100MB. That’s a different question altogether.Jakob
For heavy resources, such as image/video/audio storage, you can use HEAD to check the information without downloading 100MB or more.Merriment
P
0

You could easily make a small test to measure the performance yourself. I think the performance difference would be negligible, because if you're only returning 'Y' or 'N' in the body, it's a single extra byte appended to an already open stream.

I'd also go with GET since it's more correct. You're not supposed to return content in HTTP headers, only metadata.
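
A small test of the kind suggested above might look like the following sketch. It runs against a throwaway local server over one keep-alive connection, so it measures only loopback overhead and not a real network. The 250-byte body mirrors the size mentioned in the question; the `/status` path and all class and function names are made up for the example.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

BODY = b"x" * 250  # stand-in for the ~250-character status document


class StatusHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allow keep-alive between requests
    # Buffer output so headers and body leave in a single write; otherwise
    # Nagle/delayed-ACK interaction can stall GET and skew the comparison.
    wbufsize = 64 * 1024

    def _headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_HEAD(self):
        self._headers()

    def do_GET(self):
        self._headers()
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        pass  # logging would dominate the timings


def time_requests(method, host, port, count):
    """Return (elapsed_seconds, body_bytes_received) for `count` requests."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    received = 0
    start = time.perf_counter()
    try:
        for _ in range(count):
            conn.request(method, "/status")
            received += len(conn.getresponse().read())
    finally:
        conn.close()
    return time.perf_counter() - start, received


def run_benchmark(count=1000):
    """Time `count` HEAD requests and `count` GET requests against a local server."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), StatusHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        return {m: time_requests(m, host, port, count) for m in ("HEAD", "GET")}
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for method, (elapsed, received) in run_benchmark().items():
        print(f"{method}: {elapsed * 1000:.1f} ms total, {received} body bytes")
```

Timings will vary by machine and run, so repeat the measurement several times before drawing conclusions.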

Poirier answered 18/5, 2013 at 14:0 Comment(0)
D
0

Short Summary

We can configure a custom HTTP method, CHECK (or EXISTANCE or another more purpose-specific name), and set up caching/proxying for it.


The RESTful API - architectural style

Roy Fielding proposed REST as a software 'architectural style', not only for clear communication between nodes but also for the overall performance of the web. In that sense, 'designing thoroughly with a RESTful API' is synonymous with 'seeking ways to maximize web performance'.

Let's try to design RESTfully. GET signifies requesting resource data and should not be used for metadata like resource existence or view counts. These purposes are distinctly different; therefore, HEAD is more suitable. However, HEAD has the limitation of not using the HTTP response body.

Because of this, @Andre D suggested using 404 Not Found. However, 4xx status codes are meant for cases where the client's request is at fault. Simply put, if 'yes' and 'no' are both results the designer intends from a resource existence check, you should not send a 404.

To obtain 'resource metadata' while also receiving a 2xx status and a true/false ("yes" or "no") response, and still satisfy a 'RESTful API', the method should have the characteristics of both GET (returning a response body) and HEAD (retrieving metadata about the resource).

Such an HTTP method was not predicted to be necessary by Roy Fielding at the time of design, so there is no corresponding method.

However, the father of the web even anticipated things he couldn't foresee, hence 'custom HTTP method' is the solution. The solution is simple. Like HEAD, retrieving Metadata but also possessing an HTTP response body like GET, we can create a CHECK (or EXISTANCE) HTTP method.

REST API for Performance Enhancement

Now, why is this structure directly linked to performance? It's simple. It's because caching and proxying can be beautifully structured. Caching and proxying are known as the most overwhelmingly powerful methods for addressing the performance of all requests, including CDN (I think the growth of Redis is an example).

Nowadays, with Nginx's proxy buffering technology or lua scripts, various requests can be bundled into 'identical queries' for an API instance and cached. Even if the API response time is a dreadful 1s, if resource check is performed and cached, it can provide clients with a more than 40 times improved response time of approximately 15ms for the second identical request - a preflight-level speed. The remarkable and satisfying point of this structure is 'it should not have any impact on existing APIs', meaning a beautiful establishment of respect for legacy APIs.

Even for internal or external extensions, proxying (whether forward or reverse) allows for flexible response since the concerns of managing or sharing the original resources for caching are separated.

More Standards: About WebDAV

Still, one problem remains.

If 'I', as well as numerous other developers and those who will join in the future, do not use an agreed-upon language, the compatibility issues when each Vendor uses custom HTTP methods and the steep increase in pointless learning costs are certain.

In other words, we need more and detailed 'standards'.

In fact, a good example is the existence of the COPY method. This is the COPY method defined by WebDAV (Web Standard Extension) due to a similar question.

It has the nature of POST, which creates a new resource, but does not require a request body, yet refers to 'an existing resource', showing characteristics of PUT or PATCH (however, PUT, PATCH are requests for an existing resource, not a new one). Therefore, COPY was born as it satisfies neither. This method is defined in WebDAV but is treated as a Custom Http method.

WebDAV is written as an RFC document, positioned at the very top among standard authorities, and, as evident from the 'Distributed' in its name, continues the fundamental philosophy of the Web. Therefore, we should choose in this order: RFC > WebDAV RFC > each language's standard spec (JPA, ECMAScript spec, etc.) > words that are not standardized but are as intuitive, short and popular as possible (COPY, CHECK).

Dom answered 11/12, 2023 at 10:50 Comment(0)
V
0

TL;DR

  • The HTTP HEAD verb simply returns metadata about a resource on the server.
    This HTTP verb returns all of the headers associated with a resource at a given URL, but does not actually return the resource itself. It is something like a file's metadata in a filesystem.

  • The HTTP GET verb returns the actual resource on the server.

Virile answered 8/3, 2024 at 10:25 Comment(0)
