What is a RESTful way of monitoring a REST resource for changes?

If there is a REST resource that I want to monitor for changes or modifications from other clients, what is the best (and most RESTful) way of doing so?

One idea I've had for doing so is by providing specific resources that will keep the connection open rather than returning immediately if the resource does not (yet) exist. For example, given the resource:

/game/17/playerToMove

a "GET" on this resource might tell me that it's my opponent's turn to move. Rather than continually polling this resource to find out when it's my turn to move, I might note the move number (say 5) and attempt to retrieve the next move:

/game/17/move/5

In a "normal" REST model, it seems a GET request for this URL would return a 404 (not found) error. However, if instead, the server kept the connection open until my opponent played his move, i.e.:

PUT /game/17/move/5

then the server could return the contents that my opponent PUT into that resource. This would give me both the data I need and a notification that my opponent has moved, without requiring polling.

Is this sort of scheme RESTful? Or does it violate some sort of REST principle?

Dasyure answered 2/1, 2009 at 3:26 Comment(2)
"What's the rest way to do this, Scrappy?" - Ricotta
You can use long polling, or combine REST with a websocket service, which sends the events to the client. - Ideate

Your proposed solution sounds like long polling, which could work really well.

You would request /game/17/move/5 and the server would not send any data until move 5 has been completed. If the connection drops or you get a timeout, you simply reconnect until you get a valid response.

The benefit is that it's very quick: as soon as the server has new data, the client gets it. It's also resilient to dropped connections, and works if the client is disconnected for a while (you could request /game/17/move/5 an hour after the move was made and get the data instantly, then move on to move/6 and so on).
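As a rough sketch of the client side of that reconnect loop, something like the following could work. The names `wait_for_move`, `fetch`, `max_attempts` and `pause_seconds` are made up for illustration; `fetch` stands in for whatever performs the blocking GET on /game/17/move/5 and is assumed to return None when the request times out without data.

```python
import time
from typing import Callable, Optional


def wait_for_move(
    fetch: Callable[[], Optional[str]],
    max_attempts: int = 10,
    pause_seconds: float = 0.0,
) -> str:
    """Repeat a long-poll request until the move arrives.

    `fetch` is a hypothetical callable that performs one long-poll GET.
    It returns the move body, returns None if the server timed out with
    no data, or raises ConnectionError/TimeoutError on a dropped link.
    """
    for _ in range(max_attempts):
        try:
            move = fetch()
        except (ConnectionError, TimeoutError):
            # Dropped connection or client-side timeout: just reconnect.
            move = None
        if move is not None:
            return move
        time.sleep(pause_seconds)  # optional pause before reconnecting
    raise TimeoutError(f"no move after {max_attempts} attempts")
```

In a real client, `fetch` would wrap an HTTP call with a generous read timeout; that part is left out here so the retry logic stays visible.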

The issue with long polling is that each "poll" ties up a server thread, which quickly breaks servers like Apache (it runs out of worker threads and so can't accept other requests). You need a specialised web server to serve the long-polling requests. The Python library Twisted ("an event-driven networking engine") is great for this, but it's more work than regular polling.

In answer to your comment about Jetty/Tomcat, I don't have any experience with Java, but it seems they both use a pool-of-worker-threads system similar to Apache's, so they will have the same problem. I did find this post which seems to address exactly this problem (for Tomcat).
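To illustrate why an event-driven server avoids the thread problem, here is a minimal sketch using Python's standard asyncio rather than Twisted (a swap made only to keep the example stdlib-only). `MoveBoard`, `put_move` and `get_move` are hypothetical names, and the HTTP layer is omitted entirely; the point is that a waiting request is a suspended coroutine, not a blocked thread.

```python
import asyncio
from typing import Dict, Optional


class MoveBoard:
    """In-memory store of moves for one game (illustrative only)."""

    def __init__(self) -> None:
        self._moves: Dict[int, str] = {}
        self._events: Dict[int, asyncio.Event] = {}

    def _event(self, number: int) -> asyncio.Event:
        # One event per move number, created on first use.
        return self._events.setdefault(number, asyncio.Event())

    def put_move(self, number: int, move: str) -> None:
        """What a PUT /game/17/move/<number> handler might call."""
        self._moves[number] = move
        self._event(number).set()  # wake every waiting GET

    async def get_move(self, number: int, timeout: float) -> Optional[str]:
        """What a long-polling GET handler might await.

        Returns the move, or None if it was not played within `timeout`.
        """
        if number in self._moves:
            return self._moves[number]
        try:
            await asyncio.wait_for(self._event(number).wait(), timeout)
        except asyncio.TimeoutError:
            return None
        return self._moves[number]
```

Many such waits can be pending at once on a single thread, which is the property a worker-thread pool lacks.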

Charily answered 2/1, 2009 at 4:39 Comment(2)
I'm using Jetty as a Java servlet container. It seems to work just fine for "long polling". Does it have the same problems as Apache (namely, running out of worker threads)? What about Tomcat? - Dasyure
To avoid tying up threads, you can use an ASP.NET asynchronous HTTP handler. This gives the thread back to the thread pool. - Finicky

I'd suggest a 404 if your intended client is a web browser, as keeping the connection open can block the browser's other requests to the same domain (browsers limit the number of concurrent connections per host). It's up to the client how often to poll.


2021 Edit: The answer above was in 2009, for context.

Today, I would suggest using a WebSocket interface with push notifications.

Alternatively, building on the suggestion above, I might hold the connection for 500-1000 ms and check twice on the server before returning the 404, to reduce the overhead of creating multiple connections from the client.
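A minimal sketch of that "short hold, then 404" idea, assuming a hypothetical `lookup` callable that returns the move body or None if the move has not been played yet. The function name and the (status, body) return shape are illustrative, not taken from any framework.

```python
import time
from typing import Callable, Optional, Tuple


def get_with_short_hold(
    lookup: Callable[[], Optional[str]],
    hold_seconds: float = 0.5,
    checks: int = 2,
) -> Tuple[int, Optional[str]]:
    """Check for the resource `checks` times, pausing between checks.

    Returns (200, body) as soon as the resource exists, or (404, None)
    if it is still missing after the final check.
    """
    for attempt in range(checks):
        body = lookup()
        if body is not None:
            return 200, body
        if attempt < checks - 1:
            # Hold the connection briefly before looking again.
            time.sleep(hold_seconds)
    return 404, None
```

With the defaults, the request is held for about half a second in the worst case, so a client polling in a loop makes far fewer round trips than with an immediate 404.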

Orosco answered 2/1, 2009 at 3:54 Comment(6)
That's kind of a misuse of the error (how do you detect an actual 404 because you requested /game/x14 not /game/14). Returning {'error':'no new content'} or something would be less problematic. - Charily
No, because until step 14 has been made, the resource doesn't exist... it is a proper 404. - Orosco
I would suggest responding with 503 instead, if the expected use case is that it will appear in the future. By spec, 404 should NOT be retried later automatically. - Osis
@MikkoRantalainen it's not a 503, the "Service" is the webserver, 503 is generally the error code for a reverse proxy to a down resource behind it, or a database server down, etc... 404 is a nonexistent resource, which is what this is. If step 14 never happens, it never exists. A 404 means the resource doesn't exist, it does not imply the resource will never exist. - Orosco
I agree that 404 seems better because it's logically "not found". However, according to the HTTP spec, 404 must not be retried automatically (the 4xx series has the generic definition "The client SHOULD NOT repeat the request without modifications."). Of course, if you have a custom service and custom client, you can bend the rules as you wish. I guess one option would be to respond 404 + Expires: <suitable time in future> to express that the resource doesn't exist right now but may exist after the expiry time. I think the exact interpretation of that is not fully defined, though. - Osis
@MikkoRantalainen FYI: SHOULD NOT is not the same as WILL NOT in terms of specifications. In the case above, it would be by design. - Orosco

I found this article proposing a new HTTP header, "When-Modified-After", that essentially does the same thing--the server waits and keeps the connection open until the resource is modified.

I prefer a version-based approach rather than a timestamp-based approach, since it's less prone to race conditions and gives you a little more information about what it is you're retrieving. Any thoughts on this approach?

Dasyure answered 2/1, 2009 at 3:43 Comment(1)
The version-based approach would use the ETag HTTP header: en.wikipedia.org/wiki/HTTP_ETag - Making
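For what it's worth, the version-based idea can be sketched without any HTTP library. `VersionedResource` below is a hypothetical in-memory stand-in for something like /game/17/playerToMove: each PUT bumps a version number, the version is exposed as the ETag, and a GET carrying a matching If-None-Match value gets 304 instead of the body. Only the ETag and If-None-Match header names and the 304 status come from HTTP; everything else is an assumption for illustration.

```python
from typing import Dict, Optional, Tuple


class VersionedResource:
    """A resource whose ETag is simply its version counter."""

    def __init__(self, body: str) -> None:
        self.body = body
        self.version = 1

    @property
    def etag(self) -> str:
        # ETag values are quoted strings in HTTP.
        return f'"{self.version}"'

    def put(self, body: str) -> None:
        """Replace the body and bump the version."""
        self.body = body
        self.version += 1

    def get(
        self, if_none_match: Optional[str] = None
    ) -> Tuple[int, Dict[str, str], str]:
        """Return (status, headers, body) for a conditional GET."""
        headers = {"ETag": self.etag}
        if if_none_match == self.etag:
            return 304, headers, ""  # client's copy is still current
        return 200, headers, self.body
```

A client would remember the last ETag it saw and send it back on each poll; a 200 then means the resource has changed since that version, with no dependence on clock timestamps.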
