How do we cache HTML "fragments"?

Asked 14/9, 2011 at 14:51 Answered 14/9, 2011 at 19:8

I have a page which looks like this:

<!doctype html>
<html>
<head></head>
<body>
    <div>Content 1000 chars</div>
    <div>Content 1000 chars</div>
    <div>Content 1000 chars</div>
</body>
</html>

When a client downloads the page, it downloads roughly 3100 characters. If the client visits the page again and the contents of the first div have changed, it has to redownload the entire page (3100 characters).

So I was wondering: are we able to cache HTML fragments the way we do with images?

In other words, is there some way to get this effect:

<!doctype html>
<html>
<head></head>
<body>
    <div src="page1.html"></div>
    <div src="page2.html"></div>
    <div src="page3.html"></div>
</body>
</html>

So if I were to change the contents of page1.html, the browser would know that only page1.html has changed since the last visit, and would download 1000 characters instead of the entire page (3100 characters). Essentially, this behavior is identical to what happens now with images:

<!doctype html>
<html>
<head></head>
<body>
    <img src="img1.gif">
    <img src="img2.gif">
    <img src="img3.gif">
</body>
</html>

whereby changing img1.gif causes the browser to redownload only img1.gif (assuming none of the other files have been edited).

To be clear, I'm not looking for an AJAX solution. I need a solution that works without JavaScript (as with all the above examples). I'm also not particularly in favor of the frames solution. However, I would accept that as an answer if there are simply no other alternatives / quirks / hacks.

Myer answered 14/9, 2011 at 14:51 Comment(3)

A page is a page, so, no, it's cached as an entire page. Is the issue that people need to refresh the page constantly? If not, you could update the page via AJAX so that only the new 'chunk' of HTML has to be downloaded. Alternatively, if this is data that can be syndicated, send it out as an RSS feed and let people update the status via an RSS reader. – Verbify 14/9, 2011 at 19:12

@Verbify I see where you are coming from. But to be clear, I'm not looking for an AJAX solution. I need a solution that works without javascript – Myer 14/9, 2011 at 20:0

There really isn't one. 'tis the nature of the web. – Verbify 14/9, 2011 at 20:4

Have you thought of IFrames?

However, I think this is such a micro-optimization that it wouldn't have any advantage (except for caching inside the server application, which is a completely different can o' worms).

(Or you're talking about way more than 3000 chars here.)

Edit: There is another mechanism, but no browser applies it to HTML documents without AJAX, and it only works in some server setups: HTTP Range requests. The client adds a Range header to ask the server for only a certain range of a document:

GET /large-document.html HTTP/1.1
Range: bytes=0-499

The response (206 Partial Content) will contain only the first 500 bytes, since byte ranges are inclusive. A server advertises support for this with the response header Accept-Ranges: bytes. This technique is used to resume aborted downloads, for example.

But as I said, this doesn't help you in your scenario. First, no browser supports this without AJAX (or outside the download manager). Second, the client has no idea which range to request, or where to put it in the already fetched document to replace the old part.
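
To make the byte-range mechanics concrete, here is a minimal Python sketch of how a server might resolve a single Range header of the form bytes=first-last against a document body. The helper name slice_for_range and the 3100-byte sample document are made up for illustration, and the sketch handles only the simple closed-range form, not suffix or multi-part ranges.

```python
from typing import Tuple


def slice_for_range(body: bytes, range_header: str) -> Tuple[int, bytes]:
    """Return (status, payload) for a single 'bytes=first-last' range.

    Hypothetical helper for illustration only; real servers also handle
    open-ended, suffix and multi-part ranges.
    """
    unit, _, spec = range_header.partition("=")
    first_s, _, last_s = spec.partition("-")
    if unit.strip() != "bytes" or not first_s.isdigit() or not last_s.isdigit():
        return 200, body  # unsupported form: ignore the header, send everything
    first, last = int(first_s), int(last_s)
    if first > last:
        return 200, body  # invalid range: ignore the header
    if first >= len(body):
        return 416, b""  # Range Not Satisfiable
    last = min(last, len(body) - 1)  # clamp to the end of the document
    return 206, body[first:last + 1]  # both ends of the range are inclusive


document = b"x" * 3100
status, payload = slice_for_range(document, "bytes=0-499")
print(status, len(payload))
```

Note that the client still has to know the offsets in advance, which is exactly the part a browser cannot work out for a changed block of HTML.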

If you really need to support legacy browsers down to IE3 and Netscape 2 and even old text-browsers like legacy Lynx versions, use the classic <frameset>, not an <iframe>. It is supported in basically everything since the olden days of Mosaic and was back then specifically designed for this task. (So it was the tool of choice back then when the browsers came out that you seek to support.)

Rimmer answered 14/9, 2011 at 14:54 Comment(9)

Thanks for the reply. I have thought of frames, but do not wish to use them because they are not supported in all browsers. I have 2 versions of my app, one that is for modern browsers, and one that is for antique browsers. This question is a problem I am facing with my "antique browser version", so no frames here would be necessary. – Myer 14/9, 2011 at 15:5

I think it would be reasonable to use iframes even for "antique" browsers. How old are you talking? IE seems to support them since version 4, Netscape since 6+, Mozilla 1+, Opera 5+ (ref). – Katrinka 14/9, 2011 at 15:51

@Matt Yes, I do need to support the IE4 group. Basically, I was not aware that iframes were that old. I guess I would accept this as an answer if there are no other alternatives / "hacks" – Myer 14/9, 2011 at 18:22

+1 - There is such a thing as <div src="page1.html"></div> only it's called <iframe src="page1.html"></iframe> – Enhance 14/9, 2011 at 19:40

@Rimmer yes the 3x1000 chars is simply to demonstrate the idea. The savings for the actual would be considerable (otherwise I wouldn't have bothered) – Myer 14/9, 2011 at 20:1

@Myer really? bad craziness... I tried looking at most browser breakdowns and it's not even included. wikipedia, which is the only one I found it on, actually lists it at 0% and IE5 even less than a tenth of a percentile. it's for that hardcore group of ppl that want to support everything! :P – Katrinka 14/9, 2011 at 21:0

@Matt I was referring to the IE4 "group", in other words browsers from around that era. Browser breakdowns often have a section called "others". Every 1% in that mass equates to 21 million people. There are people at 40+ who have no idea what Google Chrome is. See also en.wikipedia.org/wiki/Usage_share_of_web_browsers#Accuracy – Myer 15/9, 2011 at 5:23

See my edited answer. Tl;dr: Use the good ol' <frameset>, it was designed just for this case. – Rimmer 15/9, 2011 at 9:54

By the way: I assumed in my answer, that you are really only concerned about the download time/amount for a very large document. If you really worry about generating the respective block of HTML on server-side, the answer would look totally different. – Rimmer 15/9, 2011 at 9:57

The only way I can think of for achieving what you want without any JavaScript is using frames. There are, however, a number of disadvantages to frames, which you should be aware of before using them in your website.
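
What makes frames behave like the images in the question is that each frame document is a separate HTTP resource, so the browser can revalidate each one on its own. Here is a rough Python sketch of that revalidation logic, assuming the server derives an ETag from the content. The names etag_for and respond are illustrative only, and the three fragments stand in for the question's page1.html to page3.html.

```python
import hashlib
from typing import Dict, Optional, Tuple


def etag_for(content: bytes) -> str:
    """Derive a validator from the content (an assumption; servers may use mtime instead)."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


def respond(content: bytes, if_none_match: Optional[str]) -> Tuple[int, bytes]:
    """Return 304 with no body when the client's cached copy is still current."""
    if if_none_match == etag_for(content):
        return 304, b""
    return 200, content


# Three frame documents, standing in for page1.html .. page3.html.
pages: Dict[str, bytes] = {
    "page1.html": b"<div>old content</div>",
    "page2.html": b"<div>content 2</div>",
    "page3.html": b"<div>content 3</div>",
}

# First visit: the browser stores each body together with its ETag.
cached = {name: etag_for(body) for name, body in pages.items()}

# Only page1.html changes before the second visit.
pages["page1.html"] = b"<div>new content</div>"

# Second visit: the browser sends each stored ETag as If-None-Match.
for name, body in pages.items():
    status, payload = respond(body, cached[name])
    print(name, status, len(payload))
```

On the second visit only page1.html is sent again in full; the other two frames are answered with an empty 304 response.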

Israel answered 14/9, 2011 at 14:55 Comment(0)

Modern versions of Firefox and Chrome do this natively - they cache images and code whenever they can. In fact, the only way to get reloads is to clear cache at the browser level.

You might also want to look into reverse-proxy caching, which essentially does what you are doing on a site-wide basis to avoid DB traffic. Varnish is a good option that will cache pages and is highly customizable.

Quaker answered 14/9, 2011 at 18:45 Comment(1)

I am aware that they cache images and code whenever they can, but in the example demonstrated, the file is updated, the browser has no way of knowing the contents of the updated file unless it actually downloads that updated file. The updated file is definitely compressed (as with all files before they are sent over the wire). But beyond that, there's no way anyone could optimize the downloading process except by negotiating a better internet plan – Myer 14/9, 2011 at 20:8

I don't know if doing such optimizations is reasonable. Modern browsers accept compressed data (and modern servers send it), and text compresses really well. You have to use output buffering (e.g. see ob_start in PHP) so that the server doesn't send the page chunk by chunk in tiny pieces, but instead waits for the output to be ready, compresses it and sends it to the client, which then decompresses it.
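
As a rough illustration of how well markup compresses, the following Python sketch gzips a page built from repetitive sample HTML. The sample content is made up and far more repetitive than a real page, so the ratio it prints is optimistic.

```python
import gzip

# Made-up sample page: three blocks of repetitive markup (an assumption;
# real pages compress less dramatically).
block = "<div>" + "Lorem ipsum dolor sit amet. " * 35 + "</div>\n"
page = (
    "<!doctype html>\n<html><head></head>\n<body>\n"
    + block * 3
    + "</body>\n</html>\n"
).encode("utf-8")

# Compress the whole buffered page in one go, as a server would.
compressed = gzip.compress(page)
print(len(page), "bytes raw,", len(compressed), "bytes gzipped")

# Decompressing on the client side restores the exact original bytes.
restored = gzip.decompress(compressed)
print(restored == page)
```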

Using frames as a layout technique is highly discouraged nowadays (maybe iframes are sometimes a good solution, but it depends).

Homoeo answered 14/9, 2011 at 19:8 Comment(0)
