Varnish and ESI, how is the performance?

Asked 11/5, 2011 at 7:17 Answered 17/4, 2018 at 23:6

Solved caching reverse-proxy varnish server-side-includes edge-side-includes

I'm wondering how the ESI module performs nowadays. I've read some posts on the web claiming that ESI on Varnish was actually slower than the real thing.

Say I had a page with over 3500 ESI includes: how would it perform? Is ESI designed for such usage?

Hayrick answered 11/5, 2011 at 7:17 Comment(5)

I can think of a way to find out! Make a page with 3500 includes and benchmark it, I for one would be very interested in the results :) – Foretopmast 20/5, 2011 at 8:31

I would gladly do it, but I'm very new to Varnish and I think such a benchmark should be performed by a pro. – Hayrick 20/5, 2011 at 8:57

Why do you want 3500 includes in one page? Just trying to imagine such a use case. – Socio 22/5, 2011 at 12:19

I was thinking along the lines of JSON documents, specifically large documents, where one could link different "subdocuments" together with esi:includes. Say you have a document that gives you a list of employees, but it only gives you the ID of each employee and nothing more. Then with ESI you could make it include the employee information based on the ID. – Hayrick 22/5, 2011 at 12:32

I'd probably include the fetch as a single request to the necessary list of employees instead of making it an iteration over each one. – Isologous 22/9, 2011 at 12:38

We're using Varnish and ESI to embed sub-documents into JSON documents. Basically a response from our app-server looks like this:

[
  <esi:include src="/station/best_of_80s" />,
  <esi:include src="/station/herrmerktradio" />,
  <esi:include src="/station/bluesclub" />,
  <esi:include src="/station/jazzloft" />,
  <esi:include src="/station/jahfari" />,
  <esi:include src="/station/maximix" />,
  <esi:include src="/station/ondalatina" />,
  <esi:include src="/station/deepgroove" />,
  <esi:include src="/station/germanyfm" />,
  <esi:include src="/station/alternativeworld" />
]
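
For illustration only, here is a minimal Python sketch of how an app server could emit a template like the one above. The function name `build_esi_template` and the `/station/` prefix argument are made up for this example (the answer does not say how the real app server builds it), and the slugs are assumed to be URL-safe already.

```python
def build_esi_template(station_slugs, prefix="/station/"):
    """Return a JSON array body whose elements are ESI include tags.

    Hypothetical helper: assumes every slug is already URL-safe and that
    each included resource is a complete JSON value on its own.
    """
    includes = [
        f'  <esi:include src="{prefix}{slug}" />' for slug in station_slugs
    ]
    # Join with commas so that, after Varnish substitutes each include,
    # the assembled body is a valid JSON array.
    return "[\n" + ",\n".join(includes) + "\n]"


if __name__ == "__main__":
    print(build_esi_template(["best_of_80s", "herrmerktradio", "bluesclub"]))
```

The template itself is not valid JSON; only the assembled response is, once every include has been replaced by its sub-document.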

The included resources are complete and valid JSON responses on their own. The complete list contains about 1070 stations. So when the cache is cold and the first request is for the complete station list, Varnish issues more than 1000 requests to our backend. When the cache is hot, ab looks like this:

$ ab -c 100 -n 1000 http://127.0.0.1/stations
[...]

Document Path:          /stations
Document Length:        2207910 bytes

Concurrency Level:      100
Time taken for tests:   10.075 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      2208412000 bytes
HTML transferred:       2207910000 bytes
Requests per second:    99.26 [#/sec] (mean)
Time per request:       1007.470 [ms] (mean)
Time per request:       10.075 [ms] (mean, across all concurrent requests)
Transfer rate:          214066.18 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1   11   7.3      9      37
Processing:   466  971  97.4    951    1226
Waiting:        0   20  16.6     12      86
Total:        471  982  98.0    960    1230

Percentage of the requests served within a certain time (ms)
  50%    960
  66%    985
  75%    986
  80%    988
  90%   1141
  95%   1163
  98%   1221
  99%   1229
 100%   1230 (longest request)
$

100 req/sec doesn't look that good, but consider the size of the document. 214066 Kbytes/sec more than saturates a 1 Gbit interface.
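
To make the bandwidth comparison concrete, here is a small conversion sketch. It assumes ab's "Kbytes" are units of 1024 bytes; the function name is invented for this example.

```python
def kbytes_per_sec_to_gbit(kbytes_per_sec, bytes_per_kbyte=1024):
    """Convert a transfer rate in Kbytes/sec to Gbit/s (10**9 bits)."""
    return kbytes_per_sec * bytes_per_kbyte * 8 / 1e9


# Transfer rate reported by ab in the run above.
rate = kbytes_per_sec_to_gbit(214066.18)
print(f"{rate:.2f} Gbit/s")  # prints "1.75 Gbit/s"
```

Note that the benchmark ran against 127.0.0.1, so the figure shows what Varnish can push locally rather than what a physical 1 Gbit link could carry.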

A single request with a warm cache (ab -c 1 -n 1 ...) shows 83 ms/req.

The backend itself is Redis-based. We're measuring a mean response time of 0.9ms [sic] in NewRelic. After restarting Varnish, the first request with a cold cache (ab -c 1 -n 1 ...) shows 3158 ms/req. This means it takes Varnish and our backend about 3 ms per ESI include when generating the response (3158 ms spread over roughly 1070 includes). This is a standard Core i7 pizza box with 8 cores, and I measured this while it was under full load. We're serving about 150 million req/month this way with a hit rate of 0.9. These numbers do suggest that the ESI includes are resolved serially.
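
The per-include estimate can be reproduced from the numbers quoted above. This is just the arithmetic, using 1070 as the include count (the answer says "about 1070"); the function name is invented for this example.

```python
def per_include_ms(total_ms, include_count):
    """Average time per ESI include if includes are fetched one by one."""
    return total_ms / include_count


COLD_MS = 3158      # first request after a Varnish restart
WARM_MS = 83        # same request with a warm cache
INCLUDES = 1070     # approximate number of stations

# Whole cold request divided by the number of includes.
print(round(per_include_ms(COLD_MS, INCLUDES), 2))            # 2.95
# Same, after subtracting the warm-cache assembly time.
print(round(per_include_ms(COLD_MS - WARM_MS, INCLUDES), 2))  # 2.87
```

Either way the result is close to 3 ms per include, which is what you would expect if each backend fetch has to finish before the next one starts.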

What you have to consider when designing a system like this is 1) that your backend is able to take the load after a Varnish restart, when the cache is cold, and 2) that your resources usually shouldn't all expire at once. In the case of our stations, they expire every full hour, but we add a random value of up to 120 seconds to the expiration header.
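
A minimal sketch of that expiry jitter, assuming the backend sets a plain HTTP Expires header. The function names are invented for this example and this is not our actual backend code.

```python
import random
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def jittered_expiry(now, max_jitter_seconds=120, rng=random):
    """Next full hour after `now`, plus a random offset of up to 120 s."""
    next_hour = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    jitter = rng.uniform(0, max_jitter_seconds)
    return next_hour + timedelta(seconds=jitter)


def expires_header(now, **kwargs):
    """Format the jittered expiry as an HTTP date; `now` must be UTC-aware."""
    return format_datetime(jittered_expiry(now, **kwargs), usegmt=True)


if __name__ == "__main__":
    print("Expires:", expires_header(datetime.now(timezone.utc)))
```

Because each resource draws its own offset, the refetches after the hour are spread over a two-minute window instead of hitting the backend all at once.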

Hope that helps.

Morphine answered 28/3, 2012 at 19:51 Comment(2)

I just remembered we had issues with the 3.0.0 release of Varnish and more than 250 ESI includes. Be sure to use 3.0.2 or up. – Morphine 28/3, 2012 at 19:57

Very good point on expiration time, your solution there is very clever. Definitely stealing that! – Foretopmast 9/7, 2012 at 10:29

This isn't first-hand, but I'm led to believe that Varnish's current ESI implementation serialises include requests; i.e., they're not concurrent.

If that's the case, it would indeed suck for performance in the case you mention.

I'll try to get someone with first-hand experience to comment.

Hypoploid answered 23/5, 2011 at 6:29 Comment(3)

That is actually true; it is a known deficiency in Varnish's implementation of ESI. – Biology 7/2, 2013 at 20:36

I can confirm it with a practical test on a PHP Symfony 2.6 installation: a simple main controller/view renders (with render_esi) two "child" controllers, each of which has a "sleep(n)" pause. The page is rendered sequentially: first the parent view, then the first esi:include tag, then the second. In my example, only the parent page has a shared-max-age of 600 and, as expected, the two children are never cached. I expected them to be fetched concurrently, but they are actually fetched sequentially. – Flooded 12/12, 2014 at 18:14

cernio - what version of varnish? – Hypoploid 13/12, 2014 at 21:44

Parallel ESI requests are available in the **commercial** version of Varnish: https://www.varnish-software.com/plus/parallel-esi/. Fetching the fragments in parallel apparently makes the assembly of a page composed of multiple fragments faster.
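
As a rough back-of-the-envelope model of why that would help (not a description of how Varnish schedules anything), the sketch below compares fetching fragments one after another with fetching them on a fixed number of concurrent workers. The 3 ms latency and 1070 fragments are taken from the other answer; the worker count of 10 is a made-up number.

```python
import heapq


def serial_time(latencies):
    """Total time when each fragment fetch waits for the previous one."""
    return sum(latencies)


def parallel_time(latencies, workers):
    """Total time when fetches are handed to the next free of `workers`."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    free_at = [0.0] * workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    for latency in latencies:
        start = heapq.heappop(free_at)          # earliest idle worker
        heapq.heappush(free_at, start + latency)
    return max(free_at)


if __name__ == "__main__":
    fragments = [3.0] * 1070  # roughly 3 ms per include, cold cache
    print(serial_time(fragments), "ms serial")
    print(parallel_time(fragments, workers=10), "ms with 10 workers")
```

This ignores connection limits, backend contention and the cost of stitching the response together, so treat the numbers as an upper bound on the possible gain.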

(this would be a comment but I have insufficient reputation to do that)

Stowers answered 17/4, 2018 at 23:6 Comment(0)
