Change backend during retry in Varnish 4

Asked 18/2, 2016 at 0:11 Answered 4/10, 2023 at 16:32

I'd like to be able to change the backend on a retry in Varnish 4. We have this working in a different (older) application on Varnish 3, but I haven't been able to figure it out for v4 or find much documentation. The setup we want uses two directors: the initial request tries a local server in the same datacenter as Varnish, because that is much faster, and only if that fails do we pick randomly from a second director containing servers in other datacenters.

In v3, this was easy:

sub vcl_recv {
    if (req.restarts == 0) {
        set req.backend = defaultdirector;
    } else {
        set req.backend = backupdirector;
    }
}

# Then in vcl_fetch and/or vcl_error, something like:
sub vcl_fetch {
    if (beresp.status >= 500 && req.restarts < some_max) {
        return(restart);
    }
}

But now in v4, restart has supposedly been replaced with retry, with the entire documentation being:

In 3.0 it was possible to do return(restart) after noticing that the backend response was wrong, to change to a different backend.

This is now called return(retry), and jumps back up to vcl_backend_fetch.

This only influences the backend fetch thread, client-side handling is not affected.

Yet I still see a few people's example code containing return(restart) rather than return(retry), and not a single example of it working with the retry command.

I understand that Varnish should not have to redo all of the work in vcl_recv (such as stripping cookies), since only the communication with the backend failed. It makes sense to jump back to the backend fetch rather than repeat all the frontend processing. However, I get a compile error if I try to change the backend in vcl_backend_fetch. How do I make this work?

Freightage answered 18/2, 2016 at 0:11 Comment(0)

The official documentation is somewhat misleading. In fact, restart still exists: you can catch the error in vcl_deliver, restart, and then set the backend in vcl_recv using req.backend_hint:

sub vcl_recv {
    if (req.restarts == 0) {
        set req.backend_hint = defaultdirector.backend();
    } else {
        set req.backend_hint = backupdirector.backend();
    }
}

sub vcl_deliver {
    if (resp.status >= 500 && req.restarts < some_max) {
        return(restart);
    }
}

Or, if it suits your case better, you can use retry, which jumps from vcl_backend_response back to vcl_backend_fetch:

sub vcl_backend_fetch {
    if (bereq.retries > 0) {
        set bereq.backend = backupdirector.backend();
    }
}

sub vcl_backend_response {
    if (beresp.status >= 500 && bereq.retries < some_max) {
        return(retry);
    }
}
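
Both variants assume that defaultdirector and backupdirector have already been declared. A minimal sketch of what that could look like in Varnish 4 with the bundled directors vmod. The backend names, hosts, and ports are placeholders and not taken from the question, and the choice of round_robin for the local director is just one option:

```vcl
vcl 4.0;

import directors;

# Hypothetical backends; replace hosts and ports with real ones.
backend local1  { .host = "192.0.2.10"; .port = "8080"; }
backend remote1 { .host = "192.0.2.20"; .port = "8080"; }
backend remote2 { .host = "192.0.2.21"; .port = "8080"; }

sub vcl_init {
    # Servers in the same datacenter as Varnish, tried first.
    new defaultdirector = directors.round_robin();
    defaultdirector.add_backend(local1);

    # Servers in other datacenters, picked at random with equal weight.
    new backupdirector = directors.random();
    backupdirector.add_backend(remote1, 1.0);
    backupdirector.add_backend(remote2, 1.0);
}
```

With round_robin and random directors, backend() takes no argument, which matches the calls shown above.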

Gyno answered 4/4, 2016 at 22:6 Comment(5)

Partially correct. As it turns out, the varnish documentation isn't just misleading, it's flat-out wrong. Restart was not replaced by retry. Restart still exists, and functions as you noted - it just has to go in vcl_deliver for v4 rather than vcl_fetch which worked in v3. As for your second suggestion about using vcl_backend_fetch, that doesn't work at all (and was the source of my frustration, since that's what the documentation suggests should work). It generates the compiler error "'req.backend_hint': cannot be set in method 'vcl_backend_fetch'". – Freightage 5/4, 2016 at 16:33

Sorry, I made a mistake in the second example (I used req instead of bereq when setting the backend); it is now fixed. Note that on bereq you should use bereq.backend to set the backend. – Gyno 6/4, 2016 at 20:39

Ok, that at least compiles. I haven't tested to see if it works properly or if the bereq.retries counter is properly incremented (and not reset if the backend changes), but it's a start. Of note, and to hopefully save someone else hours of needless frustration, notice that the syntax in vcl_recv is req.backend_hint = director_name.backend(); but in vcl_backend_fetch, it's bereq.backend = director_name.backend(); – Freightage 8/4, 2016 at 13:39

Indeed, the syntax for directors changed in v4. But beware that, depending on the director type, the backend() method might take a mandatory parameter. – Gyno 9/4, 2016 at 20:3
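
To illustrate the mandatory parameter: the hash director from the directors vmod needs the string to hash on when backend() is called. A small sketch, assuming a hypothetical hashdirector and placeholder backends that do not appear in the thread:

```vcl
vcl 4.0;

import directors;

# Hypothetical backends; replace hosts and ports with real ones.
backend remote1 { .host = "192.0.2.20"; .port = "8080"; }
backend remote2 { .host = "192.0.2.21"; .port = "8080"; }

sub vcl_init {
    new hashdirector = directors.hash();
    hashdirector.add_backend(remote1, 1.0);
    hashdirector.add_backend(remote2, 1.0);
}

sub vcl_backend_fetch {
    if (bereq.retries > 0) {
        # Unlike round_robin or random, hash.backend() requires an argument:
        # here the backend is chosen from a hash of the request URL.
        set bereq.backend = hashdirector.backend(bereq.url);
    }
}
```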

Note that the test on bereq.retries is useless: it will stop automatically after max_retries (see varnish-cache.org/docs/4.1/reference/varnishd.html#max-retries) – Valina 15/5, 2017 at 18:29

I'm writing another answer because there is a case that was not dealt with in @komuta's answer. Funnily enough, I added a comment about it back in 2017...

The missing case is when the backend does not answer at all. In that case vcl_backend_response is never called, since by definition there is no response; vcl_backend_error is called instead.

sub vcl_recv {
    set req.backend_hint = defaultdirector.backend();
}
 
sub vcl_backend_fetch {
    // if there was a call to the first director, which failed, then we're back with a (retry) call, now we change the director:
    if (bereq.retries > 0) {
        set bereq.backend = backupdirector.backend();
    }

    return (fetch);
}

sub vcl_backend_response {
    // for example if you don't expect an HTTP 500 error, then let's retry!
    if (beresp.status == 500 && bereq.method == "GET") {
        return (retry);
    }

    [...]

    return (deliver);
}

sub vcl_backend_error {
    // here we've had an error with the backend, possibly a network error, or webserver crash, so we want to retry it!
    return (retry);
}

sub vcl_synth {
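    // this is where we end up once bereq.retries hits max_retries (defaults to 4)

    // on Varnish 4.x the body is set with synthetic(); newer releases also accept: set resp.body = {"oops..."};
    synthetic({"oops..."});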

    return (deliver);
}

Valina answered 4/10, 2023 at 16:32 Comment(0)
