HaProxy Transparent Proxy To AWS S3 Static Website Page

I am using HAProxy to balance a cluster of servers, and I am trying to add a maintenance page to the HAProxy configuration. I believe I can do this by declaring a server in the backend with the 'backup' modifier. My question is: how can I use a maintenance page hosted remotely in an AWS S3 bucket (static website) without actually redirecting the user to that page (i.e. without the HAProxy server 'redir' option)?

Say I have servers a, b, and c. If all of them go down for maintenance, I want all requests to be handled by server d (which is labeled 'backup'), pointing to a static address on S3. Note that I don't want request paths to carry over and be evaluated on S3; it should always render the static maintenance page.

Higgledypiggledy answered 12/4, 2016 at 23:59 Comment(0)

This is definitely possible.

First, declare a backup server, which will only be used if the non-backup servers are down.

server s3-fallback example.com.s3-website-us-east-1.amazonaws.com:80 backup

The following configuration entries are used to modify the request or the response only if we're using the alternate path. We're using two tests in the following examples:

# { nbsrv le 1 } -- if the number of servers in this backend is <= 1
# (and)
# { srv_is_up(s3-fallback) } -- if the server named "s3-fallback" is up; "server name" is the arbitrary name we gave the server in the config file
# (which would mean it's the "1" server that is up for this backend)

So, now that we have a backup server, we need a couple of other directives.

Force the path to / regardless of the request path.

http-request set-path / if { nbsrv le 1 } { srv_is_up(s3-fallback) }

If you're using an essentially empty bucket with an error document, then this isn't really needed, since any request path would generate the same error.

Next, we need to set the Host: header in the outgoing request to match the name of the bucket. This isn't technically needed if the bucket is named the same as the Host: header that's already present in the request we received from the browser, but probably still a good idea. If the bucket name is different, it needs to go here.

http-request set-header host example.com if { nbsrv le 1 } { srv_is_up(s3-fallback) }

If the bucket name is not a valid DNS name, then you should include the entire web site endpoint here. For a bucket called "example" --

http-request set-header host example.s3-website-us-east-1.amazonaws.com if { nbsrv le 1 } { srv_is_up(s3-fallback) }

If your clients are sending you their cookies, there's no need to relay these to S3. If the clients are using HTTPS and the S3 connection is HTTP, you definitely want to strip these.

http-request del-header cookie if { nbsrv le 1 } { srv_is_up(s3-fallback) }

Now, handling the response...

You probably don't want browsers to cache the responses from this alternate back-end.

http-response set-header cache-control no-cache if { nbsrv le 1 } { srv_is_up(s3-fallback) }

You also probably don't want to return "200 OK" for these responses, since technically, you are displaying an error page, and you don't want search engines to try to index this stuff. Here, I've chosen "503 Service Unavailable" but any valid response code would work... 500 or 502, for example.

http-response set-status 503 if { nbsrv le 1 } { srv_is_up(s3-fallback) }

And, there you have it -- using an S3 bucket website endpoint as a backup backend, behaving no differently than any other backend. No browser redirect.
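
Putting the pieces above together, a backend might look roughly like the sketch below. The backend name `app`, the servers `a`, `b`, `c`, and their 192.0.2.x addresses are placeholders, not anything from the question. The two tests are given names (`few_servers`, `s3_up`) only to avoid repeating the anonymous ACLs on every line; the conditions themselves are the same as in the individual directives.

```haproxy
backend app
    mode http

    # Placeholder application servers; names and addresses are assumptions.
    server a 192.0.2.11:80 check
    server b 192.0.2.12:80 check
    server c 192.0.2.13:80 check

    # Only receives traffic when no non-backup server is usable.
    server s3-fallback example.com.s3-website-us-east-1.amazonaws.com:80 backup

    # The same two tests as above, named for reuse.
    acl few_servers nbsrv le 1
    acl s3_up       srv_is_up(s3-fallback)

    # Request side: fixed path, bucket name as Host, no cookies sent to S3.
    http-request set-path /                     if few_servers s3_up
    http-request set-header host example.com    if few_servers s3_up
    http-request del-header cookie              if few_servers s3_up

    # Response side: not cacheable, reported as an error.
    http-response set-header cache-control no-cache if few_servers s3_up
    http-response set-status 503                    if few_servers s3_up
```

Listing two ACL names after `if` combines them with an implicit AND, which matches the pair of anonymous ACLs used in the one-line examples.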

You could also configure the request to S3 to use HTTPS, but since you're just fetching static content, that seems unnecessary. (S3 website endpoints only accept HTTP, so this would mean pointing the server at the bucket's REST endpoint instead.) If the browser is connecting to the proxy with HTTPS, that section of the connection will still be secure, although you do need to scrub anything sensitive from the browser's request, since it will be forwarded to S3 unencrypted (see "cookie," above).

This solution is tested on HAProxy 1.6.4.


Note that by default, the DNS lookup for the S3 endpoint will only be done when HAProxy is restarted. If that IP address changes, HAProxy will not see the change, without additional configuration -- which is outside the scope of this question, but see the resolvers section of the configuration manual.
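
As a rough illustration of that additional configuration, the sketch below declares a `resolvers` section and attaches it to the backup server. The section name `awsdns`, the nameserver label `vpc`, its 10.0.0.2 address, and the timing values are all assumptions to adapt to your environment. In the 1.6 series, run-time resolution is driven by health checks, which is why `check` is added to the server line here.

```haproxy
resolvers awsdns
    # Placeholder nameserver; use a resolver reachable from the proxy.
    nameserver vpc 10.0.0.2:53
    resolve_retries 3
    timeout retry   1s
    # How long a valid answer is trusted before it is looked up again.
    hold valid      10s

backend app
    # ... application servers as before ...
    server s3-fallback example.com.s3-website-us-east-1.amazonaws.com:80 backup check resolvers awsdns resolve-prefer ipv4
```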


I do use S3 as a back-end server behind HAProxy in several different systems, and I find this to be an excellent solution to a number of different issues.

However, there is a simpler way to have a custom error page for use when all the backends are down, if that's what you want.

errorfile   503 /etc/haproxy/errors/503.http

This directive is usually found in the defaults section, but it's also valid in a backend -- so this raw file will be automatically returned by the proxy for any request that tries to use this back-end, if all of the servers in this back-end are unhealthy.
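
For instance, a backend-level override could be sketched as follows. The backend name, server names, and 192.0.2.x addresses are placeholders; only the `errorfile` line comes from the text above.

```haproxy
backend app
    mode http

    # Placeholder application servers; names and addresses are assumptions.
    server a 192.0.2.11:80 check
    server b 192.0.2.12:80 check
    server c 192.0.2.13:80 check

    # Served by HAProxy itself when no server in this backend is available.
    errorfile 503 /etc/haproxy/errors/503.http
```

With this approach no backup server is declared, so nothing is fetched from S3; the page content lives entirely in the local file.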

The file is a raw HTTP response. It's essentially just written out to the client as it exists on the disk, with zero processing, so you have to include the desired response headers, including Connection: close. Each header line, and the blank line after the headers, must end with \r\n to be a valid HTTP response. You can also just copy one of the stock error files that ship with HAProxy and modify it as needed.

These files are limited by the size of a response buffer, which I believe is set by tune.bufsize and defaults to 16,384 bytes, so this is only really good for small files.

HTTP/1.0 503 Service Unavailable\r\n
Cache-Control: no-cache\r\n
Connection: close\r\n
Content-Type: text/plain\r\n
\r\n
This site is offline.

Finally, note that although you want to "transparently proxy a request," I don't think "transparent proxy" is the correct phrase for what you're trying to do. A "transparent proxy" implies that the client, the server, or both would see each other's IP addresses on the connection and think they were communicating directly, with no proxy in between, because of some skullduggery done by the proxy and/or the network infrastructure to conceal the proxy's existence in the path. This is not what you're looking for.

Besant answered 13/4, 2016 at 14:15 Comment(1)
There may be an issue with this answer requiring a modification to the logic used when determining whether the backup server is active. Further testing suggests that the behavior of the nbsrv fetch does not seem quite as expected. It's unclear to me so far but this value may only count non-backup servers, so nbsrv eq 0 may actually be correct, leading to incorrect behavior when 1 non-backup back-end server remains online. Will update after further review. - Besant
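
One way to sidestep the ambiguity raised in this comment, offered only as an untested sketch, is to keep the S3 server out of the application backend entirely and switch backends in the frontend when the application backend has no usable server. The names `www`, `app`, `maintenance`, and `s3`, the `bind` address, and the 192.0.2.x addresses are assumptions.

```haproxy
frontend www
    mode http
    bind :80

    # True when the application backend has no usable server left.
    acl app_down nbsrv(app) lt 1
    use_backend maintenance if app_down
    default_backend app

backend app
    mode http
    # Placeholder application servers; no backup server is declared here,
    # so nbsrv(app) reflects only these three.
    server a 192.0.2.11:80 check
    server b 192.0.2.12:80 check
    server c 192.0.2.13:80 check

backend maintenance
    mode http
    # Every request reaching this backend goes to S3, so the rewrites
    # need no conditions.
    http-request set-path /
    http-request set-header host example.com
    http-request del-header cookie
    http-response set-header cache-control no-cache
    http-response set-status 503
    server s3 example.com.s3-website-us-east-1.amazonaws.com:80
```

Because the S3 server is the only server in `maintenance`, the question of whether `nbsrv` counts backup servers no longer affects the result.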
