Wildcard subdomains point to appropriate S3/CloudFront subdirectories

I need multiple subdomains to point to individual buckets/subdirectories on Amazon S3 (synced to a CloudFront distribution), where I'm hosting some static files.

So that ANY

SUBDOMAINNAME.example.com

automatically points to

s3.amazonaws.com/somebucket/SUBDOMAINNAME

or

somedistributionname.cloudfront.net/SUBDOMAINNAME

Is there a way to accomplish this without running a server for redirection?

Can it be done without changing DNS records for each new subdomain or, if not, adding the DNS rules programmatically?

What is the most efficient way of doing it in terms of resource usage? (There might be hundreds of subdomains, each with hundreds of daily requests.)

Stocker answered 4/6, 2016 at 20:41 Comment(0)

Update: this answer was correct when written, and the techniques described below are still perfectly viable, but they are potentially less desirable now that Lambda@Edge can be used to accomplish this objective, as I explained in my answer to Serving a multitude of static sites from a wildcard domain in AWS.


No, there is no way to do this automatically.

Is there a way to accomplish this without running a server for redirection?

Technically, it isn't redirection that you'd need to accomplish this. You'd need path rewriting, and that's why the answer to your ultimate question is "no" -- because Route 53 (and DNS in general) can't do anything related to paths.

Route 53 does support wildcard DNS, but that's of limited help without CloudFront and/or S3 supporting a mechanism to put the host header from the HTTP request into the path (which they don't).

Now, this could easily be accomplished in a "zero-touch" mode with a single Route 53 * wildcard entry, a single CloudFront distribution configured for *.example.com, and one or more EC2 instances running HAProxy to do the request path rewriting and proxy the request onward to the S3 bucket. A single line in a basic configuration file would accomplish that request rewrite:

http-request set-path /%[req.hdr(host)]%[path] 

Then you'd need the proxy to send the actual bucket endpoint hostname to S3, instead of the hostname supplied by the browser:

http-request set-header Host example-bucket.s3.amazonaws.com

The proxy would send the modified request to S3 and return S3's response to CloudFront, which would then return it to the browser.
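
A minimal haproxy.cfg sketch of that proxy tier, putting those two lines in context, might look something like this (the frontend/backend names, bucket name, and port are placeholders):

frontend cloudfront-origin
    bind *:80
    mode http
    default_backend s3_rewrite

backend s3_rewrite
    mode http
    # prepend the browser's Host header to the path before that header is overwritten below
    http-request set-path /%[req.hdr(host)]%[path]
    # then present the bucket's own endpoint hostname to S3
    http-request set-header Host example-bucket.s3.amazonaws.com
    server s3 example-bucket.s3.amazonaws.com:80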

However, if you don't want to take this approach, since a server would be required, then the alternative solution looks like this:

  • Configure a CloudFront distribution for each subdomain, setting the alternate domain name for the distribution to match the specific subdomain.

  • Configure the Origin for each subdomain's distribution to point to the same bucket, setting the origin path to /one-specific-subdomain.example.com. CloudFront will change a request for GET /images/funny-cat.jpg HTTP/1.1 to GET /one-specific-subdomain.example.com/images/funny-cat.jpg HTTP/1.1 before sending the request to S3, resulting in the behavior you described. (This is the same net result as the behavior I described for HAProxy, but it is static, not dynamic, hence one distribution per subdomain; in neither case would this be a "redirect" -- so the address bar would not change).

  • Configure an A-record Alias in Route 53 for each subdomain, pointing to the subdomain's specific CloudFront distribution.

This can all be done programmatically through the APIs, using any one of the SDKs, or using aws-cli, which is a very simple way to test, prototype, and script such things without writing much code. CloudFront and Route 53 are both fully automation-friendly.
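
For example, here is a minimal sketch of the Route 53 step using the AWS SDK for JavaScript (v2); the hosted zone ID, subdomain, and distribution domain name are placeholders, and Z2FDTNDATAQYW2 is the fixed hosted zone ID that Route 53 uses for all CloudFront alias targets:

const AWS = require('aws-sdk');
const route53 = new AWS.Route53();

// Point one subdomain at its CloudFront distribution (placeholder values throughout)
function addSubdomainAlias(subdomain, distributionDomainName, hostedZoneId) {
  return route53.changeResourceRecordSets({
    HostedZoneId: hostedZoneId, // the hosted zone for example.com
    ChangeBatch: {
      Changes: [{
        Action: 'UPSERT',
        ResourceRecordSet: {
          Name: subdomain + '.example.com',
          Type: 'A',
          AliasTarget: {
            HostedZoneId: 'Z2FDTNDATAQYW2',  // fixed zone ID for all CloudFront alias targets
            DNSName: distributionDomainName, // e.g. d111111abcdef8.cloudfront.net
            EvaluateTargetHealth: false
          }
        }
      }]
    }
  }).promise();
}

Creating the per-subdomain distribution itself (createDistribution in the same SDK, or aws cloudfront create-distribution) and setting its origin path can be scripted the same way.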

Note that there is no significant disadvantage to each site using its own CloudFront distribution, because your hit ratio will be no different, and distributions do not have a separate charge -- only request and bandwidth charges.

Note also that CloudFront has a default limit of 200 distributions per AWS account, but this is a soft limit that can be increased by sending a request to AWS support.

Felic answered 5/6, 2016 at 1:26 Comment(5)
Thanks for a detailed answer! I will probably use your first "zero-touch" method, since creating separate CF distributions seems inefficient. So, using a server with HAProxy with path rewriting wouldn't add any considerable resource overhead or response latency, would it?Stocker
Also, if my CloudFront distribution origin is set to an S3 bucket, would I still need HAProxy to send a request to S3 as you suggested and then to CloudFront? Can't I just proxy it straight to the CloudFront distribution?Stocker
CloudFront sends the request to HAProxy, which would be what you'd set as the origin server in CloudFront. The proxy then sends the request to S3 after modifying it. As long as the proxy is in the same region as S3 the additional latency would be very small. You wouldn't want the proxy before CloudFront, because that would cause all requests to use the CloudFront edge nearest to the proxy, instead of using the one nearest to the browser making the request, defeating the advantage of CloudFront caching content nearest to the users.Felic
The CloudFront cache behavior would also need to be configured to whitelist the Host header for forwarding to the proxy, otherwise the proxy won't know the hostname originally sent by the browser to CloudFront.Felic
@Clintm your comment would have been sufficient. No downvote was necessary. The accepted answer can't be deleted or moved down the page, but I've made an edit.Felic

Since the introduction of Lambda@Edge, this can be done with a Lambda function triggered by the CloudFront "Viewer Request" event.

Here is an example of such a Lambda function where a request like foo.example.com/index.html will return the file /foo/index.html from your origin.

You will need a CloudFront distribution with the alternate domain name (CNAME) *.example.com, and an A record for *.example.com pointing to it.

exports.handler = (event, context, callback) => {
  const request = event.Records[0].cf.request;
  const subdomain = getSubdomain(request);
  if (subdomain) {
    // Prefix the subdomain onto the request path, e.g. /index.html -> /foo/index.html
    request.uri = '/' + subdomain + request.uri;
  }
  callback(null, request);
};

// Returns everything before the last two labels of the Host header
// (e.g. "foo" for foo.example.com), or undefined for the bare domain.
function getSubdomain(request) {
  const hostItem = request.headers.host.find(item => item.key === 'Host');
  const reg = /(?:(.*?)\.)[^.]*\.[^.]*$/;
  const [_, subdomain] = hostItem.value.match(reg) || [];
  return subdomain;
}

As for the costs, take a look at Lambda pricing; at the current rates it works out to about $0.913 per million requests.
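As a rough illustration at the scale mentioned in the question, 300 subdomains receiving 300 requests per day each is about 2.7 million requests per month, which at that rate would come to roughly $2.50 per month before any duration charges.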

Alleris answered 12/2, 2018 at 11:58 Comment(1)
Has anyone been able to get this approach working using CloudFront Functions and not Lambda@Edge? When I expose the Host header to my CloudFront Function to grab the subdomain, I run into trouble with a SignatureDoesNotMatch error.Douce

A wildcard works on S3. I just put an A record * that points to an IP and it worked.

Incommunicable answered 10/10, 2016 at 2:45 Comment(2)
I've suggested an edit. In the future please try to use proper capitalization and grammar.Copulative
Doesn't answer the questionDefelice
