What's the best practice for getting EC2 instances to join HAProxy automatically?

We're working on scaling out our EC2 architecture to a point where we'd like to manage our own load balancing. We currently have a series of machines configured on HAProxy to do basic load balancing, but we're looking for the 'best practice' means to have a new instance come online and automatically (or nearly automatically) join HAProxy.

Ideally, we'd monitor load on our systems, or rely on a few years' worth of analytics data to work out a rough schedule. When we reach a threshold or a scheduled time, a process would fire up a new instance, and that new node would connect to a service on our HAProxy machine to write its hostname into the config and reload HAProxy so it becomes part of the pool.

We're considering Amazon's ELB once we grow big enough to need multiple zone coverage, but until then, we need a simple setup that can add/remove machines from HAProxy.

I know there are services out there that we can pay to manage this stuff, but Scalr seems to limit us to very specific instance types, and Rightscale is too expensive, so like many others, we're looking to roll our own solution.

Unfortunately, those who roll their own solution seem to be a little hush-hush on their process.

Albanian answered 6/7, 2011 at 23:8 Comment(2)
What were your issues with Scalr? You can use custom instance types, as long as you install the Scalr client. We're using their nginx load balancer and it has been working great so far. Vladi
Ah, didn't know Scalr would allow custom builds. My glance at their offering looked like they had preconfigured instances you had to use to work properly. Albanian

You don't need to over-think this solution ;)

You can simply "pre-configure" servers in your HAProxy configuration file. They will appear "down" and will never receive requests until you actually bring them online.

Here's an example, assuming you only have 5 machines online, and expect to have 10 in the next 2 years:

listen web *:80
    balance source
    server  web1 192.168.0.101:80 check inter 2000 fall 3
    server  web2 192.168.0.102:80 check inter 2000 fall 3
    server  web3 192.168.0.103:80 check inter 2000 fall 3
    server  web4 192.168.0.104:80 check inter 2000 fall 3
    server  web5 192.168.0.105:80 check inter 2000 fall 3
    server  web6 192.168.0.106:80 check inter 2000 fall 3
    server  web7 192.168.0.107:80 check inter 2000 fall 3
    server  web8 192.168.0.108:80 check inter 2000 fall 3
    server  web9 192.168.0.109:80 check inter 2000 fall 3
    server  web10 192.168.0.110:80 check inter 2000 fall 3

With this config, you won't need to restart HAProxy or do any kind of ugly hacks for at least a year (unless you need more than 10 servers; in that case, just add 100 and you'll be set).

You can also write a quick shell script to automatically generate this configuration. In fact, you SHOULD write a script for that if you're adding 100 servers to your pool.
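As a rough sketch of such a generator (written here in Python rather than shell, purely for illustration), something like the following would emit the block shown above. The function names `render_servers` and `render_listen_block` are made up for this example, and it assumes the servers sit on consecutive IPs starting at 192.168.0.101, which is just how the sample config happens to be laid out.

```python
import ipaddress


def render_servers(first_ip, count, port=80, prefix="web",
                   options="check inter 2000 fall 3"):
    """Return one HAProxy 'server' line per backend.

    Assumption: backends use consecutive IPv4 addresses starting at first_ip.
    """
    base = ipaddress.IPv4Address(first_ip)
    lines = []
    for i in range(count):
        address = str(base + i)  # IPv4Address + int gives the next address
        lines.append(f"    server  {prefix}{i + 1} {address}:{port} {options}")
    return lines


def render_listen_block(first_ip, count):
    """Return the whole 'listen' section as a single string."""
    header = ["listen web *:80", "    balance source"]
    return "\n".join(header + render_servers(first_ip, count))


if __name__ == "__main__":
    # Print a 10-server pool; redirect the output into your config as needed.
    print(render_listen_block("192.168.0.101", 10))
```

Changing the count from 10 to 100 gives you the larger pre-configured pool without typing each line by hand.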

Furcula answered 23/7, 2011 at 10:57 Comment(4)
Just a tweak to the suggestion above, and something I have been battling with for the last couple of hours. I would avoid pointing directly to the internal IP: on Amazon you don't have a static internal IP, and although you can use static external IPs, there is a charge for leaving them unassigned, and you may run into problems with firewalls and traffic charges. My suggestion is to use a DynDNS service (like dyndns.org) or a DNS provider with an API (like Zerigo) and set up some hosts on the machine. If you create, say, 10 hosts and assign the internal Amazon IPs to them, it should work. Bourguiba
Since I ran out of space in the above comment, I will note: you may need to clean up your DNS every now and again. If you scale to, say, 6 instances and then back to just 4, the other 2 hostnames still have IPs, which could now belong to a new instance. Maybe run a script on startup and shutdown, and use the instance data in Amazon to name the machine. That's the way I am going to fix my problem. Bourguiba
Update: at my new job since I last posted this question, I'm facing the same issue but on a scale of hundreds of machines using RackSpace, which will require a way to have those machines ping a daemon on the HAProxy machine to get added to the configuration. Albanian
This is pretty clever. But just to reply to TiemanO's comment, you could easily spin up a VPC in Amazon (at no cost) so that you CAN have fixed IPs for your internal addresses on your own CIDR block. The only charges for VPC are for tunnels. Villenage
