A load balancer will have some limit on how many TCP ports it can use simultaneously, depending on the platform it's running on (e.g. I read somewhere that Linux can have at most 65535 TCP ports open simultaneously). This means the balancer becomes a bottleneck and won't be able to serve more than that many simultaneous requests, even if the back-end server farm is capable of serving many more requests together. Is there some way to overcome this problem?
TCP and UDP port numbers are 16-bit, so a given IP has only 65535 of them (port 0 is not valid, I believe). But a TCP connection is identified by the 4-tuple (source IP, source port, destination IP, destination port). (Wikipedia has links if you want to learn more.)
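To make that concrete, here's a quick Python sketch (nothing load-balancer-specific, just the plain socket API on a loopback listener) showing several connections to the *same* destination IP and port; each one is still a distinct connection because its source port differs:

```python
import socket

# Throwaway local "server": one listening socket on one port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))      # port 0: let the kernel pick one
listener.listen(16)
dest = listener.getsockname()        # (destination IP, destination port)

# Several connections to the *same* destination IP and port.
clients = [socket.create_connection(dest) for _ in range(3)]

for c in clients:
    src = c.getsockname()            # (source IP, source port)
    print("4-tuple:", src + dest)    # only the source port differs

for c in clients:
    c.close()
listener.close()
```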
For the client->balancer requests: as long as each inbound connection has a distinct (source IP, source port), there's no problem, and the client side normally ensures this. The only problems I recall hearing of on this side were with an extremely popular website serving many images per page, accessed through enormous ISPs that NAT their customers behind very few IPv4 addresses. That's probably not your situation.
The balancer->backend requests are more interesting, as you're probably creating a situation similar to the NAT problem I mentioned above. I think Linux normally tries to assign a distinct ephemeral port to each socket, and by default there are only 28,233 of those. And IIRC it doesn't use ones in the `TIME_WAIT` state either, so you can exhaust the range without actually having that many connections open simultaneously. IIRC if you hit this limit you'll get an `EADDRINUSE` failure on `connect` (or on `bind` if you explicitly bind the socket prior to `connect`).
I don't remember exactly how I've gotten around this before, much less the absolute best way, but here are a few things that may help:

- keeping persistent balancer->backend connections rather than creating a new one for each (probably short-lived) client->balancer connection.
- setting `SO_REUSEADDR` on the sockets prior to `bind`/`connect`.
- turning on the sysctl `net.ipv4.tcp_tw_reuse` and/or `net.ipv4.tcp_tw_recycle`.
- explicitly picking the source IP and/or port to use via `bind` rather than letting the kernel autoassign on `connect` (see the sketch after this list). You can't have two simultaneous connections with the same 4-tuple, but anything else is fine. (Exception: I'm punting on thinking through whether `TIME_WAIT` reuse for the same 4-tuple is okay; I'd have to refresh my memory about `TIME_WAIT` by reading through some TCP RFCs.)
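For the `SO_REUSEADDR` and explicit-`bind` ideas, a minimal Python sketch of what that looks like at the socket level; the function name and all addresses below are made-up placeholders, and a real balancer would do the equivalent in whatever language/framework it's built on:

```python
import socket

def backend_connection(source_ip, backend_addr, source_port=0):
    """Connect to a backend from an explicitly chosen source IP (and,
    optionally, source port) instead of letting the kernel autoassign."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow binding a local port that a previous connection left in
    # TIME_WAIT, as long as the resulting 4-tuple is still unique.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((source_ip, source_port))   # source_port=0: kernel picks the port
    s.connect(backend_addr)            # EADDRINUSE shows up here or on bind()
    return s

# Hypothetical usage: spread backend connections across two local addresses
# so each source IP gets its own ephemeral-port pool.
# conn_a = backend_connection("192.0.2.10", ("10.0.0.5", 8080))
# conn_b = backend_connection("192.0.2.11", ("10.0.0.5", 8080))
```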
You'll probably have to do a bit of experimentation. The good news is that once you understand the problem, it's pretty easy to reproduce it and test to see if you've fixed it.