HAProxy load balancing TCP traffic
Using HAProxy, I'm trying to load balance (over TCP) Rserve (a service that listens on a TCP socket and runs R scripts), which is running on port 6311 on two nodes.

Below is my config file. HAProxy starts without any issues, but when I connect to the balanced nodes I get the error below. Is anything wrong with the config?

Handshake failed: expected 32 bytes header, got -1

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    tcp
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    #option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000


listen haproxy_rserve
        bind *:81
        mode tcp
        option tcplog
        timeout client  10800s
        timeout server  10800s
        balance leastconn
        server rserve1 rserveHostName1:6311
        server rserve2 rserveHostName2:6311

listen stats proxyHostName:8080
    mode http
    stats enable
    stats realm Haproxy\ Statistics 
    stats uri /haproxy_stats
    stats hide-version
    stats auth admin:password

I also tried the frontend/backend style of balancing below, with the same result.

frontend haproxy_rserve
    bind *:81
    mode tcp
    option tcplog
    timeout client  10800s
    default_backend rserve

backend rserve
    mode tcp
    option tcplog
    balance leastconn
    timeout server  10800s  
    server rserve1 rserveHostName1:6311
    server rserve2 rserveHostName2:6311 
Infiltrate answered 18/8, 2016 at 10:49 Comment(0)

After struggling for a week to find a way to load balance R, the solution below worked. It uses a fully free/open source software stack.

If more people end up referring to this, I'll post a detailed blog covering everything from installation to configuration.

I was able to load balance R script requests coming to Rserve through HAProxy's TCP load balancing with the config below. It is much the same as the config in the question, but with the frontend and backend separated.

#Load balancer stats page access at hostname:8080/haproxy_stats
listen stats <load_balancer_hostname>:8080
    mode http
    log global
    stats enable
    stats realm Haproxy\ Statistics 
    stats uri /haproxy_stats
    stats hide-version
    stats auth admin:admin@rserve

frontend rserve_frontend
    bind *:81
    mode tcp
    option tcplog
    timeout client  1m
    default_backend rserve_backend

backend rserve_backend
    mode tcp
    option tcplog
    option log-health-checks
    option redispatch
    log global
    balance roundrobin
    timeout connect 10s
    timeout server 1m   
    server rserve1 <rserve hostname1>:6311 check
    server rserve2 <rserve hostname2>:6311 check
    server rserve3 <rserve hostname3>:6311 check

If SELinux is enabled, the command below allows HAProxy to make the remote connections to the Rserve nodes:

/usr/sbin/setsebool -P haproxy_connect_any 1

The firewall ports might need opening too:

firewall-cmd --permanent --zone=public --add-port=81/tcp
firewall-cmd --permanent --zone=public --add-port=8080/tcp
firewall-cmd --reload

Also, enable remote connections in Rserve by adding the line remote enable to the Rserve config file.
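To check the setup end to end, it can help to read the greeting that Rserve sends as soon as a client connects. The sketch below is not part of the original setup. It assumes the standard Rserve QAP1 greeting: a 32-byte ID string that begins with "Rsrv", followed by a 4-character version and a 4-character protocol name. The function names (read_exact, parse_header, probe) are hypothetical helpers. If the connection is closed before any bytes arrive, as the "got -1" in the question's error suggests, parse_header reports a short header instead.

```python
import socket

# Rserve (QAP1) sends a 32-byte ID string immediately after accepting a connection.
RSERVE_HEADER_LEN = 32


def read_exact(sock, n):
    """Read up to n bytes; return fewer only if the peer closes early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:  # empty read means the peer closed the connection
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def parse_header(header):
    """Split the ID string into (signature, version, protocol)."""
    if len(header) != RSERVE_HEADER_LEN:
        raise ValueError(
            f"expected {RSERVE_HEADER_LEN} bytes header, got {len(header)}"
        )
    signature = header[0:4].decode("ascii", "replace")
    version = header[4:8].decode("ascii", "replace")
    protocol = header[8:12].decode("ascii", "replace")
    if signature != "Rsrv":
        raise ValueError(f"not an Rserve header: {header!r}")
    return signature, version, protocol


def probe(host, port, timeout=5.0):
    """Connect (directly or through the load balancer) and parse the greeting."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return parse_header(read_exact(sock, RSERVE_HEADER_LEN))


# Example call (host is a placeholder; 81 is the frontend port used above):
# print(probe("<load_balancer_hostname>", 81))
```

Running probe against each Rserve node on port 6311 first, and then against the load balancer on port 81, should help narrow down whether a failure comes from Rserve itself (for example, remote connections not enabled) or from the path through HAProxy (SELinux, firewall, or a backend marked down).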

Infiltrate answered 20/8, 2016 at 8:19 Comment(6)
Hi, this is very interesting. Why did you not use nginx? Since most of us use nginx, it would be very cool to see a solution there. Do note that nginx supports TCP (using stream and/or proxy protocol). - Woody
@Woody No specific reason for not using nginx. I found that HAProxy is dedicated to reverse proxying and has been doing it well for decades, and I wouldn't need anything else that nginx offers (like a web server). Yes, we could use just the reverse proxy part of it, but I simply settled on HAProxy. - Infiltrate
@Woody Here is an example of TCP load balancing implemented with nginx: nginx.com/resources/admin-guide/tcp-load-balancing - Infiltrate
@Woody We currently use nginx as a TCP reverse proxy but are looking to switch to HAProxy, because nginx only has passive health checks unless you upgrade to nginx+. We are not very keen on paying 2.5k per year when HAProxy does it for free. - Villiers
@MichaelHobbs No experience with it yet apart from current research, but HAProxy does indeed look very promising in terms of its load balancing capabilities. The built-in stats page is very nice indeed. - Kenney
@Infiltrate /usr/sbin/setsebool is the utility for configuring SELinux. It means you are running on CentOS/RHEL and have SELinux enabled. You can Google what SELinux is and why most sysadmins disable it the first minute they set up the OS. In short, it's supposed to be a security layer that prevents applications from reading files or opening network sockets, among other things, and it breaks applications all the time with no indication of what's going on. - Subatomic
