How to track down "Connection timeout during SSL handshake" and "Connection closed during SSL handshake" errors
T

3

8

I have recently switched over to HAProxy from AWS ELB. I am terminating SSL at the load balancer (HAProxy 1.5dev19).

Since switching, I keep getting SSL connection errors in the HAProxy log (5-10% of the total number of requests). There are three types of errors repeating:

  • Connection closed during SSL handshake
  • Timeout during SSL handshake
  • SSL handshake failure (this one happens rarely)

I'm using a free StartSSL certificate, so my first thought was that some hosts are having trouble accepting this certificate, and that I didn't see these errors in the past because ELB offers no logging. The only problem with that theory is that some of the affected hosts do eventually connect successfully.

I can connect to the servers without any errors, so I'm not sure how to replicate these errors on my end.

Thordis answered 7/7, 2013 at 12:43 Comment(0)
K
8

This sounds like clients who are going away mid-handshake (TCP RST or timeout). This would be normal at some rate, but 5-10% sounds too high. It's possible it's a certificate issue; I'm not certain exactly how that presents to

Things that occur to me:

  • If negotiation is very slow, you'll have more clients drop off.
  • You may have underlying TCP problems which you weren't aware of until your new SSL endpoint proxy started reporting them.

Do you see individual hosts that sometimes succeed and sometimes fail? If so, this is unlikely to be a certificate issue. I'm not sure how connections get torn down when a user rejects an untrusted certificate.
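One rough way to check for such hosts is to tally, per client IP, how many log lines report a handshake problem and how many report anything else. The sketch below assumes handshake errors are logged as `ip:port [accept date] frontend/listener: message` and ordinary requests as `ip:port [accept date] frontend backend/server ...`. `tally_by_host` and `mixed_hosts` are hypothetical helper names, and the regex would need adjusting for a custom log format.

```python
import re
from collections import Counter, defaultdict

# Assumed line shape: "<client ip>:<port> [<accept date>] <rest of line>".
# Any syslog prefix before the client address is skipped by using search().
LOG_LINE = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}):\d+ \[[^\]]+\] (.*)")


def tally_by_host(lines):
    """Count 'failed' (handshake-related) and 'other' log lines per client IP."""
    per_host = defaultdict(Counter)
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # not a line in the assumed format
        ip, rest = match.groups()
        # Crude classification: any mention of "handshake" counts as a failure.
        kind = "failed" if "handshake" in rest.lower() else "other"
        per_host[ip][kind] += 1
    return per_host


def mixed_hosts(per_host):
    """Client IPs that have both handshake failures and other logged traffic."""
    return sorted(ip for ip, c in per_host.items() if c["failed"] and c["other"])
```

Feeding it the log, for example `mixed_hosts(tally_by_host(open("/var/log/haproxy.log")))` (the path is an assumption), lists hosts that fail only some of the time, which would point away from a certificate problem.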

You can use Wireshark on the HAProxy machine to capture SSL handshakes and parse them (you won't need to decrypt the sessions for handshake analysis, although you could since you have the server private key).
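To have something to capture on demand, you can imitate a client that goes away mid-handshake, as described at the top of this answer. The sketch below opens a TCP connection, sends only the first three bytes of a TLS record header (content type 0x16 for handshake, followed by a 0x03 0x01 version), and closes the socket. `abort_handshake` is a hypothetical helper, and I have not verified which of the log messages HAProxy emits for this exact sequence.

```python
import socket

# First three bytes of a TLS record header: handshake (0x16), version 0x03 0x01.
# The length field and the ClientHello itself are deliberately never sent.
PARTIAL_RECORD_HEADER = b"\x16\x03\x01"


def abort_handshake(host, port, timeout=5.0):
    """Connect, send a truncated TLS record header, then close the connection.

    Returns the number of bytes sent before closing.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(PARTIAL_RECORD_HEADER)
    # Leaving the with-block closes the socket, so the peer sees the client vanish.
    return len(PARTIAL_RECORD_HEADER)
```

Running something like `abort_handshake("lb.example.test", 443)` (the host name is a placeholder) while the capture is active should give you one aborted handshake to compare against the ones seen from real clients.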

Kilan answered 11/7, 2013 at 15:41 Comment(1)
Thanks, Tim, for a very thorough response. It was exactly your first hypothesis, so I'll post the details here in case anyone has a similar problem. We used this backend to serve a number of Android apps that were sending analytics just as they were being shut down. Sometimes (often on Android, less often on iOS) there was not enough time to complete the request, and the app would be killed during HTTPS negotiation or immediately after, resulting in a request marked BADREQ by HAProxy. Eventually I ended up using ssldump and analysing exactly what was going wrong. - Thordis
C
1

I had this happen as well. At first only "SSL handshake failure" appeared; then, after switching off option dontlognull, we also got "Timeout during SSL handshake" in the HAProxy logs.

At first, I made sure all the timeouts in the defaults section were correct:

timeout connect 30s
timeout client  30s
timeout server  60s

Unfortunately, the issue was in the frontend section.

There was a line with timeout client 60, which HAProxy reads as 60ms rather than 60s, because a timeout value with no unit suffix is taken as milliseconds.

It seems certain clients were slow to connect and were getting kicked out during the SSL handshake. Check your frontend for client timeouts.
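A quick way to find such lines is to scan the configuration for timeout values that carry no unit suffix. This is a minimal sketch, not a full config parser: `unitless_timeouts` is a hypothetical helper, it only recognises the common section keywords, and the section name `https-in` in the usage note is made up for illustration.

```python
import re

# Keywords that open a new section in the configuration file.
SECTIONS = {"global", "defaults", "frontend", "backend", "listen"}

# "timeout <name> <digits>" with nothing but an optional comment after the digits.
# A value such as "30s" does not match because of the trailing unit.
UNITLESS_TIMEOUT = re.compile(r"^\s*timeout\s+(\S+)\s+(\d+)\s*(?:#.*)?$")


def unitless_timeouts(config_text):
    """Return (line_number, section, timeout_name, value) for unit-less timeouts."""
    findings = []
    section = None
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.split()[0] in SECTIONS:
            section = stripped  # remember which section we are in
            continue
        match = UNITLESS_TIMEOUT.match(line)
        if match:
            findings.append((number, section, match.group(1), int(match.group(2))))
    return findings
```

For a file whose defaults use 30s and 60s but whose `frontend https-in` section contains `timeout client 60`, this reports only the frontend line, with the value 60 that HAProxy would treat as milliseconds.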

Creek answered 22/9, 2016 at 18:15 Comment(1)
Thanks, adnan. This was indeed the issue; I've documented it in my comment on Tim's answer. - Thordis
C
0

How is your HAProxy SSL frontend configured?

For example, I use the following to mitigate BEAST attacks:

bind X.X.X.X:443 ssl crt /etc/haproxy/ssl/XXXX.pem no-sslv3 ciphers RC4-SHA:AES128-SHA:AES256-SHA

But some clients seem to generate the same "SSL handshake failure" errors. I think it's because the configuration is too restrictive.

Calf answered 4/3, 2014 at 13:5 Comment(0)
