Websocket SSL handshake failure

I have a Spring Boot Tomcat server for secure websocket connections. Using an authority-signed certificate, the server accepts Android 4.4, iOS, Firefox, and Chrome clients without failure. Android 5.0, however, fails the SSL handshake.

Caused by: javax.net.ssl.SSLHandshakeException: Handshake failed
        at com.android.org.conscrypt.OpenSSLEngineImpl.unwrap(OpenSSLEngineImpl.java:436)
        at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:1006)
        at org.glassfish.grizzly.ssl.SSLConnectionContext.unwrap(SSLConnectionContext.java:172)
        at org.glassfish.grizzly.ssl.SSLUtils.handshakeUnwrap(SSLUtils.java:263)
        at org.glassfish.grizzly.ssl.SSLBaseFilter.doHandshakeStep(SSLBaseFilter.java:603)
        at org.glassfish.grizzly.ssl.SSLFilter.doHandshakeStep(SSLFilter.java:312)
        at org.glassfish.grizzly.ssl.SSLBaseFilter.doHandshakeStep(SSLBaseFilter.java:552)
        at org.glassfish.grizzly.ssl.SSLBaseFilter.handleRead(SSLBaseFilter.java:273)
        at org.glassfish.grizzly.filterchain.ExecutorResolver$9.execute(ExecutorResolver.java:119)
        at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeFilter(DefaultFilterChain.java:284)
        at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeChainPart(DefaultFilterChain.java:201)
        at org.glassfish.grizzly.filterchain.DefaultFilterChain.execute(DefaultFilterChain.java:133)
        at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:112)
        at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:77)
        at org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:561)
        at org.glassfish.grizzly.strategies.AbstractIOStrategy.fireIOEvent(AbstractIOStrategy.java:112)
        at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.run0(WorkerThreadIOStrategy.java:117)
        at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.access$100(WorkerThreadIOStrategy.java:56)
        at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy$WorkerThreadRunnable.run(WorkerThreadIOStrategy.java:137)
        at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:565)
        at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:545)
        at java.lang.Thread.run(Thread.java:818)
Caused by: javax.net.ssl.SSLProtocolException: SSL handshake terminated: ssl=0xa1f34200: Failure in SSL library, usually a protocol error
error:1408E0F4:SSL routines:SSL3_GET_MESSAGE:unexpected message (external/openssl/ssl/s3_both.c:498 0xac526e61:0x00000000)
        at com.android.org.conscrypt.NativeCrypto.SSL_do_handshake_bio(Native Method)
        at com.android.org.conscrypt.OpenSSLEngineImpl.unwrap(OpenSSLEngineImpl.java:423)

I think the problem is with TLS or the cipher suites, due to changes in Android 5.0 Lollipop, and not with the certificates, because the other clients connect. However, I cannot tell what is happening on the client side of the connection because SSL debugging does not appear to be supported on Android. The problem is likely very similar to this one, which is also unresolved but suggests that cipher suites are the cause. The Android bugs 88313, 81603, and developer-preview-1989 seem to indicate that the Android implementation is correct but that the server's configuration or implementation of cipher suites may not be.

I have set the following server cipher suites

server.ssl.ciphers = TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_DSS_WITH_AES_128_CBC_SHA

In particular, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA is on Android's list of supported cipher suites for API 11+.

I verified the server supports this

openssl s_client -connect server:port

which returns

SSL-Session:
Protocol  : TLSv1.2
Cipher    : ECDHE-RSA-AES128-SHA

There is a slight mismatch in names between openssl and java, but the openssl documentation says these are the same cipher suite.

With the openssl client, my server negotiates as its first choice a cipher suite that is compatible with Android 5.0. I therefore expect Android 5.0 to connect without issue, but it fails.
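
One way to check that expectation from the client side is to intersect the server's configured list with the suites the client's default SSLContext actually enables. The following is a minimal sketch that assumes only the standard javax.net.ssl API; the class name CipherSuiteOverlap and its overlap method are hypothetical. An overlap only shows that negotiation is possible, not that the handshake will succeed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.net.ssl.SSLContext;

// Hypothetical helper, not part of Tomcat, Tyrus, or Android.
public class CipherSuiteOverlap {

    /** Returns the server-configured suites (in server order) that the client also enables. */
    public static List<String> overlap(String serverCsv, String[] clientEnabled) {
        Set<String> enabled = new HashSet<String>(Arrays.asList(clientEnabled));
        List<String> common = new ArrayList<String>();
        for (String suite : serverCsv.split(",")) {
            String name = suite.trim();
            if (!name.isEmpty() && enabled.contains(name)) {
                common.add(name);
            }
        }
        return common;
    }

    public static void main(String[] args) throws Exception {
        // Same value as server.ssl.ciphers in the question.
        String serverCiphers = "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,"
                + "TLS_RSA_WITH_AES_128_CBC_SHA,"
                + "TLS_DHE_RSA_WITH_AES_128_CBC_SHA,"
                + "TLS_DHE_DSS_WITH_AES_128_CBC_SHA";
        // Suites enabled by default on whatever platform runs this code.
        String[] enabled = SSLContext.getDefault()
                .getDefaultSSLParameters()
                .getCipherSuites();
        System.out.println("Common suites: " + overlap(serverCiphers, enabled));
    }
}
```

Run on the device (for example from a background thread, logging instead of printing), this lists the suites both sides could agree on in principle.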

Has anyone successfully connected Android 5.0 secure websocket connections to Tomcat? Are there cipher suites that are known to work? Is there a way to debug the Android client side SSL implementation?


UPDATE

Network trace results:

SYN -->
<-- SYN, ACK
ACK -->
<-- Data
ACK -->
<-- certificates, SSL/TLS params? 1
<-- 2
<-- 3
<-- 4
ACK --> 
ACK --> 
ACK --> 
FIN(!), ACK --> 

When the Android 5.0 device (a Nexus 5) receives the server certificate information, sent in 4-5 packets, it responds with a variable number (2-4) of ACKs and then a FIN, ACK. In the successful trace, the client does not send a FIN. The Android 5 client does not like something it gets from the server.

For the failure, the server SSL debugging info says:

http-nio-8080-exec-10, called closeOutbound()
http-nio-8080-exec-10, closeOutboundInternal()
http-nio-8080-exec-10, SEND TLSv1.2 ALERT:  warning, description = close_notify
http-nio-8080-exec-10, WRITE: TLSv1.2 Alert, length = 2
[Raw write]: length = 7
0000: 15 03 03 00 02 01 00 
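
For reference, the seven raw bytes above can be decoded by hand as an unencrypted TLS alert record: one byte of content type, two bytes of version, two bytes of length, then the alert level and description. The sketch below does that decoding; the class TlsAlertRecord is hypothetical and it names only a few alert descriptions.

```java
// Hypothetical decoder for a plaintext TLS alert record (5-byte header + 2-byte alert).
public class TlsAlertRecord {

    public static String describe(byte[] record) {
        // Content type 0x15 (21) identifies an alert record.
        if (record.length < 7 || (record[0] & 0xff) != 0x15) {
            throw new IllegalArgumentException("not a plaintext TLS alert record");
        }
        int major = record[1] & 0xff;
        int minor = record[2] & 0xff; // 3.3 on the wire means TLS 1.2
        int length = ((record[3] & 0xff) << 8) | (record[4] & 0xff);
        int level = record[5] & 0xff;
        int description = record[6] & 0xff;

        String levelName = level == 1 ? "warning"
                : level == 2 ? "fatal"
                : "unknown(" + level + ")";
        String descriptionName = description == 0 ? "close_notify"
                : description == 10 ? "unexpected_message"
                : description == 40 ? "handshake_failure"
                : "alert(" + description + ")";

        return "version=" + major + "." + minor
                + " length=" + length
                + " level=" + levelName
                + " description=" + descriptionName;
    }

    public static void main(String[] args) {
        byte[] raw = {0x15, 0x03, 0x03, 0x00, 0x02, 0x01, 0x00};
        System.out.println(describe(raw));
    }
}
```

The decoded fields agree with the debug line above: a TLSv1.2 warning alert of length 2 carrying close_notify, which is the server closing its side rather than reporting a handshake error of its own.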

UPDATE 2

Here is a bare-bones Tyrus Android application to use

package edu.umd.mindlab.androidssldebug;

import android.support.v7.app.ActionBarActivity;
import android.os.Bundle;
import android.util.Log;
import android.view.Menu;
import android.view.MenuItem;
import android.widget.TextView;

import org.glassfish.tyrus.client.ClientManager;

import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.net.URI;

import javax.websocket.ClientEndpoint;
import javax.websocket.CloseReason;
import javax.websocket.OnClose;
import javax.websocket.OnError;
import javax.websocket.OnMessage;
import javax.websocket.OnOpen;
import javax.websocket.Session;

@ClientEndpoint
public class MainActivity extends ActionBarActivity {
    public static final String TAG = "edu.umd.mindlab.androidssldebug";
    final Object annotatedClientEndpoint = this;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
    }

    @Override
    protected void onStart(){
        super.onStart();
        final Object annotatedClientEndpoint = this;
        new Thread(new Runnable(){
            @Override
            public void run() {
                try {
                    URI connectionURI = new URI("wss://mind7.cs.umd.edu:8080/test");
                    ClientManager client = ClientManager.createClient();
                    Object clientEndpoint = annotatedClientEndpoint;
                    client.connectToServer(clientEndpoint, connectionURI);
                }
                catch(Exception e){
                    ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
                    PrintStream printStream = new PrintStream(byteStream);
                    e.printStackTrace(printStream);
                    final String message = byteStream.toString();
                    Log.e(TAG, message);
                    e.printStackTrace();
                    runOnUiThread(new Runnable() {
                        public void run() {
                            TextView outputTextView = (TextView) findViewById(R.id.outputTextView);
                            outputTextView.setText(message);
                        }
                    });
                }
            }
        }).start();

    }

    @Override
    public boolean onCreateOptionsMenu(Menu menu) {
        // Inflate the menu; this adds items to the action bar if it is present.
        getMenuInflater().inflate(R.menu.menu_main, menu);
        return true;
    }

    @Override
    public boolean onOptionsItemSelected(MenuItem item) {
        // Handle action bar item clicks here. The action bar will
        // automatically handle clicks on the Home/Up button, so long
        // as you specify a parent activity in AndroidManifest.xml.
        int id = item.getItemId();

        //noinspection SimplifiableIfStatement
        if (id == R.id.action_settings) {
            return true;
        }

        return super.onOptionsItemSelected(item);
    }

    @OnOpen
    public void onOpen(Session session) {
        Log.i(TAG, "opened");
        runOnUiThread(new Runnable() {
            public void run() {
                TextView outputTextView = (TextView) findViewById(R.id.outputTextView);
                outputTextView.setText("opened");
            }
        });

    }

    @OnMessage
    public void onMessage(String message, Session session) {
        Log.i(TAG, "message: " + message);
    }

    @OnClose
    public void onClose(Session session, CloseReason closeReason) {
        Log.i(TAG, "close: " + closeReason.toString() );
    }

    @OnError
    public void onError(Session session, Throwable t) {
        final String message = "error: " + t.toString();
        Log.e(TAG, message);
        runOnUiThread(new Runnable() {
            public void run() {
                TextView outputTextView = (TextView) findViewById(R.id.outputTextView);
                outputTextView.setText(message);
            }
        });
    }

}
Photojournalism answered 18/1, 2015 at 15:47 Comment(11)
Please check the server's log for error messages to narrow down the problem. If you don't find anything, make packet captures of a successful and an unsuccessful connection and compare. If you need help with interpreting the packet captures, please post them to cloudshark.org. - Suchlike
@Photojournalism - what version of OpenSSL is the server using? Try $ openssl version from a terminal. - Currycomb
OpenSSL 1.0.1e-fips 11 Feb 2013 built on: Thu Nov 6 12:33:36 UTC 2014 - Photojournalism
@Currycomb I have a second server with OpenSSL 1.0.1j 15 Oct 2014 that shows the same behavior. - Photojournalism
@Photojournalism - sorry to ask for this (I can't duplicate it with s_client)... I think we are going to need to see a PCAP. Can you collect one at the Tomcat server for the session with the Android 5.0 client? - Currycomb
Let us continue this discussion in chat. - Photojournalism
@Photojournalism - "I have a second server with OpenSSL 1.0.1j" - now that is interesting... - Currycomb
@Photojournalism - when you say "with a WebSocket", do you mean the first connection to the site is OK (when fetching the page), and then subsequent use of the connection (via WebSocket) dies? - Currycomb
@Photojournalism - I removed the Nginx stuff from the question. It was kind of distracting since it did not contribute to the problem. You might want to post your PCAPs somewhere so others can take a look at them. But I suspect this is an Android bug, and the way to solve the mystery is to get an s_client built that uses Android's patched OpenSSL sources. - Currycomb
@Currycomb There is no page load on the server with the SSL handshake failure. The clients are opening websocket connections directly to the server (browser clients load a page from a different server and then open the websocket). I will try to set up a server at a public location later this week that I can make accessible to anyone for an extended period of time for debugging this issue. - Photojournalism
Also see Disable all elliptic curves except secp256 for TLS? - Currycomb
J
0

The suggested fix at TYRUS-402 resolves this. I have opened a corresponding Grizzly Bug GRIZZLY-1827 which has the corresponding patch.

Update: The bug GRIZZLY-1827 has been fixed.

Jahdiel answered 24/3, 2016 at 22:13 Comment(1)
To increase the quality of this answer, please summarize the suggested fix that you link to. - Deleterious
C
3
error:1408E0F4:SSL routines:SSL3_GET_MESSAGE:unexpected message (external/openssl/ssl/s3_both.c:498 0xac526e61:0x00000000)
        at com.android.org.conscrypt.NativeCrypto.SSL_do_handshake_bio(Native Method)
        at com.android.org.conscrypt.OpenSSLEngineImpl.unwrap(OpenSSLEngineImpl.java:423)

0x1408E0F4 is:

$ openssl errstr 0x1408E0F4
error:1408E0F4:SSL routines:SSL3_GET_MESSAGE:unexpected message
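
The same breakdown can be done by hand. Assuming the OpenSSL 1.0.x packing of error codes (8 bits of library, 12 bits of function, 12 bits of reason), a small sketch with a hypothetical OpenSslErrorCode class splits the value into its three fields:

```java
// Hypothetical helper; mirrors the bit layout of OpenSSL 1.0.x packed error codes.
public class OpenSslErrorCode {

    /** Library number, the top 8 bits (20 is the SSL library). */
    public static int lib(long code) {
        return (int) ((code >> 24) & 0xff);
    }

    /** Function number, the middle 12 bits. */
    public static int func(long code) {
        return (int) ((code >> 12) & 0xfff);
    }

    /** Reason number, the low 12 bits. */
    public static int reason(long code) {
        return (int) (code & 0xfff);
    }

    public static void main(String[] args) {
        long code = 0x1408E0F4L;
        System.out.println("lib=" + lib(code)
                + " func=" + func(code)
                + " reason=" + reason(code));
    }
}
```

For 0x1408E0F4 this yields library 20, function 142 and reason 244, which correspond to "SSL routines", SSL3_GET_MESSAGE and "unexpected message" in the errstr output.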

It shows up in the OpenSSL sources at a couple of places:

$ cd openssl-1.0.1l
$ grep -R SSL3_GET_MESSAGE *
ssl/s3_both.c:          SSLerr(SSL_F_SSL3_GET_MESSAGE,SSL_R_UNEXPECTED_MESSAGE);
ssl/s3_both.c:          SSLerr(SSL_F_SSL3_GET_MESSAGE,SSL_R_UNEXPECTED_MESSAGE);
ssl/s3_both.c:          SSLerr(SSL_F_SSL3_GET_MESSAGE,SSL_R_EXCESSIVE_MESSAGE_SIZE);
ssl/s3_both.c:          SSLerr(SSL_F_SSL3_GET_MESSAGE,SSL_R_EXCESSIVE_MESSAGE_SIZE);
ssl/s3_both.c:          SSLerr(SSL_F_SSL3_GET_MESSAGE,ERR_R_BUF_LIB);

Here's the code I believe is causing the trouble (line numbers have changed, and the SSLerr is at 491):

/* Obtain handshake message of message type 'mt' (any if mt == -1),
 * maximum acceptable body length 'max'.
 * The first four bytes (msg_type and length) are read in state 'st1',
 * the body is read in state 'stn'.
 */
long ssl3_get_message(SSL *s, int st1, int stn, int mt, long max, int *ok)
    {
    ...

    /* s->init_num == 4 */
    if ((mt >= 0) && (*p != mt))
        {
        al=SSL_AD_UNEXPECTED_MESSAGE;
        SSLerr(SSL_F_SSL3_GET_MESSAGE,SSL_R_UNEXPECTED_MESSAGE);
        goto f_err;
        }
    ...

But I'm not sure what causes that particular problem. See this question on the OpenSSL User List at SSL_F_SSL3_GET_MESSAGE and SSL_R_UNEXPECTED_MESSAGE.

EDIT: according to the Android source for s3_both.c, that is the code that's triggering the issue.

-----

OK, looking at the files successful.pcap and unsuccessful.pcap, the good client is using TLS 1.0 while the misbehaving client is using TLS 1.2. But I don't see anything offensive that would cause the client to close the connection while processing the four messages (Server Hello, Certificate, Server Key Exchange, Server Hello Done) in the Record.

-----

Based on the ServerKeyExchange message:

[Screenshot of the ServerKeyExchange message]

The server selected the client's offering of secp521r1. You might want to use secp256r1 instead; that is the most interoperable curve right now. Also see Is the limited elliptic curve support in rhel/centos/redhat openssl robust enough?.
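
If you want to read the selected curve directly from a capture, the ECDHE ServerKeyExchange body starts with a one-byte curve type (3 means named curve) followed by a two-byte curve identifier. A rough sketch, assuming you have already stripped the record header and the four-byte handshake header; the NamedCurve class is hypothetical and knows only the three common NIST curves:

```java
// Hypothetical reader for the first three bytes of an ECDHE ServerKeyExchange body.
public class NamedCurve {

    public static String fromServerKeyExchangeBody(byte[] body) {
        // curve_type 3 = named_curve; other types are not handled in this sketch.
        if (body.length < 3 || (body[0] & 0xff) != 3) {
            throw new IllegalArgumentException("not a named_curve ServerKeyExchange body");
        }
        int id = ((body[1] & 0xff) << 8) | (body[2] & 0xff);
        switch (id) {
            case 23: return "secp256r1";
            case 24: return "secp384r1";
            case 25: return "secp521r1";
            default: return "curve(" + id + ")";
        }
    }

    public static void main(String[] args) {
        // Only the first three bytes matter here; the public key that follows is omitted.
        byte[] body = {0x03, 0x00, 0x19};
        System.out.println(fromServerKeyExchangeBody(body));
    }
}
```

An identifier of 0x0019 (25) matches the secp521r1 selection seen in the capture, while 0x0017 (23) would indicate secp256r1.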

-----

OpenSSL 1.0.1e FIPS used by the server has suffered a few problems. See, for example:

If possible, you might want to upgrade it to something newer.

-----

Is there a way to debug the Android client side SSL implementation?

I think this is an easier question. Use a custom SSLSocketFactory like SSLSocketFactoryEx. It will allow you to try different protocols, cipher suites and settings. But it's trial-and-error.
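
As a rough illustration of the idea (this is not SSLSocketFactoryEx itself), a wrapper can delegate socket creation to an existing factory and then pin the protocols and cipher suites on each socket. The class name RestrictedSSLSocketFactory is hypothetical, and the sketch only helps with clients that accept an SSLSocketFactory; it does not address how the Tyrus client in the question would be configured.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical wrapper: forces a fixed protocol and cipher suite list on every socket.
public class RestrictedSSLSocketFactory extends SSLSocketFactory {
    private final SSLSocketFactory delegate;
    private final String[] protocols;
    private final String[] cipherSuites;

    public RestrictedSSLSocketFactory(SSLSocketFactory delegate,
                                      String[] protocols, String[] cipherSuites) {
        this.delegate = delegate;
        this.protocols = protocols.clone();
        this.cipherSuites = cipherSuites.clone();
    }

    // Applies the restrictions; throws IllegalArgumentException for unsupported names.
    private Socket restrict(Socket socket) {
        if (socket instanceof SSLSocket) {
            SSLSocket ssl = (SSLSocket) socket;
            ssl.setEnabledProtocols(protocols);
            ssl.setEnabledCipherSuites(cipherSuites);
        }
        return socket;
    }

    @Override public String[] getDefaultCipherSuites() { return cipherSuites.clone(); }

    @Override public String[] getSupportedCipherSuites() {
        return delegate.getSupportedCipherSuites();
    }

    @Override public Socket createSocket() throws IOException {
        return restrict(delegate.createSocket());
    }

    @Override public Socket createSocket(Socket s, String host, int port, boolean autoClose)
            throws IOException {
        return restrict(delegate.createSocket(s, host, port, autoClose));
    }

    @Override public Socket createSocket(String host, int port) throws IOException {
        return restrict(delegate.createSocket(host, port));
    }

    @Override public Socket createSocket(String host, int port,
                                         InetAddress localHost, int localPort)
            throws IOException {
        return restrict(delegate.createSocket(host, port, localHost, localPort));
    }

    @Override public Socket createSocket(InetAddress host, int port) throws IOException {
        return restrict(delegate.createSocket(host, port));
    }

    @Override public Socket createSocket(InetAddress address, int port,
                                         InetAddress localAddress, int localPort)
            throws IOException {
        return restrict(delegate.createSocket(address, port, localAddress, localPort));
    }
}
```

A trial run might wrap (SSLSocketFactory) SSLSocketFactory.getDefault() with new String[] {"TLSv1.2"} and a single suite such as TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, then vary one setting at a time until the handshake outcome changes.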

Otherwise, you would need to grab a copy of the OpenSSL source code used by Android 5.0 (including patches). I don't know how to get that and ensure it builds like mainline OpenSSL (effectively, you need to build s_client using Android sources with debugging information).

This might be helpful: OpenSSL on Android. From the looks of the diffs, it appears Android is using OpenSSL 1.0.0. (Some of the patches in the patch/ directory specifically call out 1.0.0b).

Currycomb answered 18/1, 2015 at 20:16 Comment(2)
This particular trace is for OpenSSL 1.0.1j. 192.168.0.109 is the server and 192.168.0.100 is the client. Isn't it the client closing the connection in frame 14 when it sends the first FIN? I set the TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA cipher in the server configuration because it is in the intersection of ciphers that are supported by Android 4.4 and Android 5. My first goal is to get any cipher suite negotiated successfully between Android 5 and the server. I would prefer a SHA-256 or SHA-384 based suite, but I have not found any server configuration that successfully negotiates any cipher. - Photojournalism
@Photojournalism - "Isn't it the client closing the connection in frame 14 ..." - yes, my bad. I misread which side sent the close. - Currycomb
P
1

This is confirmed to be caused by an Android 5.0 bug. It is unclear to me currently whether there is also a problem in Tyrus websocket or Grizzly.

See also: 93740 and preview 328.

Photojournalism answered 23/1, 2015 at 4:53 Comment(0)
