Seeking advice on Jetty HttpClient Hang
I have a small application that simply polls a server using the Jetty v9.2 HttpClient. After some days, the application freezes up. Initially we identified that the thread pool needed to be increased in size to relieve a performance hit; that change restored performance over a period of days, but the lock-up remains. The cause has been isolated to the HTTP GET calls (the problem goes away when we comment out the method).

The root cause appears to be in the underlying Jetty HttpClient connection management or thread management. Normally the Jetty HttpClient creates a set of threads to handle the HTTP GET (see below); these run up and vanish as you'd expect. After around 40 hours of operation, JDK VisualVM shows at least 9 connection threads that do not go away immediately:

  • HttpClient - scheduler x 1
  • HttpClient - selector client SelectorManager x 4
  • HttpClient x 4

also

  • RMI TCP connection

Nine or 10 threads in total. On the next read, new thread instances are created to carry the load and the client proceeds. Furthermore, the app has a clock with a dedicated thread which continues running after the application locks up, indicating the JVM, operating system and the machine itself are fine.

Sometimes we see these 'stuck' threads linger for up to an hour before they drop out of the VisualVM thread display. After at least 36 hours the threads remain, and we have not seen them go away.

After enough days the software locks up. The indicated explanation is the leaking of thread instances that have not been cleaned up: it appears the app runs out of threads and can't do more work. It certainly stops issuing HTTP GETs, as witnessed by server logs.

The main HTTP call uses the code below, HttpClient GET method:

 /**
  *   GET
  *   @return null or string returned from server
  **/
 public static String get( final String command ){

    String          rslt        = null;
    final String    reqStr      = "http://www.google.com";  //  (any url)

    HttpClient      httpClient  = new HttpClient();
    Request         request;
    ContentResponse response;

    try {
            //-- Start HttpClient
        httpClient.start();

        request   = httpClient.newRequest( reqStr );

        response  = request.send();

        if( null == response ){
            LOG.error( "NULL returned from previous HTTP request.");
        }
        else {
            if( (501 == response.getStatus()) || (502 == response.getStatus()) ){
                setNetworkUnavailable(String.format("HTTP Server error: %d", response.getStatus() ));
            }
            else {
                if(  404 == response.getStatus() ){
                    Util.puts(LOG,"HTTP Server error: 404");
    //              ignore message since we are talking to an old server
                }
                else if( 200 == response.getStatus() ){
                    rslt = response.getContentAsString();
                }
                else {
                    LOG.error(String.format( "    * Response status: \"%03d\".", response.getStatus() ));
                }
                setNetworkAvailable();
            }
        }
    }
    catch ( InterruptedException iEx ){
        LOG.warn( "InterruptException processing: "+reqStr, iEx );
    }
    catch ( Exception ex ){

        Throwable cause = ex.getCause();
        if( (cause instanceof NoRouteToHostException) ||
            (cause instanceof EOFException)           ||
            ((cause instanceof SocketException)
                && cause.getMessage().startsWith( EX_NETWORK_UNREACHABLE )) ){

            setNetworkUnavailable( cause.getMessage() );
        }
        else {
            LOG.error( "Exception on: "+command, ex );
        }
    }
    finally {
        try {
            httpClient.stop();
        }
        catch ( Exception ex ){
            LOG.error( "Exception httpClient.stop(), ServerManager::get()", ex );
        }
    }

    return rslt;

}//get method

This is based on simple examples; there is scant detail on the use of HttpClient. Is everything done according to Hoyle?

At different execution runs we also see the following Exceptions and log messages:

  • [36822522] WARN 2014-Sep-02 02:46:28.464> HttpClient@2116772232{STOPPING,8<=0<=200,i=0,q=0} Couldn't stop Thread[HttpClient@2116772232-729770,5,]

We wonder whether this message relates to one of the stuck threads, or whether it indicates a separate and different problem we need to examine. Also:

  • java.util.concurrent.TimeoutException (ExecutionException)

This appears to be a thread timeout exception. Which thread, though? Does this relate to the HTTP connection threads? At a minimum, when services catch errors internally they should at least indicate the location of the error and a stack trace.

There are some obvious questions:

  1. Is the get() method written as the Jetty HttpClient requires, so that it does not leak or leave resources hanging?
  2. How can we catch the "Couldn't stop Thread" warning?
    • What is the impact of this error? Is there a way to 'smash' a thread stuck like that?
    • Does this relate to the 10 hanging connection threads in any way? There's only one warning message.
    • One imagines a hanging thread warrants an ERROR label, not a warning.
  3. Is there a process to catch thread errors and errors in general in the Jetty HttpClient?
  4. What attributes are available for the HttpClient to tune the service?
    • Are there settings we can use to directly influence the thread-locking?
  5. What attributes are available in the HttpClient's environment or context to control or tune the service?
  6. Can the Jetty HttpClient be restarted / rebooted or just stopped?
    • Jetty calls are only made in the GET method shown (albeit with more logging, etc.)
  7. Does the RMI thread factor as part of the Jetty HttpClient calls?

One other observation is that when we see 'stuck' threads in VisualVM, the Threads panel shows excess daemon threads, not an increase in non-daemon threads.

Running the code shown above in a for loop for about 3 or 4 hours, with a 250 millisecond break between HttpClient send() calls, shows a thread leak; it is simple to reproduce on Linux. The log output shows no WARNings, and only two timeout errors on the network at least 30 minutes away from the thread leak.
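For reference, a minimal sketch of such a reproduction loop (the class name ServerManager is taken from the log message in the code above; the command string is a placeholder):

 public class LeakRepro {

     public static void main( String[] args ) throws InterruptedException {

         //-- Hypothetical loop: call the get() method shown above every 250 ms
         //-- and watch the HttpClient thread count in VisualVM
         while( true ){
             ServerManager.get( "status" );   // "status" is a placeholder command
             Thread.sleep( 250 );
         }
     }
 }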

Suggestions, observations, improvements and answers are most welcome. Our thanks in advance.

Tetragon answered 2/9, 2014 at 4:3

This situation seems to be resolved by ensuring two things.

  1. Ensuring there are enough threads in the application's thread pool
  2. Making sure the code using Jetty cleans up and catches/manages all exceptions.

The two actions are inter-related. If the code using the HttpClient misses an exception or error, the thread hangs around. It seems the only way to avoid this is to ensure every HttpClient used calls HttpClient.stop(). This needs to go in a finally {...} clause.

Secondly, async calls must wait for the CompleteListener before calling HttpClient.stop(). That seems to be the only way to ensure the stop was done cleanly. In some cases, stop() calls appear to proceed OK; eventually some will cause Exceptions and your application slowly leaks resources. The appearance is like the JVM has frozen, but some non-daemon tasks may continue (e.g. a GUI thread) and you may not notice the problem until the PC itself runs out of resources or crashes. That is an extreme case, running over several weeks.

A reliable example of appropriately closing the HttpClient after an async call is sketched below.
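This is a minimal sketch rather than production code; the URL handling, the 30 second wait and the use of BufferingResponseListener are assumptions, but it shows the pattern of waiting for the CompleteListener before calling stop():

 import java.util.concurrent.CountDownLatch;
 import java.util.concurrent.TimeUnit;

 import org.eclipse.jetty.client.HttpClient;
 import org.eclipse.jetty.client.api.Result;
 import org.eclipse.jetty.client.util.BufferingResponseListener;

 public class AsyncGet {

     public static String get( final String url ) throws Exception {

         final HttpClient     httpClient = new HttpClient();
         final CountDownLatch done       = new CountDownLatch( 1 );
         final String[]       body       = new String[ 1 ];

         try {
             httpClient.start();

             //-- Send asynchronously; onComplete() fires on success AND failure
             httpClient.newRequest( url ).send( new BufferingResponseListener() {
                 @Override
                 public void onComplete( final Result result ){
                     if( result.isSucceeded() && (200 == result.getResponse().getStatus()) ){
                         body[ 0 ] = getContentAsString();
                     }
                     done.countDown();                    // always release the waiter
                 }
             });

             //-- Wait for the CompleteListener before stopping the client
             done.await( 30, TimeUnit.SECONDS );          // 30 s is an arbitrary bound
         }
         finally {
             httpClient.stop();                           // always stop, even on exceptions
         }

         return body[ 0 ];
     }
 }

Note that onComplete() runs on one of the HttpClient's own threads, so it is safer not to call stop() from inside the listener; waiting on the latch and stopping from the calling thread avoids that.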

The number of threads will depend on your application. I suggest using jVisualVM or something similar to ensure your Jetty threads are all cleaning up properly first, before tuning the number of threads in your thread pool.
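For example (a sketch only; the pool size of 32 and the pool name are arbitrary), the executor can be supplied explicitly before start():

 import org.eclipse.jetty.client.HttpClient;
 import org.eclipse.jetty.util.thread.QueuedThreadPool;

 public class TunedClient {

     //-- Sketch: give HttpClient a named, explicitly sized thread pool so its
     //-- threads are easy to spot in VisualVM; 32 is an arbitrary example size
     public static HttpClient newClient() throws Exception {

         QueuedThreadPool pool = new QueuedThreadPool( 32 );
         pool.setName( "http-client-pool" );

         HttpClient httpClient = new HttpClient();
         httpClient.setExecutor( pool );
         httpClient.start();

         return httpClient;
     }
 }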

I feel that the documentation needs to stress the clean-up and ensuring stop() is called. How to conclude an async call is undocumented as far as I can tell. As long as your Jetty calls stop cleanly, providing sufficient threads appears to resolve this, with the usual caveats about managing concurrency.

Tetragon answered 29/10, 2014 at 23:11
