I have a small application that simply polls a server using the Jetty v9.2 HttpClient. After some days the application freezes up. Initially we identified that the thread pool needed to be increased in size to relieve a performance hit, and that change restored performance over a period of days, but the lock-up remains. The cause has been isolated to the HTTP GET calls (the problem goes away when we comment out the method).
The root cause appears to lie in the Jetty HttpClient's connection or thread management. Normally the Jetty HttpClient creates a set of threads to handle the HTTP GET (see below); these spin up and vanish as you'd expect. After around 40 hours of operation, JDK VisualVM shows at least 9 connection threads that do not go away immediately:
- HttpClient - scheduler x 1
- HttpClient - selector client SelectorManager x 4
- HttpClient x 4
also
- RMI TCP connection
Nine or ten threads in total. On the next read, new thread instances are created to carry the load and the client proceeds. Furthermore, the app has a clock with a dedicated thread that continues running after the application locks up, which indicates the JVM, the operating system and the machine itself are fine.
Sometimes we see these 'stuck' threads linger for up to an hour before they drop out of the VisualVM thread display. After at least 36 hours, some threads remain and we have never seen them go away.
After enough days the software locks up. The indicated explanation is a leak of thread instances that are never cleaned up; it appears the app runs out of threads and can't do more work. It certainly stops issuing HTTP GETs, as witnessed by the server logs.
The main HTTP call uses the code below, an HttpClient GET method:
/**
 * GET
 * @return null or string returned from server
 **/
public static String get( final String command ){
    String rslt = null;
    final String reqStr = "http://www.google.com"; // (any url)
    HttpClient httpClient = new HttpClient();
    Request request;
    ContentResponse response;
    try {
        //-- Start HttpClient
        httpClient.start();
        request  = httpClient.newRequest( reqStr );
        response = request.send();
        if( null == response ){
            LOG.error( "NULL returned from previous HTTP request." );
        }
        else {
            if( (501 == response.getStatus()) || (502 == response.getStatus()) ){
                setNetworkUnavailable( String.format( "HTTP Server error: %d", response.getStatus() ));
            }
            else {
                if( 404 == response.getStatus() ){
                    Util.puts( LOG, "HTTP Server error: 404" );
                    // ignore message since we are talking to an old server
                }
                else if( 200 == response.getStatus() ){
                    rslt = response.getContentAsString();
                }
                else {
                    LOG.error( String.format( " * Response status: \"%03d\".", response.getStatus() ));
                }
                setNetworkAvailable();
            }
        }
    }
    catch ( InterruptedException iEx ){
        LOG.warn( "InterruptedException processing: " + reqStr, iEx );
    }
    catch ( Exception ex ){
        final Throwable cause = ex.getCause();
        if( (cause instanceof NoRouteToHostException) ||
            (cause instanceof EOFException) ||
            ((cause instanceof SocketException)
              && cause.getMessage().startsWith( EX_NETWORK_UNREACHABLE )) ){
            setNetworkUnavailable( cause.getMessage() );
        }
        else {
            LOG.error( "Exception on: " + command, ex );
        }
    }
    finally {
        try {
            // stop the client on every call, since it was started on every call
            httpClient.stop();
        }
        catch ( Exception ex ){
            LOG.error( "Exception httpClient.stop(), ServerManager::get()", ex );
        }
    }
    return rslt;
}//get method
This is based on simple examples; there is scant detail on the use of the HttpClient. Is everything done according to Hoyle?
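For reference, one alternative pattern we are considering is to keep a single long-lived HttpClient and reuse it for every call instead of starting and stopping one per request. The sketch below shows what we have in mind; the class name and the init()/shutdown() methods are our own invention, and the 10-second timeout is an arbitrary value, not something from our current code:

import java.util.concurrent.TimeUnit;

import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.client.api.ContentResponse;

/** Sketch only: one shared HttpClient, started once and reused for every GET. */
public final class SharedHttpClientSketch {

    private static final HttpClient CLIENT = new HttpClient();

    /** Start once at application startup. */
    public static void init() throws Exception {
        CLIENT.start();
    }

    /** Issue one GET; returns the body on 200, otherwise null. */
    public static String get( final String reqStr ) throws Exception {
        ContentResponse response = CLIENT.newRequest( reqStr )
                                         .timeout( 10, TimeUnit.SECONDS ) // fail fast rather than hang
                                         .send();
        return (200 == response.getStatus()) ? response.getContentAsString() : null;
    }

    /** Stop once at application shutdown. */
    public static void shutdown() throws Exception {
        CLIENT.stop();
    }
}

Would moving start()/stop() out of the per-call path like this avoid the thread build-up, or is the per-call start/stop pattern in our get() method also supposed to be safe?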
On different runs we also see the following exceptions and log messages:
- [36822522] WARN 2014-Sep-02 02:46:28.464> HttpClient@2116772232{STOPPING,8<=0<=200,i=0,q=0} Couldn't stop Thread[HttpClient@2116772232-729770,5,]
We wonder whether this message relates to one of the stuck threads, or whether it indicates a separate problem we need to examine. Also:
- java.util.concurrent.TimeoutException (ExecutionException)
This appears to be a thread timeout exception, but which thread? Does it relate to the HTTP connection threads? At a minimum, when services catch errors internally they should at least indicate the location of the error and provide a stack trace.
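To try to pin down where the TimeoutException comes from, we are experimenting with an explicit per-request timeout so that the blocking send() fails at a known point. A sketch, using the httpClient, reqStr and LOG names from the get() method above; the 5-second value is arbitrary:

// Sketch: explicit per-request timeout so send() fails at a known point.
Request timedRequest = httpClient.newRequest( reqStr )
                                 .timeout( 5, TimeUnit.SECONDS );  // arbitrary value for the experiment
try {
    ContentResponse response = timedRequest.send();
    LOG.info( "Status: " + response.getStatus() );
}
catch ( TimeoutException tEx ){
    // the total timeout elapsed before the exchange completed
    LOG.warn( "Request to " + reqStr + " timed out", tEx );
}
catch ( InterruptedException | ExecutionException ex ){
    LOG.error( "Request to " + reqStr + " failed", ex );
}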
There are some obvious questions:
- Is the get() method code written as required, so that it does not leak or leave resources hanging in the Jetty HttpClient code?
- How can we catch the "Couldn't stop Thread" warning?
- What is the impact of this error? Is there a way to 'smash' a thread stuck like that?
- Does this relate to the 10 hanging connection threads in any way? There's only one warning message.
- One would imagine a hanging thread warrants an ERROR label, not a warning.
- Is there a process to catch thread errors, and errors in general, in the Jetty HttpClient?
- What attributes are available on the HttpClient to tune the service? (Our current understanding is sketched after this list.)
- Are there settings we can use to directly influence the thread locking?
- What attributes are available in the HttpClient's environment or context to control or tune the service?
- Can the Jetty HttpClient be restarted / rebooted, or only stopped?
- Jetty calls are only made in the GET method shown (albeit with more logging, etc.).
- Does the RMI thread factor into the Jetty HttpClient calls?
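On the tuning questions above, this is our current understanding of the knobs HttpClient exposes in 9.2, written as a sketch; the specific values are placeholders for illustration, not recommendations:

import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.util.thread.QueuedThreadPool;

// Sketch of the tuning attributes we believe HttpClient exposes; values are placeholders.
public final class TunedClientSketch {
    public static HttpClient build() throws Exception {
        HttpClient client = new HttpClient();

        QueuedThreadPool pool = new QueuedThreadPool( 16 );  // bound the worker threads
        pool.setName( "http-client" );                       // makes leaked threads easy to spot in VisualVM
        client.setExecutor( pool );

        client.setMaxConnectionsPerDestination( 8 );         // cap pooled connections per host
        client.setIdleTimeout( 30_000 );                     // ms before an idle connection is closed
        client.setConnectTimeout( 5_000 );                   // ms allowed to establish a TCP connection

        client.start();
        return client;
    }
}

Are these the right attributes to be looking at, and would any of them influence the stuck threads we see?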
One other observation: when we see the 'stuck' threads in VisualVM, the Threads panel shows an excess of daemon threads, not an increase in non-daemon threads.
Running the code shown above in a loop for about 3 or 4 hours, with a 250-millisecond pause between HttpClient send() calls, shows the thread leak; it is simple to reproduce on Linux. The log output shows no WARNings and only two network timeout errors, at least 30 minutes away from the thread leak.
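For completeness, the reproduction loop is essentially the following (simplified; the "status" command string is an arbitrary placeholder):

// Simplified reproduction loop: call the get() method above every 250 ms.
public static void main( String[] args ) throws InterruptedException {
    for(;;){
        get( "status" );       // arbitrary placeholder command
        Thread.sleep( 250 );   // 250 ms between calls; the leak appears after 3-4 hours
    }
}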
Suggestions, observations, improvements and answers are most welcome. Our thanks in advance.
Related questions:
These questions cover some very similar points: