Java process hanging on IOUtils. Suspected deadlock

I have a Java process that hangs in a call to IOUtils.toString in the following code:

String html = "";
try {
    html = IOUtils.toString(someUrl.openStream(), "utf-8"); // process hangs on this line
} catch (Exception e) {
    return null;
}

I can't reproduce this reliably. The code is part of a web crawler, so this line executes successfully thousands of times, but after a few days the process eventually hangs here.

Output from jstack:

2013-09-25 09:09:36
Full thread dump OpenJDK 64-Bit Server VM (20.0-b12 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00007f2b1c001000 nid=0x225a waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Thread-0" prio=10 tid=0x00007f2b34122000 nid=0x187b runnable [0x00007f2b30970000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:146)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
        - locked <0x00000000e3d2d160> (a java.io.BufferedInputStream)
        at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:552)
        at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609)
        at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696)
        - locked <0x00000000e3d30558> (a sun.net.www.http.ChunkedInputStream)
        at java.io.FilterInputStream.read(FilterInputStream.java:133)
        at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2582)
        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282)
        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176)
        - locked <0x00000000e3d317d0> (a java.io.InputStreamReader)
        at java.io.InputStreamReader.read(InputStreamReader.java:184)
        at java.io.Reader.read(Reader.java:140)
        at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1364)
        at org.apache.commons.io.IOUtils.copy(IOUtils.java:1340)
        at org.apache.commons.io.IOUtils.copy(IOUtils.java:1315)
        at org.apache.commons.io.IOUtils.toString(IOUtils.java:525)

I can't see any way to set a timeout on the toString method. Any suggestions? Is this a bug in Apache commons? Or in my OpenJDK perhaps?

Brindle answered 25/9, 2013 at 9:0 Comment(2)
Maybe 'someUrl' is shared between threads? - Virile
IOUtils is open source. Attach a debugger, pause the VM and see where it is stuck. - Tellez
B
2

I've decided to try simply using guava IO instead since it was already in my classpath anyway:

String html = "";
try {
    InputSupplier<? extends InputStream> supplier = Resources
            .newInputStreamSupplier(metaUrl);
    html = CharStreams.toString(CharStreams.newReaderSupplier(supplier,
            Charsets.UTF_8));
} catch (Exception e) {
    return null;
}

It generally takes a few days to hang, so if I don't update this answer in a few days, assume this worked!

Update : 7 days so far without hanging... :)

Brindle answered 25/9, 2013 at 10:39 Comment(1)
Thanks for posting the solution! I ran into this in 2017, so I wonder why it is still not fixed. - Flanch
R
4

Your call to toString() is ultimately forwarded to copyLarge(). Here you can see that it keeps reading from the stream until InputStream.read() signals end of file (EOF). According to this post, read() can return 0 bytes, so if the URLConnection you're reading from never signals EOF, the method probably keeps reading 0 bytes forever.
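To make the read-until-EOF pattern concrete, here is a minimal sketch of such a copy loop. It is not the actual Commons IO source; the class and method names (CopyLoopSketch, readAll) are made up for illustration. The point is that the loop only ends when read() returns -1, so a stream that never signals EOF keeps the caller inside the loop (or blocked inside read()).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical illustration of a read-until-EOF loop; not the Commons IO implementation.
class CopyLoopSketch {

    // Reads characters until read() returns -1 (EOF) and returns everything read.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[4096];
        int n;
        // The only exit condition is EOF; there is no time limit in this loop.
        while ((n = reader.read(buffer)) != -1) {
            sb.append(buffer, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAll(new StringReader("hello")));
    }
}
```

With an in-memory reader the loop ends immediately; with a network stream it ends only when the remote side finishes the response or the connection fails.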

Maybe you can track down which URL causes the problem?

Anyway, to implement a timeout you could run each read in a separate thread and kill that thread after a certain time has elapsed.
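One way to sketch the separate-thread idea is with an ExecutorService and Future.get with a timeout. The helper name callWithTimeout and the class TimeLimited are assumptions for illustration only. Note one caveat: Java cannot safely kill a thread, and interrupting a thread that is blocked in a classic java.net socket read does not unblock it, so this only frees the caller; the worker may linger until the socket itself closes or times out.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: bounds how long the caller waits for a task.
class TimeLimited {

    // Runs the task on a daemon worker thread and waits at most `timeout` for its result.
    // Throws TimeoutException if the task has not finished in time.
    static <T> T callWithTimeout(Callable<T> task, long timeout, TimeUnit unit) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "time-limited-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Best effort: interrupts the worker, which may not stop a blocked socket read.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(callWithTimeout(() -> "done", 1, TimeUnit.SECONDS));
    }
}
```

In the crawler, the task would be the call that downloads the page, and a TimeoutException would be treated like any other failed fetch.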

Rizas answered 25/9, 2013 at 9:17 Comment(1)
Ahh, thanks. I'm not that fussed about which URL caused it, but it may help me design a reproducible test. I'm trying Guava as an alternative because it's simple, but if that fails, then I think running it in a separate thread may be my only option. - Brindle
F
1

I had the same problem. Using Guava may solve it, but in my opinion the root cause is that the socket has no soTimeout configured.

Try

socket.setSoTimeout(10000)

so that a SocketTimeoutException is thrown when a read blocks for more than 10 seconds without receiving any data.
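When the stream comes from a URL rather than from a Socket you hold directly, the analogous settings are URLConnection.setConnectTimeout and URLConnection.setReadTimeout. The following is a minimal sketch using only the JDK; the class name TimedFetch, the method fetch, and the timeout values are assumptions for illustration. The read timeout applies to each blocking read, not to the download as a whole, so a server that trickles data slowly can still take a long time.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: fetches a URL as UTF-8 text with connect and read timeouts.
class TimedFetch {

    // Throws java.net.SocketTimeoutException if connecting or any single read stalls too long.
    static String fetch(URL url, int connectTimeoutMs, int readTimeoutMs) throws IOException {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(connectTimeoutMs); // limit on establishing the connection
        conn.setReadTimeout(readTimeoutMs);       // limit on each blocking read
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Example values only: 10 s to connect, 10 s per read.
        System.out.println(fetch(new URL(args[0]), 10000, 10000));
    }
}
```

Because the stream is obtained from a connection that already has the timeouts set, the same InputStream could also be passed to IOUtils.toString instead of the manual loop.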

Flanch answered 25/3, 2017 at 14:2 Comment(0)
L
0

Plain Java approach:

InputStream in = new URL(url).openStream();

Guava approach:

InputSupplier<? extends InputStream> supplier = Resources.newInputStreamSupplier(new URL(url));
InputStream in = supplier.getInput();

Both of them throw a "Connection timed out" exception, because Guava also uses URL.openStream().

But some sites are so slow that each read returns only a little data, so even after many reads the end of the stream is still not reached. I also see it hanging there in jstack.

Like this one (maybe it is only slow from my host): a txt file address

Lakitalaks answered 24/8, 2015 at 5:53 Comment(0)
