Java Resource InputStream being closed?

I'm in the process of migrating our Java codebase from Java 7 (80) to Java 8 (162). (Yep... we're at the cutting edge of technology.)

After the switchover, I've been having issues when loading XML resource files from deployed jars in a heavily concurrent environment. The resource files are being accessed using try-with-resources and parsed via SAX:

try {
  SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
  try (InputStream in = MyClass.class.getResourceAsStream("resource.xml")) {
    parser.parse(in, new DefaultHandler() {...});
  }
} catch (Exception ex) {
  throw new RuntimeException("Error loading resource.xml", ex);
} 

Please correct me if I'm wrong, but this seems like the approach that is generally advised for reading resource files.

This works fine in an IDE, but once it has been deployed in a jar, I'm frequently (but not universally, and not always with the same resource file) getting an IOException, with the following stack trace:

Caused by: java.io.IOException: Stream closed 
    at java.util.zip.InflaterInputStream.ensureOpen(InflaterInputStream.java:67)
    at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:142)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.read(XMLEntityManager.java:2919)
    at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:302)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1895)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanName(XMLEntityScanner.java:728)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:1279)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2784)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:842)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:771)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:327)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:195)

Questions:

  • What's going on here?

  • Am I doing something wrong, with how I'm reading/parsing these resource files? (Or can you suggest improvements?)

  • What can I do to solve this issue?

Initial Thoughts:

Initially, because I only saw the issue when the code was deployed in a jar, I thought that it was something to do with the access via JarFile - perhaps the resource files are being accessed through a shared JarFile, and when one of those resource input streams is closed, that closes the JarFile, which in turn closes all other open input streams. For example, there's a SO question showing similar behaviour (when the OP was directly handling the JarFiles). Also, there was a similar looking bug report, but that was back in Java 6 and was apparently fixed in Java 7.

Update 1:

After further debugging, this issue appears to be because the XML parser is closing the InputStream when it has finished parsing it. (This seems a bit odd to me - indeed it's prompted these questions in relation to DOM and SAX parsing - but there we go.) As such, my current best guess is that the SAXParser (or actually down in the XMLEntityManager) is calling InputStream.close(), but there is some kind of race condition about the state?

It doesn't appear to be related to the use of try-with-resources - i.e. given that the SAXParser is closing the InputStream, I've tried removing try-with-resources, and I still get the same errors/stack trace.
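For anyone wanting to repeat this kind of debugging, one option is to wrap the stream handed to the parser so that each close() call is recorded together with its call site. The following is only a minimal sketch: the class CloseTrackingInputStream and its methods are hypothetical names introduced for illustration, not part of the JDK or of the code above.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical debugging aid: remembers who closed the stream first.
class CloseTrackingInputStream extends FilterInputStream {

    private Throwable firstCloseSite; // stack trace of the first close() call
    private int closeCount;           // how many times close() was called

    CloseTrackingInputStream(InputStream delegate) {
        super(delegate);
    }

    @Override
    public synchronized void close() throws IOException {
        if (firstCloseSite == null) {
            // A Throwable captures the current stack without being thrown.
            firstCloseSite = new Throwable(
                "first close() on thread " + Thread.currentThread().getName());
        }
        closeCount++;
        super.close();
    }

    synchronized int closeCount() {
        return closeCount;
    }

    synchronized Throwable firstCloseSite() {
        return firstCloseSite;
    }

    // Small demo helper: reads one byte, closes the wrapper, returns it.
    static CloseTrackingInputStream readAndClose(byte[] data) {
        CloseTrackingInputStream in =
            new CloseTrackingInputStream(new ByteArrayInputStream(data));
        try {
            in.read();
            in.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return in;
    }

    public static void main(String[] args) {
        CloseTrackingInputStream in = readAndClose(new byte[] {'<', 'a', '/', '>'});
        System.out.println("close() calls: " + in.closeCount());
        in.firstCloseSite().printStackTrace();
    }
}
```

In the parsing code, the wrapper would sit between the resource stream and the parser, e.g. parser.parse(new CloseTrackingInputStream(in), handler), and firstCloseSite() could be logged when the "Stream closed" exception is caught.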

Update 2:

After a lot more debugging, I have found that the XMLEntityManager$RewindableInputStream is being closed, before it has finished reading the XML file. Interestingly, I only see this in a heavily concurrent environment, but I still see it even if I put locks around all our possible XML resource loading - i.e. where only one XML resource is being read at a time.

The stack trace of where the XMLEntityManager$RewindableInputStream is being closed - before it's finished reading the file - is as follows:

  at java.util.zip.InflaterInputStream.close(InflaterInputStream.java:224)
  at java.util.zip.ZipFile$ZipFileInflaterInputStream.close(ZipFile.java:417)
  at java.io.FilterInputStream.close(FilterInputStream.java:181)
  at sun.net.www.protocol.jar.JarURLConnection$JarURLInputStream.close(JarURLConnection.java:108)
  at com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.close(XMLEntityManager.java:3005)
  at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.close(UTF8Reader.java:674)
  at com.sun.xml.internal.stream.Entity$ScannedEntity.close(Entity.java:422)
  at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.endEntity(XMLEntityManager.java:1387)
  at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1916)
  at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipSpaces(XMLEntityScanner.java:1629)
  at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$TrailingMiscDriver.next(XMLDocumentScannerImpl.java:1371)
  at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
  at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
  at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:553)
  at com.sun.xml.internal.stream.XMLEventReaderImpl.nextEvent(XMLEventReaderImpl.java:83)

So, at the moment, my best guess (and it is only that) is that there is some niche concurrency bug in the core Java XML file manager / input stream etc. Maybe a result of sync elision, perhaps? (If this is the case, I’m not sure whether this was a pre-existing bug that’s only been revealed by concurrency improvements in Java 8 or a new bug in Java 8.)

(That said, I haven't filed a bug report, as I don't think I have enough to go on to say that there is a bug, or enough information to inform anyone who would go looking for it.)

Work Around:

Given that the issue was from using the core Java XML libraries, I decided to write my own (largely based on StAX). Fortunately, our XML resource files are quite simple and straightforward, so I only needed to implement a fraction of the functionality in the core Java XML parsers.

Update 3:

The above work-around did improve things - as in, it resolved the particular instances of the issue that I was facing. However, subsequent to that, I found that I was still getting cases where an InputStream, from a resource in a JAR, was being closed whilst it was being read. Now the stack trace is like this:

java.lang.IllegalStateException: zip file closed
at java.util.zip.ZipFile.ensureOpen(ZipFile.java:686)
at java.util.zip.ZipFile.access$200(ZipFile.java:60)
at java.util.zip.ZipFile$ZipEntryIterator.hasNext(ZipFile.java:508)
at java.util.zip.ZipFile$ZipEntryIterator.hasMoreElements(ZipFile.java:503)
at java.util.jar.JarFile$JarEntryIterator.hasNext(JarFile.java:253)
at java.util.jar.JarFile$JarEntryIterator.hasMoreElements(JarFile.java:262)

Searching for issues relating to that stack trace led me to this question, and the suggestion that I control the URLConnection so that connections are not cached and therefore not shared: URLConnection.setUseCaches(boolean)

As such, I tried this (see the answer below for implementation) and it seemed to be working and stable. I even went back and tried this with my previous core Java StAX parsers, and it all seemed to be working and stable. (As an aside, I'm currently undecided as to whether to keep my custom XML parsers - they seem to be a little more performant by virtue of being lighter, but it's a trade-off with the additional maintenance requirements.) So, it's probably not a concurrency bug in the core Java XML parsers, but an issue with the dynamic classloaders in the JVM.
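To illustrate what the caching flag changes, here is a small sketch that builds a throwaway jar and checks whether two connections to the same entry are backed by the same JarFile object. The class name SharedJarFileDemo, the temp-file prefix and the entry content are assumptions made for the example; it only demonstrates the sharing, not the race itself.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.JarURLConnection;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical demo class, not taken from the original code.
class SharedJarFileDemo {

    // Builds a temporary jar containing a single small XML entry.
    static Path createTempJar() throws IOException {
        Path jar = Files.createTempFile("shared-jar-demo", ".jar");
        jar.toFile().deleteOnExit();
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            jos.putNextEntry(new JarEntry("resource.xml"));
            jos.write("<root/>".getBytes(StandardCharsets.UTF_8));
            jos.closeEntry();
        }
        return jar;
    }

    // Opens two connections to the same jar entry and reports whether
    // both are backed by the very same JarFile instance.
    static boolean sharesJarFile(boolean useCaches) {
        try {
            Path jar = createTempJar();
            URL url = URI.create("jar:" + jar.toUri() + "!/resource.xml").toURL();

            JarURLConnection first = (JarURLConnection) url.openConnection();
            JarURLConnection second = (JarURLConnection) url.openConnection();
            first.setUseCaches(useCaches);   // must be set before connecting
            second.setUseCaches(useCaches);

            JarFile a = first.getJarFile();  // connects implicitly
            JarFile b = second.getJarFile();
            boolean same = (a == b);

            a.close();
            if (!same) {
                b.close();
            }
            return same;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("useCaches=true  shares JarFile: " + sharesJarFile(true));
        System.out.println("useCaches=false shares JarFile: " + sharesJarFile(false));
    }
}
```

If the cached connections do report a shared JarFile, as I would expect from the JDK's jar URL handler, then anything that closes that JarFile affects every stream opened through it, which would be consistent with the "zip file closed" trace above.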

Update 4:

I'm increasingly of the opinion that this is a concurrency bug in core Java, with respect to how it handles access to resource files, as a stream, from within jars. For example, there is this issue in org.reflections.reflections, which I also encountered.

I've also seen this issue with respect to JBLAS, such that I get the following exception (and the issue raised):

Caused by: java.lang.NullPointerException: Inflater has been closed
at java.util.zip.Inflater.ensureOpen(Inflater.java:389)
at java.util.zip.Inflater.inflate(Inflater.java:257)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at org.jblas.util.LibraryLoader.loadLibraryFromStream(LibraryLoader.java:261)
at org.jblas.util.LibraryLoader.loadLibrary(LibraryLoader.java:186)
at org.jblas.NativeBlasLibraryLoader.loadLibraryAndCheckErrors(NativeBlasLibraryLoader.java:32)
at org.jblas.NativeBlas.<clinit>(NativeBlas.java:77)
Marianmariana answered 6/4, 2018 at 17:11 Comment(11)
Resource loading is a job of the ClassLoader. Maybe your deployment environment uses custom class loader(s) that still suffer from this bug? - Christianly
Maybe, but I would have thought unlikely. The deployment environment is effectively that the code gets called from a command line, with a specific JRE, with some JVM args, with a classpath and main class and some args - which seems pretty vanilla to me. - Marianmariana
Closing one of the entry streams should never close the entire JarFile. Keep in mind that this would also break the entire class loading, as locating and reading the class files works exactly the same way. - Parmentier
Is there a limit of files you may open? If so what happens if this number is reached? https://mcmap.net/q/502141/-java-ioexception-quot-too-many-open-files-quot and #16361220 just guessing - Pornocracy
To what server are you deploying? On what operating system are you running? - Distillery
@Distillery Windows and Unix - Marianmariana
You get an exception when you try sharing one ZipFile instance between different threads, and then each thread tries to separately close the same ZipFile (you can only close a ZipFile once), or one thread tries to close a ZipFile while another is still trying to read from it. There is actually no sense sharing one ZipFile instance between threads anyway, since it imposes a synchronized lock around all its methods. My library FastClasspathScanner uses one ZipFile instance per thread, and I never run into this issue, so it's not a JDK bug. It's a bug in how the ZipFile API is being called. - Aboriginal
@LukeHutchison - you'll note from the beginning of the post that I'm not calling the ZipFile API directly, but rather just accessing resources from a .jar using .getResourceAsStream(). The problem is - I think - in the JDK, where it opens a connection to the resource URL using a JNLPCachedJarURLConnection. You're right - you could resolve this by using a single ZipFile instance - although that would require additional knowledge of which jar the resource is in (which may be reasonable to expect) - but I found using un-cached connections (see my original answer) to be a cleaner replacement. - Marianmariana
Sorry - further to that, it's unlikely to be a JNLPCachedJarURLConnection (which is probably for JWS?), but a regular JarURLConnection. But, please note that the URLConnection boolean defaultUseCaches = true. - Marianmariana
@Marianmariana I see, thanks for clarifying -- yes, I believe this is how they fixed the bug in Reflections too, by switching off caching. I guess what I was saying is that if you use one ZipFile instance per thread, then all the InputStreams that you get can be cached or not cached, it doesn't matter -- but apparently the .getResourceAsStream() API doesn't expect streams to be shared between threads when they are cached, or something? If so, then yes, I would call that a JDK bug. There should be one cache per thread. - Aboriginal
This looks relevant? bugs.openjdk.java.net/browse/JDK-8246714 - Perceptible

As I explain in 'Update 3', I found the following to be a viable and stable solution:

try {
  SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
  URLConnection connection = MyClass.class.getResource("resource.xml").openConnection();
  connection.setUseCaches(false);
  try (InputStream in = connection.getInputStream()) {
    parser.parse(in, new DefaultHandler() {...});
  }
} catch (Exception ex) {
  throw new RuntimeException("Error loading resource.xml", ex);
} 
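If the same pattern is needed in several places, it could be pulled into a small helper. The following is just a sketch under that assumption; UncachedResources, open and firstByte are hypothetical names, and the demo reads a JDK class file only because it is a resource that is always present.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper that opens classpath resources without URL caching.
final class UncachedResources {

    private UncachedResources() {
    }

    // Opens the named resource relative to 'owner' with caching switched off,
    // so the connection does not reuse a cached, shared JarFile.
    static InputStream open(Class<?> owner, String name) throws IOException {
        URL url = owner.getResource(name);
        if (url == null) {
            throw new FileNotFoundException("Resource not found: " + name);
        }
        URLConnection connection = url.openConnection();
        connection.setUseCaches(false); // must happen before getInputStream()
        return connection.getInputStream();
    }

    // Demo helper: returns the first byte of the resource as an int (0-255),
    // or -1 if the resource is empty.
    static int firstByte(Class<?> owner, String name) {
        try (InputStream in = open(owner, name)) {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Class files start with the magic number 0xCAFEBABE.
        int b = firstByte(Object.class, "Object.class");
        System.out.println("first byte: 0x" + Integer.toHexString(b));
    }
}
```

With this in place, the try-with-resources line above would become try (InputStream in = UncachedResources.open(MyClass.class, "resource.xml")). The trade-off is that each uncached open may re-open the jar, so it is likely somewhat slower than the cached path.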
Marianmariana answered 8/4, 2018 at 11:51 Comment(0)