I'm in the process of migrating our Java codebase from Java 7 (7u80) to Java 8 (8u162). (Yep... we're at the cutting edge of technology.)
After the switchover, I've been having issues when loading XML resource files from deployed jars in a heavily concurrent environment. The resource files are being accessed using try-with-resources and parsed via SAX:
try {
    SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
    try (InputStream in = MyClass.class.getResourceAsStream("resource.xml")) {
        parser.parse(in, new DefaultHandler() {...});
    }
} catch (Exception ex) {
    throw new RuntimeException("Error loading resource.xml", ex);
}
Please correct me if I'm wrong, but this seems like the approach that is generally advised for reading resource files.
This works fine in an IDE, but once it has been deployed in a jar, I'm frequently (but not universally, and not always with the same resource file) getting an IOException, with the following stack trace:
Caused by: java.io.IOException: Stream closed
at java.util.zip.InflaterInputStream.ensureOpen(InflaterInputStream.java:67)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:142)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.read(XMLEntityManager.java:2919)
at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:302)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1895)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanName(XMLEntityScanner.java:728)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:1279)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2784)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:842)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:771)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:327)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:195)
Questions:
What's going on here?
Am I doing something wrong, with how I'm reading/parsing these resource files? (Or can you suggest improvements?)
What can I do to solve this issue?
Initial Thoughts:
Initially, because I only saw the issue when the code was deployed in a jar, I thought that it was something to do with the access via JarFile - perhaps the resource files are being accessed through a shared JarFile, and when one of those resource input streams is closed, that closes the JarFile, which in turn closes all other open input streams. For example, there's a SO question showing similar behaviour (when the OP was directly handling the JarFiles). Also, there was a similar looking bug report, but that was back in Java 6 and was apparently fixed in Java 7.
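The shared-JarFile part of that hypothesis can be checked in isolation. The following is only a minimal sketch, not code from our codebase: the class name SharedJarFileDemo and the throwaway jar it builds are made up for illustration. It opens two JarURLConnections to the same entry and compares the JarFile instances they hand back, once with the default connection caching and once with caching switched off.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.JarURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical demo class; nothing here comes from the original code.
class SharedJarFileDemo {

    /** Builds a throwaway jar containing a single entry named resource.xml. */
    static Path createTempJar() {
        try {
            Path jar = Files.createTempFile("shared-jar-demo", ".jar");
            jar.toFile().deleteOnExit();
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jos = new JarOutputStream(out)) {
                jos.putNextEntry(new JarEntry("resource.xml"));
                jos.write("<root/>".getBytes(StandardCharsets.UTF_8));
                jos.closeEntry();
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Opens two connections to the same entry and reports whether both return the same JarFile object. */
    static boolean sharesJarFile(Path jar, boolean useCaches) {
        try {
            URL url = new URL("jar:" + jar.toUri() + "!/resource.xml");
            JarURLConnection first = (JarURLConnection) url.openConnection();
            JarURLConnection second = (JarURLConnection) url.openConnection();
            first.setUseCaches(useCaches);
            second.setUseCaches(useCaches);
            JarFile a = first.getJarFile();
            JarFile b = second.getJarFile();
            boolean same = (a == b);
            if (!useCaches) {
                // Uncached JarFiles belong to the caller, so release them here.
                a.close();
                b.close();
            }
            return same;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path jar = createTempJar();
        System.out.println("cached connections share a JarFile:   " + sharesJarFile(jar, true));
        System.out.println("uncached connections share a JarFile: " + sharesJarFile(jar, false));
    }
}
```

This only shows whether two connections end up holding the same JarFile object. It says nothing about whether closing one stream affects another, which is the part I'm unsure about.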
Update 1:
After further debugging, this issue appears to be because the XML parser is closing the InputStream when it has finished parsing it. (This seems a bit odd to me - indeed it's prompted these questions in relation to DOM and SAX parsing - but there we go.) As such, my current best guess is that the SAXParser (or actually down in the XMLEntityManager) is calling InputStream.close(), but there is some kind of race condition around the stream's state?
It doesn't appear to be related to the use of try-with-resources - i.e. given that the SAXParser is closing the InputStream, I've tried removing try-with-resources, and I still get the same errors/stack trace.
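For anyone wanting to reproduce this kind of debugging, one way to see who closes a stream is to wrap it before handing it to the parser. This is a rough sketch under my own assumptions: CloseTrackingInputStream, closeSite() and parseTracked() are hypothetical names, and the in-memory document stands in for a real jar resource.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical debugging wrapper: remembers where close() was first called. */
class CloseTrackingInputStream extends FilterInputStream {

    // Null until close() is called; afterwards holds the stack of the first caller.
    private volatile Throwable closedAt;

    CloseTrackingInputStream(InputStream delegate) {
        super(delegate);
    }

    @Override
    public void close() throws IOException {
        if (closedAt == null) {
            closedAt = new Throwable("close() called on thread " + Thread.currentThread().getName());
        }
        super.close();
    }

    boolean isClosed() {
        return closedAt != null;
    }

    /** Stack trace holder for the first close() call, or null if the stream is still open. */
    Throwable closeSite() {
        return closedAt;
    }

    /** Parses a small in-memory document through the wrapper and returns it for inspection. */
    static CloseTrackingInputStream parseTracked(String xml) {
        CloseTrackingInputStream in = new CloseTrackingInputStream(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, new DefaultHandler());
        } catch (Exception ex) {
            throw new RuntimeException("Error parsing tracked stream", ex);
        }
        return in;
    }

    public static void main(String[] args) {
        CloseTrackingInputStream in = parseTracked("<root><child/></root>");
        if (in.isClosed()) {
            in.closeSite().printStackTrace(); // shows which code path closed the stream
        } else {
            System.out.println("The parser left the stream open");
        }
    }
}
```

Wrapping the stream returned by getResourceAsStream in the same way should reveal whether the close comes from the parser, from try-with-resources, or from somewhere else.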
Update 2:
After a lot more debugging, I have found that the XMLEntityManager$RewindableInputStream is being closed before it has finished reading the XML file. Interestingly, I only see this in a heavily concurrent environment, but I still see it even if I put locks around all our possible XML resource loading - i.e. where only one XML resource is being read at a time.
The stack trace of where the XMLEntityManager$RewindableInputStream is being closed - before it's finished reading the file - is as follows:
at java.util.zip.InflaterInputStream.close(InflaterInputStream.java:224)
at java.util.zip.ZipFile$ZipFileInflaterInputStream.close(ZipFile.java:417)
at java.io.FilterInputStream.close(FilterInputStream.java:181)
at sun.net.www.protocol.jar.JarURLConnection$JarURLInputStream.close(JarURLConnection.java:108)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.close(XMLEntityManager.java:3005)
at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.close(UTF8Reader.java:674)
at com.sun.xml.internal.stream.Entity$ScannedEntity.close(Entity.java:422)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.endEntity(XMLEntityManager.java:1387)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1916)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipSpaces(XMLEntityScanner.java:1629)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$TrailingMiscDriver.next(XMLDocumentScannerImpl.java:1371)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:553)
at com.sun.xml.internal.stream.XMLEventReaderImpl.nextEvent(XMLEventReaderImpl.java:83)
So, at the moment, my best guess (and it is only that) is that there is some niche concurrency bug in the core Java XML file manager / input stream etc. Maybe a result of sync elision, perhaps? (If this is the case, I’m not sure whether this was a pre-existing bug that’s only been revealed by concurrency improvements in Java 8 or a new bug in Java 8.)
(That said, I haven't filed a bug report, as I don't think I have enough to go on to say that there is a bug, or enough information to inform anyone who would go looking for it.)
Work Around:
Given that the issue was from using the core Java XML libraries, I decided to write my own (largely based on StAX). Fortunately, our XML resource files are quite simple and straightforward, so I only needed to implement a fraction of the functionality in the core Java XML parsers.
Update 3:
The above work-around did improve things - as in, it resolved the particular instances of the issue that I was facing. However, subsequent to that, I found that I was still getting cases where an InputStream, from a resource in a JAR, was being closed whilst it was being read. Now the stack trace is like this:
java.lang.IllegalStateException: zip file closed
at java.util.zip.ZipFile.ensureOpen(ZipFile.java:686)
at java.util.zip.ZipFile.access$200(ZipFile.java:60)
at java.util.zip.ZipFile$ZipEntryIterator.hasNext(ZipFile.java:508)
at java.util.zip.ZipFile$ZipEntryIterator.hasMoreElements(ZipFile.java:503)
at java.util.jar.JarFile$JarEntryIterator.hasNext(JarFile.java:253)
at java.util.jar.JarFile$JarEntryIterator.hasMoreElements(JarFile.java:262)
Searching for issues relating to that stack trace led me to this question, and the suggestion that I control the URLConnection so that connections are not cached and therefore not shared: URLConnection.setUseCaches(boolean)
As such, I tried this (see the answer below for implementation) and it seemed to be working and stable. I even went back and tried this with my previous core Java StAX parsers, and it all seemed to be working and stable. (Aside, I'm currently undecided as to whether to keep my custom XML parsers - they seem to be a little more performant by virtue of being lighter, but it's a trade-off with the additional maintenance requirements.) So, it's probably not a concurrency bug in the core Java XML parsers, but an issue with the dynamic classloaders in the JVM.
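To make the idea concrete, here is a minimal sketch of loading a resource through a URLConnection with caching disabled. It is an illustration only, not the exact implementation from the answer: UncachedResourceLoader, openUncached() and parse() are hypothetical names, and the error handling simply mirrors the snippet at the top of the question.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper; the names are illustrative only.
class UncachedResourceLoader {

    /** Opens a classpath resource via its own URLConnection, with connection caching turned off. */
    static InputStream openUncached(Class<?> owner, String name) throws IOException {
        URL url = owner.getResource(name);
        if (url == null) {
            throw new FileNotFoundException("Resource not found: " + name);
        }
        URLConnection connection = url.openConnection();
        connection.setUseCaches(false); // must be set before the connection is used
        return connection.getInputStream();
    }

    /** Parses the named resource with SAX, reading it through an uncached connection. */
    static void parse(Class<?> owner, String name, DefaultHandler handler) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            try (InputStream in = openUncached(owner, name)) {
                parser.parse(in, handler);
            }
        } catch (Exception ex) {
            throw new RuntimeException("Error loading " + name, ex);
        }
    }
}
```

A call would look like UncachedResourceLoader.parse(MyClass.class, "resource.xml", handler). The likely trade-off is that every load opens the jar afresh instead of reusing a cached handle, so this may cost some performance where resources are read very frequently.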
Update 4:
I'm increasingly of the opinion that this is a concurrency bug in core Java, with respect to how it's handling access to resource files, as a stream, from within jars. For example, there is this issue in org.reflections.reflections, which I also encountered.
I've also seen this issue with respect to JBLAS, such that I get the following exception (and the issue raised):
Caused by: java.lang.NullPointerException: Inflater has been closed
at java.util.zip.Inflater.ensureOpen(Inflater.java:389)
at java.util.zip.Inflater.inflate(Inflater.java:257)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at org.jblas.util.LibraryLoader.loadLibraryFromStream(LibraryLoader.java:261)
at org.jblas.util.LibraryLoader.loadLibrary(LibraryLoader.java:186)
at org.jblas.NativeBlasLibraryLoader.loadLibraryAndCheckErrors(NativeBlasLibraryLoader.java:32)
at org.jblas.NativeBlas.<clinit>(NativeBlas.java:77)
JarFile. Keep in mind that this would also break the entire class loading, as locating and reading the class files works exactly the same way. – Parmentier