Convert Java w3c Document to XMLStreamReader

I would like to reuse some existing code in our code base that accepts an XMLStreamReader, but my application has the required data as a W3C Document.

The following example is a minimal test case:

public static void main(String[] args) throws Exception {
    DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = builderFactory.newDocumentBuilder();

    Document doc = builder.newDocument();

    Element rootElement = doc.createElement("Groups");
    doc.appendChild(rootElement);
    Element group = doc.createElement("Group");
    group.setTextContent("Wibble");
    rootElement.appendChild(group);

    DOMSource source = new DOMSource(doc);

    XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(source);

    reader.nextTag();
    System.out.println("NextTag:" + reader.getName());
}

The expected output should be something like: NextTag:Groups but instead the following is thrown:

Exception in thread "main" javax.xml.stream.XMLStreamException: java.net.MalformedURLException
    at com.sun.xml.stream.XMLReaderImpl.setInputSource(XMLReaderImpl.java:196)
    at com.sun.xml.stream.XMLReaderImpl.<init>(XMLReaderImpl.java:179)
    at com.sun.xml.stream.ZephyrParserFactory.createXMLStreamReader(ZephyrParserFactory.java:139)
    at Main.main(Main.java:27)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Caused by: java.net.MalformedURLException
    at java.net.URL.<init>(URL.java:601)
    at java.net.URL.<init>(URL.java:464)
    at java.net.URL.<init>(URL.java:413)
    at com.sun.xml.stream.XMLEntityManager.startEntity(XMLEntityManager.java:762)
    at com.sun.xml.stream.XMLEntityManager.startDocumentEntity(XMLEntityManager.java:697)
    at com.sun.xml.stream.XMLDocumentScannerImpl.setInputSource(XMLDocumentScannerImpl.java:300)
    at com.sun.xml.stream.XMLReaderImpl.setInputSource(XMLReaderImpl.java:193)
    ... 8 

Currently using Java 6 update 22.

Further info: The source of ZephyrParserFactory#jaxpSourcetoXMLInputSource seems to indicate that the Source object is converted by copying its SystemId rather than the actual contents of the DOMSource.

Update: My original test case above was run using my project classpath, which includes the JAXB 2.2.1 library, which in turn pulls in sjsxp 1.0.1. Running on a clean classpath yields:

Exception in thread "main" java.lang.UnsupportedOperationException: Cannot create XMLStreamReader or XMLEventReader from a javax.xml.transform.dom.DOMSource
    at com.sun.xml.internal.stream.XMLInputFactoryImpl.jaxpSourcetoXMLInputSource(XMLInputFactoryImpl.java:302)
    at com.sun.xml.internal.stream.XMLInputFactoryImpl.createXMLStreamReader(XMLInputFactoryImpl.java:145)

Which fits with @Gary Rowe's answer.

Schulz answered 31/8, 2011 at 13:23 Comment(5)
Is it trying to download the xsd? - Middling
Nope, the XML is more or less what you have above: <Groups><Group>Wibble</Group></Groups> - Schulz
Just brainstorming. Shouldn't you use source.getSystemId() inside createXMLStreamReader? - Manage
createXMLStreamReader does use getSystemId; that is, I think, the root of my pain. DOMSource doesn't have a systemId. - Schulz
I'm looking for the opposite: XMLStreamReader to Document :-( - Firefly

It's somewhat convoluted, but any XQuery implementation that supports the XQJ API (for example Saxon) will allow you to supply a DOM as the input to the query ".", and get the result as an XMLStreamReader. Although there's a lot of heavyweight machinery involved, it should be perfectly efficient.

With Saxon you could also short-circuit the XQuery side of things using something like

Document doc; // the DOM document
XMLStreamReader reader = new PullToStax(PullProvider.makePullProvider(new DocumentWrapper(doc)));

but I think the XQJ approach is cleaner.

Jamiejamieson answered 31/8, 2011 at 14:34 Comment(0)

Woodstox provides exactly what you need with its WstxDOMWrappingReader class. See the Javadoc at https://fasterxml.github.io/woodstox/javadoc/5.0/com/ctc/wstx/dom/WstxDOMWrappingReader.html

Small example:

  DOMSource domSource = new DOMSource(node);
  ReaderConfig config = ReaderConfig.createFullDefaults();
  XMLStreamReader reader = WstxDOMWrappingReader.createFrom(domSource, config);
Divertissement answered 15/3, 2016 at 10:11 Comment(0)

Seems to me that a DOMSource is not an instance of a StreamSource so it's getting kicked out.
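
A minimal sketch of how to check this, assuming the JDK's built-in StAX implementation is the one picked up by XMLInputFactory.newInstance() (other implementations on the classpath, such as Woodstox, may accept a DOMSource). The class name SourceCheck and the helper describe are illustrative and not part of any library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

public class SourceCheck {
    // Hypothetical helper: reports the first element name, or the
    // exception type the factory used to reject the given Source.
    public static String describe(Source source) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(source);
            reader.nextTag();
            return "accepted, first tag: " + reader.getLocalName();
        } catch (Exception e) {
            // Covers both XMLStreamException and UnsupportedOperationException.
            return "rejected: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Groups><Group>Wibble</Group></Groups>";

        // A StreamSource wraps raw characters, which the factory can parse.
        System.out.println(describe(new StreamSource(new StringReader(xml))));

        // A DOMSource wraps an already-parsed tree; the result here
        // depends on which StAX implementation is on the classpath.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement("Groups"));
        System.out.println(describe(new DOMSource(doc)));
    }
}
```

Which branch the DOMSource call takes is implementation-specific, so the sketch prints the outcome rather than assuming one.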

Micmac answered 31/8, 2011 at 14:15 Comment(0)

My pragmatic solution has been to write the Document to a byte array using ByteArrayOutputStream and then feed that back in using ByteArrayInputStream:

// Serialize the DOM (doc) to bytes with an identity transform.
Transformer xformer = TransformerFactory.newInstance().newTransformer();
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
StreamResult out = new StreamResult(outputStream);
xformer.transform(new DOMSource(doc), out);
// Reparse the serialized bytes with StAX.
XMLStreamReader reader = XMLInputFactory.newInstance()
        .createXMLStreamReader(new ByteArrayInputStream(outputStream.toByteArray()));

It's not pretty but it works.
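
For completeness, here is a self-contained sketch of the same round trip using only JDK classes, applied to the document from the question. The class name DomToStax and the helpers toStreamReader and sampleDocument are illustrative names, not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomToStax {
    // Hypothetical helper: serializes the DOM to bytes with an identity
    // transform, then reparses those bytes with StAX.
    public static XMLStreamReader toStreamReader(Document doc) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(buffer));
        return XMLInputFactory.newInstance()
                .createXMLStreamReader(new ByteArrayInputStream(buffer.toByteArray()));
    }

    // Builds <Groups><Group>Wibble</Group></Groups>, as in the question.
    public static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("Groups");
        doc.appendChild(root);
        Element group = doc.createElement("Group");
        group.setTextContent("Wibble");
        root.appendChild(group);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        XMLStreamReader reader = toStreamReader(sampleDocument());
        reader.nextTag(); // advance to the root start tag
        System.out.println("NextTag:" + reader.getName());
    }
}
```

The whole document is held in memory twice (tree plus bytes), so this is best suited to small documents.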

Schulz answered 1/9, 2011 at 7:54 Comment(1)
Not only is it unpretty, it also involves serialising and reparsing the document, which can be very expensive in both time and memory usage. - Jamiejamieson

I ran into the same error (Windows 7/Oracle JDK 7) using the following code:

DOMSource domSource = new DOMSource(element);
XMLEventReader parser = XMLInputFactory.newInstance().createXMLEventReader(domSource);

I fixed it by adding a new Woodstox dependency:

<dependency>
    <groupId>org.codehaus.woodstox</groupId>
    <artifactId>woodstox-core-lgpl</artifactId>
    <version>4.1.5</version>
</dependency>

But this is a nasty solution as well.

Modiolus answered 13/10, 2013 at 16:42 Comment(0)
