Anyone know of a good tutorial (or have a good example) for writing XML using the SAX framework (or something similar) and Java? Searching has yielded very little in terms of useful results. I'm trying to export from an Android app and am looking to avoid as much memory overhead as possible.
There's a very useful technique for generating XML directly from POJOs via the SAX framework (not a SAX parser, but the SAX framework). This technique could be used to generate an XML document.
Generating XML from an Arbitrary Data Structure
http://download.oracle.com/javaee/1.4/tutorial/doc/JAXPXSLT5.html
Essentially, you add methods to your POJOs, or write a utility class for them, that turn them into SAX event emitters (emitting the same events a SAX parser would while parsing an XML document). Your "SAX event generator" then looks like the output side of a SAX parser and can be given any content handler that a SAX parser would take, such as one that pretty-prints XML. It could also be fed to a DOM builder to generate a DOM tree, or to an XSLT engine to generate HTML or perform a true XSL transformation, without first having to generate an intermediate XML document from the POJOs.
For example, a Person class might have an emitXML() method that includes these lines:
// nsu is the namespace URI; atts holds the attributes for the element
handler.startElement(nsu, PERSON_TAG, PERSON_TAG, NO_ATTRIBUTES);
handler.startElement(nsu, FIRSTNAME_TAG, FIRSTNAME_TAG, atts);
handler.characters(this.firstName.toCharArray(), 0, this.firstName.length());
handler.endElement(nsu, FIRSTNAME_TAG, FIRSTNAME_TAG);
// ... emit more instance variables
// ... emit child objects like: homeAddress.emitXML(handler, ...);
handler.endElement(nsu, PERSON_TAG, PERSON_TAG);
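To make the fragment above concrete, here is a minimal sketch of a complete emitter. The Person fields, the tag names, and the toXml helper are placeholders invented for illustration, not taken from the Oracle tutorial. It assumes the JDK's default TransformerFactory, which is a SAXTransformerFactory, and uses an identity TransformerHandler as the content handler that serializes the events:

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

class PersonSaxDemo {

    /** Hypothetical POJO that can replay itself as SAX events. */
    static final class Person {
        private static final String NS = ""; // no namespace in this sketch
        private static final AttributesImpl NO_ATTRIBUTES = new AttributesImpl();

        private final String firstName;
        private final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        /** Emits the events a parser would fire for the person element. */
        void emitXML(ContentHandler handler) throws SAXException {
            handler.startElement(NS, "person", "person", NO_ATTRIBUTES);
            emitText(handler, "firstName", firstName);
            emitText(handler, "lastName", lastName);
            handler.endElement(NS, "person", "person");
        }

        private static void emitText(ContentHandler handler, String tag, String text)
                throws SAXException {
            handler.startElement(NS, tag, tag, NO_ATTRIBUTES);
            handler.characters(text.toCharArray(), 0, text.length());
            handler.endElement(NS, tag, tag);
        }
    }

    /** Serializes the emitted events to a string via an identity TransformerHandler. */
    static String toXml(Person person) throws Exception {
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler serializer = factory.newTransformerHandler();
        serializer.getTransformer()
                .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));

        // The document-level events wrap whatever the POJO emits.
        serializer.startDocument();
        person.emitXML(serializer);
        serializer.endDocument();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(new Person("Ada", "Smith & Sons")));
    }
}
```

Note that the emitter passes raw text to characters(); escaping the ampersand is the serializing handler's job, not the POJO's.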
Update:
A couple of other references:
- Transform Legacy Data to XML Using JAXP: http://www.devx.com/java/Article/16925
- Transforming Flat Files To XML With SAX and XSLT: http://www.developer.com//xml/article.php/2108031/Transforming-Flat-Files-To-XML-With-SAX-and-XSLT.htm
A couple of responses to comments:
This is true, but the XMLStreamWriter interface described above is much more user-friendly. – Michael Kay 3 hours ago
Yes, but I guess I wasn't clear. I could easily traverse the hierarchy and use XMLStreamWriter to output an XML document directly to a stream. However, the articles show a powerful technique: traverse the hierarchy and generate SAX events instead of outputting an XML document directly. Now I can plug in different content handlers that do different things or generate different versions of the XML. We could also feed our object hierarchy to any tool that accepts a SAX parser, like an XSLT engine. It's really just taking advantage of the visitor pattern established by the SAX framework: we separate traversing the hierarchy from outputting the XML. The parts that output the XML, the content handlers, should certainly use an XMLStreamWriter if their purpose is to write an XML stream.
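A rough sketch of what "plugging in different content handlers" can look like: the same event source is replayed into a compact serializer, an indenting serializer, and a DOM builder. SaxEmitter, SAMPLE, and the helper names are hypothetical; only the JAXP classes are real, and the sketch assumes the JDK's default SAXTransformerFactory.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

class SwappableHandlersDemo {

    /** Hypothetical stand-in for an object hierarchy that emits SAX events. */
    interface SaxEmitter {
        void emitXML(ContentHandler handler) throws SAXException;
    }

    /** Sample emitter for a message element with one body child. */
    static final SaxEmitter SAMPLE = handler -> {
        AttributesImpl none = new AttributesImpl();
        handler.startElement("", "message", "message", none);
        handler.startElement("", "body", "body", none);
        handler.characters("hello".toCharArray(), 0, 5);
        handler.endElement("", "body", "body");
        handler.endElement("", "message", "message");
    };

    private static TransformerHandler newHandler() throws Exception {
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        return factory.newTransformerHandler(); // identity transformation
    }

    /** Handler 1 and 2: serialize to text, compact or indented. */
    static String toText(SaxEmitter source, boolean indent) throws Exception {
        TransformerHandler handler = newHandler();
        handler.getTransformer()
                .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        handler.getTransformer()
                .setOutputProperty(OutputKeys.INDENT, indent ? "yes" : "no");
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));
        handler.startDocument();
        source.emitXML(handler);
        handler.endDocument();
        return out.toString();
    }

    /** Handler 3: build a DOM tree from the very same events. */
    static Document toDom(SaxEmitter source) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        TransformerHandler handler = newHandler();
        handler.setResult(new DOMResult(doc));
        handler.startDocument();
        source.emitXML(handler);
        handler.endDocument();
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toText(SAMPLE, false));
        System.out.println(toText(SAMPLE, true));
        System.out.println(toDom(SAMPLE).getDocumentElement().getNodeName());
    }
}
```

The emitter never changes; only the handler it is given does, which is the separation of traversal from output described above.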
For example, in our program, we sent XML messages over network sockets between distributed components, and we also used XSLT to generate our HTML pages. Previously, we traversed our hierarchy to generate an XML document (a string) and then either wrote that XML document to a network socket or fed it to the XSLT engine (which essentially just parsed it again). After adopting this technique, we could feed our object hierarchy (using this SAX adapter) directly to the XSLT engine without needing the intermediate XML string. It was also convenient to be able to use one content handler to generate a compact XML representation for the network stream and a different one to generate a pretty-printed XML document for writing to a log file.
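As an illustration of feeding SAX events straight into an XSLT engine, here is a small sketch. The stylesheet, the element names, and the render helper are made up for this example and are not our project's code. It relies on SAXTransformerFactory.newTransformerHandler(Source), which returns a ContentHandler backed by the compiled stylesheet:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.helpers.AttributesImpl;

class SaxToXsltDemo {

    // Hypothetical stylesheet: turns a person element into an HTML paragraph.
    private static final String STYLESHEET =
            "<xsl:stylesheet version='1.0' "
          + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='html' indent='no'/>"
          + "<xsl:template match='/person'>"
          + "<p><xsl:value-of select='firstName'/></p>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Emits SAX events into the XSLT engine; no intermediate XML string is built. */
    static String render(String firstName) throws Exception {
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler xslt = factory.newTransformerHandler(
                new StreamSource(new StringReader(STYLESHEET)));

        StringWriter out = new StringWriter();
        xslt.setResult(new StreamResult(out)); // must be set before emitting events

        AttributesImpl none = new AttributesImpl();
        xslt.startDocument();
        xslt.startElement("", "person", "person", none);
        xslt.startElement("", "firstName", "firstName", none);
        xslt.characters(firstName.toCharArray(), 0, firstName.length());
        xslt.endElement("", "firstName", "firstName");
        xslt.endElement("", "person", "person");
        xslt.endDocument(); // the transformation runs once the document ends
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render("Ada"));
    }
}
```

In a real program the startElement/characters calls would come from the object hierarchy's own emitXML() methods rather than being written inline.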
Besides, using SAX parser API to write XML is a misuse of the API, IMHO. – Puce 49 mins ago
Perhaps, but I think it depends on your needs. If the OP's requirement is just to write out a specific XML document, then this is definitely overkill. However, I thought it worth mentioning in case the OP uses XML in other ways on his project that he didn't mention. There's no harm in pitching an alternative idea.
Calling it misuse may be a bit strong, but I agree you're entitled to your opinion. It's documented in an Oracle tutorial, so it's not considered abuse by the Sun/Oracle engineers. It was highly successful on our project in helping us meet our requirements with no significant downsides, so I'll be keeping this approach in my toolbox for when it's useful in the future.
SAX parsing is for reading documents, not writing them.
You can write XML with the XMLStreamWriter:
OutputStream outputStream = new FileOutputStream(new File("doc.xml"));
XMLStreamWriter out = XMLOutputFactory.newInstance().createXMLStreamWriter(
new OutputStreamWriter(outputStream, "utf-8"));
out.writeStartDocument();
out.writeStartElement("doc");
out.writeStartElement("title");
out.writeCharacters("Document Title");
out.writeEndElement();
out.writeEndElement();
out.writeEndDocument();
out.close();
out.print("<doc><tag>value</tag></doc>");
etc. But you have to make sure to escape the values properly. This could be done with Apache Commons' StringEscapeUtils.escapeXml(), or some other method. It depends on the possible values; maybe you could just do it with a regex.
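If you go the hand-written route and don't want a library dependency, escaping can be sketched roughly like this. XmlEscapeDemo and escapeXml are hypothetical names for a hand-rolled helper, not the Commons method. It replaces only the five predefined XML entities and does not filter characters that are illegal in XML, such as most control characters:

```java
class XmlEscapeDemo {

    /** Replaces the five characters that have predefined XML entities. */
    static String escapeXml(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Escape the value before concatenating it into hand-written markup.
        System.out.println("<doc><tag>" + escapeXml("a < b & c") + "</tag></doc>");
    }
}
```

Going character by character means the ampersand inside an already produced entity is never escaped a second time, which is an easy mistake to make with chained string replacements.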
The following answers the "a good tutorial for writing XML using the SAX parser and Java" part of the question.
I am not sure if you have gone through this. But I really like Java's Really Big Index of Everything.
Go through this: http://download.oracle.com/javase/tutorial/jaxp/index.html
And eventually, this: http://download.oracle.com/javase/tutorial/jaxp/sax/index.html
Please refer to my personal blog post: XML Generation In Java - specifically, The SAX method. It references a few other articles concerning this, provides a concrete example, and compares SAX with the other popular APIs for generating XML from Java.
(Realized this is an older question, but felt it necessary to add this for anyone else that may have the same question.)
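You can also bridge to TrAX (the JAXP transformation API) with this:
import java.io.IOException;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

public abstract class PipedSAXSource extends SAXSource {
    protected PipedSAXSource() {
        setXMLReader(new CallWriteDuringSax());
    }

    /** Subclasses emit their SAX events into the given handler. */
    protected abstract void writeTo(ContentHandler sink)
            throws IOException, SAXException;

    private class CallWriteDuringSax extends XMLFilterImpl {
        // Instead of parsing any input, replay the subclass's events.
        @Override
        public void parse(InputSource ignored) throws IOException, SAXException {
            writeTo(getContentHandler());
        }

        // Accept and ignore feature requests; XMLFilterImpl would reject
        // them because this filter has no parent reader.
        @Override
        public void setFeature(String name, boolean value) {}
    }
}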
Use like so:
public static void main(String[] args) throws Exception {
    Source in = new PipedSAXSource() {
        @Override
        protected void writeTo(ContentHandler sink) throws SAXException {
            sink.startDocument();
            sink.startElement("", "root", "root", new AttributesImpl());
            sink.endElement("", "root", "root");
            sink.endDocument();
        }
    };
    Transformer identity = TransformerFactory.newInstance().newTransformer();
    identity.transform(in, new StreamResult(System.out));
}