How to parse a XML having data included in <![CDATA[---]...
how can we parse the xml and get the data included in CDATA
???
public static void main(String[] args) throws Exception {
File file = new File("data.xml");
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
//if you are using this code for blackberry xml parsing
builder.setCoalescing(true);
Document doc = builder.parse(file);
NodeList nodes = doc.getElementsByTagName("topic");
for (int i = 0; i < nodes.getLength(); i++) {
Element element = (Element) nodes.item(i);
NodeList title = element.getElementsByTagName("title");
Element line = (Element) title.item(0);
System.out.println("Title: " + getCharacterDataFromElement(line));
}
}
public static String getCharacterDataFromElement(Element e) {
Node child = e.getFirstChild();
if (child instanceof CharacterData) {
CharacterData cd = (CharacterData) child;
return cd.getData();
}
return "";
}
( http://www.java2s.com/Code/Java/XML/GetcharacterdataCDATAfromxmldocument.htm )
DocumentBuilderFactory
? –
Ricker e.getTextContent()
. See example without type check, cast, e.getData()
. –
Pouch Since all previous answers are using a DOM based approach. This is how to parse CDATA with a stream based approach using STAX.
Use the following pattern:
switch (EventType) {
case XMLStreamConstants.CHARACTERS:
case XMLStreamConstants.CDATA:
System.out.println(r.getText());
break;
default:
break;
}
Complete sample:
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
public void readCDATAFromXMLUsingStax() {
String yourSampleFile = "/path/toYour/sample/file.xml";
XMLStreamReader r = null;
try (InputStream in =
new BufferedInputStream(new FileInputStream(yourSampleFile));) {
XMLInputFactory factory = XMLInputFactory.newInstance();
r = factory.createXMLStreamReader(in);
while (r.hasNext()) {
switch (r.getEventType()) {
case XMLStreamConstants.CHARACTERS:
case XMLStreamConstants.CDATA:
System.out.println(r.getText());
break;
default:
break;
}
r.next();
}
} catch (Exception e) {
throw new RuntimeException(e);
} finally {
if (r != null) {
try {
r.close();
} catch (Exception e) {
throw new RuntimeException(e);
}
}
}
}
With /path/toYour/sample/file.xml
<data>
<![CDATA[ Sat Nov 19 18:50:15 2016 (1672822)]]>
<![CDATA[Sat, 19 Nov 2016 18:50:14 -0800 (PST)]]>
</data>
Gives:
Sat Nov 19 18:50:15 2016 (1672822)
Sat, 19 Nov 2016 18:50:14 -0800 (PST)
CDATA
just says that the included data should not be escaped. So, just take the tag text. XML parser should return the clear data without CDATA
.
here r.get().getResponseBody()
is the response body
Document doc = getDomElement(r.get().getResponseBody());
NodeList nodes = doc.getElementsByTagName("Title");
for (int i = 0; i < nodes.getLength(); i++) {
Element element = (Element) nodes.item(i);
NodeList title = element.getElementsByTagName("Child tag where cdata present");
Element line = (Element) title.item(0);
System.out.println("Title: "+ getCharacterDataFromElement(line));
public static Document getDomElement(String xml) {
Document doc = null;
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setCoalescing(true);
dbf.setNamespaceAware(true);
try {
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(xml));
doc = db.parse(is);
} catch (Exception e) {
e.printStackTrace();
}
return doc;
}
public static String getCharacterDataFromElement(Element e) {
Node child = e.getFirstChild();
if (child instanceof CharacterData) {
CharacterData cd = (CharacterData) child;
return cd.getData();
}
return "";
}
Below is the sample XML file and the code to retrieve the XML embedded in the the CDATA within main xml.
<envelope>
<Header>
<id>123</id>
<name>abc</name>
</Header>
<payload>
<![CDATA[<?xml> <Document><validXML></validXML></Document>]]>
</payload>
</envelope>
Xpath to get the CDATA XML given in above example would be
/envelope/payload/text()
So, once you have the root Document of above xml, with the given Path you can fetch the xml embedded in the CDATA.
Below is the utility method for the same.
public String getSubDocument(Document rootDocument, String xPathString) throws Exception {
XPath xPath = XPathFactory.newInstance().newXPath();
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document rootDoc = builder.newDocument();
String xmlString = (String)xPath.compile(xPathString).evaluate(rootDocument, XPathConstants.String);
return xmlString;
}
}
© 2022 - 2024 — McMap. All rights reserved.