I ran into this problem the other day. It turns out the characters method is called multiple times when the value contains any of these escaped characters (entity on the left, the character it stands for on the right):
&quot; "
&apos; '
&lt; <
&gt; >
&amp; &
Also be careful about line breaks / newlines within the value!
If the XML is line-wrapped outside your control, the characters method will also be called for each line of the value, and it will hand you the line break as well (which you then need to strip out manually).
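If you want to see the splitting for yourself, here is a minimal sketch that records every characters call separately. The class name CharactersChunkDemo and the helper chunks are made up for illustration, and it assumes the parser that ships with the JDK. How many pieces you get, and where the cuts fall, is up to the parser, so the only thing you can rely on is that the pieces joined together give the full text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of any library.
class CharactersChunkDemo {

    // Returns one list entry per characters() call made by the parser.
    static List<String> chunks(String xml) {
        final List<String> chunks = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                chunks.add(new String(ch, start, length));
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Print each piece in brackets, with newlines made visible.
        for (String chunk : chunks("<myfield>a &amp; b\nc</myfield>")) {
            System.out.println("[" + chunk.replace("\n", "\\n") + "]");
        }
    }
}
```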
A sample handler that takes care of all these problems:
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

DefaultHandler handler = new DefaultHandler() {
    private boolean isInMyTag = false;
    private String localname;
    private StringBuilder elementContent;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        if (qName.equalsIgnoreCase("myfield")) {
            isInMyTag = true;
            this.localname = localName;
            // start with an empty buffer for every new element
            this.elementContent = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] buffer, int start, int length) {
        if (isInMyTag) {
            // this may be only one piece of the value, so append instead of assigning
            String content = new String(buffer, start, length);
            if (content.startsWith("\n")) {
                // remove leading newline
                elementContent.append(content.substring(1));
            } else {
                elementContent.append(content);
            }
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (qName.equalsIgnoreCase("myfield")) {
            isInMyTag = false;
            // the value is complete only here; do something with elementContent.toString()
            System.out.println(elementContent.toString());
            this.localname = "";
        }
    }
};
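To try the idea end to end, here is a self-contained sketch that wires a handler like this into a SAXParser and collects the values into a list. MyFieldReader and readMyFields are names I made up for illustration. As a variation, it removes the line breaks once in endElement instead of per piece. Whether you drop them or replace them with a space depends on your data, so treat that part as an assumption.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any library.
class MyFieldReader {

    // Returns the text of every <myfield> element, with line breaks removed.
    static List<String> readMyFields(String xml) {
        final List<String> values = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            // non-null only while the parser is inside a <myfield> element
            private StringBuilder elementContent;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                if (qName.equalsIgnoreCase("myfield")) {
                    elementContent = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (elementContent != null) {
                    // collect every piece; the value is complete only in endElement
                    elementContent.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equalsIgnoreCase("myfield")) {
                    // assumption: line breaks inside the value are unwanted, so drop them
                    values.add(elementContent.toString().replace("\n", ""));
                    elementContent = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return values;
    }

    public static void main(String[] args) {
        String xml = "<root><myfield>Tom &amp; Jerry\n&lt;cat&gt;</myfield></root>";
        System.out.println(readMyFields(xml));
    }
}
```

Resetting elementContent to null at the end of the element does the same job as the boolean flag, with one field less to keep in sync.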