I wrote a quick SAX reader that tracks line numbers and throws an exception when it finds an unwanted attribute. The exception message includes the line number, the element name, and the text that preceded the error.
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.apache.log4j.Logger;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class LocatorTestSAXReader {

    private static final Logger logger = Logger.getLogger(LocatorTestSAXReader.class);
    private static final String XML_FILE_PATH = "lib/xml/test-instance1.xml";

    public Document readXMLFile() {
        Document doc = null;
        SAXParser parser = null;
        SAXParserFactory saxFactory = SAXParserFactory.newInstance();
        try {
            parser = saxFactory.newSAXParser();
            // Note: this document is created empty; the handler below never populates it.
            DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
            doc = docBuilder.newDocument();
        } catch (ParserConfigurationException e) {
            logger.error("could not configure the parser", e);
        } catch (SAXException e) {
            logger.error("could not create the SAX parser", e);
        }
        if (parser == null) {
            return doc; // no parser available, so there is nothing to parse with
        }

        // Accumulates the element text seen so far, for use in the error message.
        final StringBuilder text = new StringBuilder();

        DefaultHandler eleHandler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void characters(char[] ch, int start, int length) {
                String thisText = new String(ch, start, length);
                // Keep only chunks that contain at least one letter
                // (the original range A-z also matched punctuation such as '[' and '_').
                if (thisText.matches(".*[a-zA-Z]+.*")) {
                    text.append(thisText);
                    logger.debug("element text: " + thisText);
                }
            }

            // The parser calls this before any other event, handing over the Locator.
            @Override
            public void setDocumentLocator(Locator locator) {
                this.locator = locator;
            }

            @Override
            public void startElement(final String uri, final String localName, final String qName,
                                     final Attributes attributes)
                    throws SAXException {
                int lineNum = locator.getLineNumber();
                logger.debug("I am now on line " + lineNum + " at element " + qName);
                int len = attributes.getLength();
                for (int i = 0; i < len; i++) {
                    String attVal = attributes.getValue(i);
                    String attName = attributes.getQName(i);
                    logger.debug("att " + attName + "=" + attVal);
                    if (attName.startsWith("bad")) {
                        throw new SAXException("found attr : " + attName + "=" + attVal
                                + " that starts with bad! at line : " + lineNum
                                + " at element " + qName
                                + "\nelement occurs below text : " + text);
                    }
                }
            }
        };

        // try-with-resources closes the stream even when parsing fails.
        try (InputStream in = new FileInputStream(new File(XML_FILE_PATH))) {
            parser.parse(in, eleHandler);
        } catch (FileNotFoundException e) {
            logger.error("XML file not found: " + XML_FILE_PATH, e);
        } catch (SAXException e) {
            logger.error("error while parsing " + XML_FILE_PATH, e);
        } catch (IOException e) {
            logger.error("error while reading " + XML_FILE_PATH, e);
        }
        return doc;
    }
}
With regard to the text: depending on where in the XML file the error occurs, there may not be any text. So with this XML:
<?xml version="1.0"?>
<root>
    <section>
        <para>This is a quick doc to test the ability to get line numbers via the Locator object. </para>
    </section>
    <section bad:attr="ok">
        <para>another para.</para>
    </section>
</root>
If the bad attribute is on the first element, the text will be blank, because no character data has been read yet. For the file above, the exception thrown was:
org.xml.sax.SAXException: found attr : bad:attr=ok that starts with bad! at line : 6 at element section
element occurs below text : This is a quick doc to test the ability to get line numbers via the Locator object.
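As a variation, here is a minimal self-contained sketch that uses only the JDK (no log4j) and parses the same sample from a string. Instead of building the line number into the message, it throws a SAXParseException constructed from the Locator, so the caller can read the position with getLineNumber(). The names LocatorDemo, findBadAttributeLine and SAMPLE_XML are made up for this illustration, and it assumes the default, non-namespace-aware parser, which accepts the undeclared bad: prefix.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the answer's original code.
final class LocatorDemo {

    // Same document as the sample above, one element per line.
    static final String SAMPLE_XML =
            "<?xml version=\"1.0\"?>\n"
            + "<root>\n"
            + "<section>\n"
            + "<para>This is a quick doc to test the ability to get line numbers via the Locator object. </para>\n"
            + "</section>\n"
            + "<section bad:attr=\"ok\">\n"
            + "<para>another para.</para>\n"
            + "</section>\n"
            + "</root>\n";

    // Returns the line reported for the first element that carries an attribute
    // whose qualified name starts with "bad", or -1 if no such attribute exists.
    static int findBadAttributeLine(String xml) {
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator locator) {
                this.locator = locator;
            }

            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) throws SAXException {
                for (int i = 0; i < attributes.getLength(); i++) {
                    String attName = attributes.getQName(i);
                    if (attName.startsWith("bad")) {
                        // The exception copies line and column from the locator.
                        throw new SAXParseException(
                                "bad attribute " + attName + " on element " + qName, locator);
                    }
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return -1;
        } catch (SAXParseException e) {
            return e.getLineNumber();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("unexpected parser failure", e);
        }
    }

    public static void main(String[] args) {
        // Line 6 matches the exception shown above for the same document.
        System.out.println(findBadAttributeLine(SAMPLE_XML));
    }
}
```

The position reported in startElement is where the parser stands once it has read the start tag, so a tag that spans several lines would report its last line rather than its first.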
When you say you tried using the Locator object, what exactly was the problem?
XMLScanner class (in Batik) that may work, but I never got around to trying it. xmlgraphics.apache.org/batik/javadoc/org/apache/batik/xml/… – Hunger
< is illegal inside an element, you can then look backwards from that position for the start of the element, and then lex it from there. I don't know why they didn't simply put the character location in the element when it was parsed! – Viipuri