XML without namespace: validate against one of several XSDs

We receive a bunch of XML files on a regular basis. We have no control over them, they carry no namespace information, and we would really like to avoid changing them.

We have an XSD that we need to validate the XML files against, and it works when it is applied explicitly in code. Now we would like to hint to a SAX parser that this particular XML dialect should be validated against this XSD (which we have on the file system). The only way I have found is to put a noNamespaceSchemaLocation attribute in the XML file, which we would really like to avoid.

Suggestions? Will an EntityResolver always be called with a null/empty namespace?

(a functional solution will give 500 bonus points when I am allowed to)

Apodaca answered 16/9, 2015 at 15:3 Comment(2)
In the title you speak of several XSDs, but is it really a single fixed XSD that should be used to validate the incoming files? Destructionist
I currently have two types, but only one is important. Neither has namespaces, but the interesting one has a <NewsML> root tag, for which I have an XSD. Magistrate

Using javax.xml.validation you can specify the XSD schema that should be used to validate an XML document, without the document having to reference it:

import java.io.File;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

...
SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema(new File("<path to the xsd>"));

SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setValidating(false);
spf.setSchema(schema);

XMLReader xmlReader = spf.newSAXParser().getXMLReader();
xmlReader.setContentHandler(...);
xmlReader.parse(new InputSource(...)); // will validate against the schema

Note that setValidating() only controls DTD validation as defined by the W3C; passing false turns it off. The call is not strictly necessary, since false is the default.
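One thing the snippet above leaves open is how validation problems surface. Per the SAX documentation, non-fatal errors are silently ignored when no ErrorHandler is registered on the XMLReader, so a schema violation could pass unnoticed. The following is a minimal, self-contained sketch of the same approach that collects validation errors. The class name NoNamespaceValidation, the validate helper, and the tiny inline schema are hypothetical stand-ins (the real XSD would be loaded from the file system as shown above), and the setNamespaceAware(true) call is an assumption on my part rather than something the original code does.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class NoNamespaceValidation {

    // Hypothetical stand-in schema without a targetNamespace;
    // in practice the XSD would come from a file.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='NewsML'><xs:complexType><xs:sequence>"
      + "<xs:element name='Title' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Parses the given XML with the schema attached to the parser factory
    // and returns the messages of all validation errors that were reported.
    static List<String> validate(String xml) throws Exception {
        SchemaFactory schemaFactory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(XSD)));

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // assumption: not part of the original snippet
        spf.setSchema(schema);       // the document itself needs no schema hint

        List<String> problems = new ArrayList<>();
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler());
        reader.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                // warnings are not treated as validation failures here
            }

            @Override
            public void error(SAXParseException e) {
                problems.add(e.getMessage()); // schema violations arrive here
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e; // the document is not well-formed
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(validate("<NewsML><Title>ok</Title></NewsML>"));
        System.out.println(validate("<NewsML><Wrong/></NewsML>"));
    }
}
```

Collecting the messages in a list keeps the parse going so that all violations are seen in one pass; throwing from error() instead would stop at the first one.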

Destructionist answered 16/9, 2015 at 16:45 Comment(0)
