Difference in DocumentBuilder.parse when using JRE 1.5 and JDK 1.6

We recently switched our projects to Java 1.6. While running the tests I found that a SAXParseException that was thrown under 1.5 is no longer thrown under 1.6.

Below is my test code to demonstrate the problem.

import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;

import org.junit.Test;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;


/**
 * Test class to demonstrate the difference between JDK 1.5 to JDK 1.6.
 * 
 * Seen on Linux:
 * 
 * <pre>
 * #java version "1.6.0_18"
 * Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
 * Java HotSpot(TM) Server VM (build 16.0-b13, mixed mode)
 * </pre>
 * 
 * Seen on OSX:
 * 
 * <pre>
 * java version "1.6.0_17"
 * Java(TM) SE Runtime Environment (build 1.6.0_17-b04-248-10M3025)
 * Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01-101, mixed mode)
 * </pre>
 * 
 * @author dhiller (creator)
 * @author $Author$ (last editor)
 * @version $Revision$
 * @since 12.03.2010 11:32:31
 */
public class TestXMLValidation {

  /**
   * Tests the schema validation of an XML against a simple schema.
   * 
   * @throws Exception
   *           If an error occurs
   * @throws junit.framework.AssertionFailedError
   *           If a unit test assertion fails
   */
  @Test(expected = SAXParseException.class)
  public void testValidate() throws Exception {
    final StreamSource schema = new StreamSource( new StringReader( "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
      + "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" "
      + "elementFormDefault=\"qualified\" xmlns:xsd=\"undefined\">" + "<xs:element name=\"Test\"/>" + "</xs:schema>" ) );
    final String xml = "<Test42/>";
    final DocumentBuilderFactory newFactory = DocumentBuilderFactory.newInstance();
    newFactory.setSchema( SchemaFactory.newInstance( "http://www.w3.org/2001/XMLSchema" ).newSchema( schema ) );
    final DocumentBuilder documentBuilder = newFactory.newDocumentBuilder();
    documentBuilder.parse( new InputSource( new StringReader( xml ) ) );
  }

}

On a 1.5 JVM the test passes; on 1.6 it fails with "Expected exception SAXParseException".

The Javadoc of the DocumentBuilderFactory.setSchema(Schema) method says:

When errors are found by the validator, the parser is responsible to report them to the user-specified ErrorHandler (or if the error handler is not set, ignore them or throw them), just like any other errors found by the parser itself. In other words, if the user-specified ErrorHandler is set, it must receive those errors, and if not, they must be treated according to the implementation specific default error handling rules.

The Javadoc of the DocumentBuilder.parse(InputSource) method says:

BTW: I tried setting an error handler via setErrorHandler, but there still is no exception.

Now my question:

What changed in 1.6 that prevents the schema validation from throwing a SAXParseException? Is it related to the schema or to the XML that I tried to parse?

Update:

The following code works on both 1.5 and 1.6 the way I want:

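  // In addition to the imports above, this version needs javax.xml.transform.Source,
  // javax.xml.validation.Schema and javax.xml.validation.Validator.
  @Test(expected = SAXParseException.class)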
  public void testValidate() throws Exception {
    final StreamSource schema = new StreamSource( new StringReader( "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
      + "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" "
      + "elementFormDefault=\"qualified\" xmlns:xsd=\"undefined\">" + "<xs:element name=\"Test\"/>" + "</xs:schema>" ) );
    final String xml = "<Test42/>";
    final DocumentBuilderFactory newFactory = DocumentBuilderFactory.newInstance();
    final Schema newSchema = SchemaFactory.newInstance( "http://www.w3.org/2001/XMLSchema" ).newSchema( schema );
    newFactory.setSchema( newSchema );
    final Validator newValidator = newSchema.newValidator();
    final Source is = new StreamSource( new StringReader( xml ) );
    try {
      newValidator.validate( is );
    }
    catch ( Exception e ) {
      e.printStackTrace();
      throw e;
    }
    final DocumentBuilder documentBuilder = newFactory.newDocumentBuilder();
    documentBuilder.parse( new InputSource( new StringReader( xml ) ) );
  }

The solution seems to be to explicitly use a Validator instance created from the Schema instance. I've found the solution here

Still I'm not sure why that is...
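
One documented detail that may bear on this: the Javadoc of Validator.setErrorHandler states that when no ErrorHandler is set, the validator behaves as if error and fatalError rethrow the exception and warnings are ignored. The sketch below isolates just the Validator step to show that default. The class name ValidatorDefaultDemo and the helper isValid are illustrative names, not part of the original test, and the schema omits the unused xmlns:xsd declaration.

```java
import java.io.StringReader;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

import org.xml.sax.SAXParseException;

class ValidatorDefaultDemo {

  // Same shape as the schema in the question: one global element named "Test".
  static final String XSD =
      "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" elementFormDefault=\"qualified\">"
      + "<xs:element name=\"Test\"/>"
      + "</xs:schema>";

  /** Hypothetical helper: true if the XML validates, false if the Validator throws. */
  static boolean isValid(String xml) throws Exception {
    Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
        .newSchema(new StreamSource(new StringReader(XSD)));
    // No ErrorHandler is set on purpose, so the documented default applies.
    Validator validator = schema.newValidator();
    try {
      validator.validate(new StreamSource(new StringReader(xml)));
      return true;
    } catch (SAXParseException e) {
      return false;
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("<Test/>   valid: " + isValid("<Test/>"));
    System.out.println("<Test42/> valid: " + isValid("<Test42/>"));
  }
}
```

With this schema, <Test/> should be accepted and <Test42/> rejected, without any error handler being configured.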

Pyrogallate answered 12/3, 2010 at 12:2 Comment(0)

Apparently, a document not complying with the schema only merits a mild rebuke on stderr from the default error handler. My solution was to replace the default error handler with a stricter one:

// builder is my DocumentBuilder
// Needs org.xml.sax.ErrorHandler, org.xml.sax.SAXException and org.xml.sax.SAXParseException.
builder.setErrorHandler(new ErrorHandler() {
    @Override
    public void error(SAXParseException e) throws SAXException {
        throw e; // validation errors arrive here
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        throw e;
    }

    @Override
    public void warning(SAXParseException e) throws SAXException {
        throw e; // strict: even warnings abort the parse
    }
});
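
For completeness, here is a self-contained sketch of how that handler might be wired into a schema-aware DocumentBuilder. It is only an illustration under some assumptions: the class name StrictParseDemo and the helpers newStrictBuilder and parses are made up for this example, the schema is a trimmed copy of the one in the question, and setNamespaceAware(true) is my addition because schema validation is normally run namespace-aware.

```java
import java.io.StringReader;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;

import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class StrictParseDemo {

  // Trimmed copy of the schema from the question: one global element named "Test".
  static final String XSD =
      "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" elementFormDefault=\"qualified\">"
      + "<xs:element name=\"Test\"/>"
      + "</xs:schema>";

  /** Hypothetical helper: builds a schema-aware DocumentBuilder with the strict handler. */
  static DocumentBuilder newStrictBuilder() throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(true); // assumption: not set in the original test
    factory.setSchema(SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
        .newSchema(new StreamSource(new StringReader(XSD))));
    DocumentBuilder builder = factory.newDocumentBuilder();
    // Rethrow everything the parser or validator reports.
    builder.setErrorHandler(new ErrorHandler() {
      @Override
      public void warning(SAXParseException e) throws SAXException {
        throw e;
      }

      @Override
      public void error(SAXParseException e) throws SAXException {
        throw e;
      }

      @Override
      public void fatalError(SAXParseException e) throws SAXException {
        throw e;
      }
    });
    return builder;
  }

  /** Hypothetical helper: true if parsing succeeds, false if a SAXParseException is thrown. */
  static boolean parses(String xml) throws Exception {
    try {
      newStrictBuilder().parse(new InputSource(new StringReader(xml)));
      return true;
    } catch (SAXParseException e) {
      return false;
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("<Test/>   parses: " + parses("<Test/>"));
    System.out.println("<Test42/> parses: " + parses("<Test42/>"));
  }
}
```

With the strict handler in place, the schema violation in <Test42/> should surface as a SAXParseException from parse instead of a message on stderr.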
Afroasiatic answered 31/5, 2011 at 21:8 Comment(0)
