I'm using the following code to create a DOMDocument and validate it against an external xsd file.
<?php
$xmlPath = "/xml/some/file.xml";
$xsdPath = "/xsd/some/schema.xsd";
$doc = new \DOMDocument();
$doc->loadXML(file_get_contents($xmlPath), LIBXML_NOBLANKS);
if (!$doc>schemaValidate($xsdPath)) {
throw new InvalidXmlFileException();
}
Update 2 (rewritten question)
This works fine, meaning that if the XML doesn't match the definitions of XSD it will throw a meaningful exception.
Now, I want to retrieve information from the DOMDocument using Xpath. It works fine aswell, however, from this point on the DOMDocument is completely detached from the XSD! For example, if I have a DOMNode I cannot know whether it is of type simpleType or type complexType. I can check whether the node has child (hasChild()) nodes, but this is not the same. Also, there is tons of information more in the XSD (like, min and max number of occurrence, etc).
The question really is, do I have to query the XSD myself or is there a programmatic way of asking those kind of questions. I.e. is this DOMNode a complex or simple type?
In another post it was suggested "to process the schema using a real schema processor, and then use its API to ask questions about the contents of the schema". Does XPath has an API to retrieve information of the XSD or is there a different convenient way with DOMDocument?
For the record, the original question
Now, I wanted to proceed to parse information from the DOMDocument using XPath. To increase the integrity of my data I'm storing to a database and giving meaningful error message to the client I wanted to constantly use the schema information to validate the queries. I.e. I wanted to validate fetched childNodes against allowed child nodes defined in the xsd. I wanted to that by using XPath on the xsd document.
However, I sumbled across this post. It basically sais it is a kind of kirky way to that yourself and you should rather use a real schema processor and use its API to make the queries. If I understand that right, I'm using a real schema processor with schemaValidate
, but what is meant by using its API?
I kind of guessed already I'm not using the schema in a correct way, but I have no idea how to research a proper usage.
The question
If I use schemaValidate
on the DOMDocument is that a one-time validation (true or false) or is it tied to the DOMDocument for longer then? Precisely, can I use the validation also for adding nodes somehow or can I use it to select nodes I'm interested in as suggested by the referenced SO post?
Update
The question was rated unclear, so I want to try again. Say I would like to add a node or edit a node value. Can I use the schema provided in the xsd so that I can validate the user input? Originally, in order to do that I wanted to query the xsd manually with another XPath instance to get the specs for a certain node. But as suggested in the linked article this is not best practice. So the question would be, does the DOM lib offer any API to make such a validation?
Maybe I'm overthinking it. Maybe I just add the node and run the validation again and see where/why it breaks? In that case, the answer of the custom error handling would be correct. Can you confirm?