How to speed up XML DTD validation with PHP?

I am validating my XML against a DTD file that I have locally.

For that, I am doing:

$xml                = $dmsMerrin.'/xml/'.$id.'/conversion.xml';
$dtd                = $dmsMerrin.'/style_files/journalpublishing.dtd';

$dom = new DOMDocument();
@$dom->load($xml);

libxml_use_internal_errors(true);

if (@$dom->validate()) {
    $htmlDTDError .= "<h2>No Errors Found - The tested file is Valid !</h2>";
} 
else {
    $errors = libxml_get_errors();
    $htmlDTDError .= '<h2>Errors Found ('.count($errors).')</h2><ol>';

    foreach ($errors as $error) {
        $htmlDTDError .= '<li>'.$error->message.' on line '.$error->line. '</li>';
    }

    $htmlDTDError .= '</ol>';
    libxml_clear_errors();
}

libxml_use_internal_errors(false);

And this takes about 30 seconds for an XML file with 1600 lines.

Is this a usual time? I would expect it to be much faster.

As you can see, the DTD I am using is stored locally on the server.

Any ideas? Thank you.

EDIT: By debugging and checking the execution time, I noticed that it takes the same time whether my XML has 1600 lines or 150 lines, so the problem is not the XML size.

Berners answered 18/2, 2014 at 16:29 Comment(3)
I can't see where you're actually using your DTD? - Ranie
Ah yes, I am not using it in fact; the DTD is set up in the XML file itself. Should I use it in the PHP code? How? - Berners
"As you can see, the DTD I am using is locally on the server." I can't see that from your example. Is $dmsMerrin a file:/ URL? - Sices

"And this takes about 30 seconds for an XML file with 1600 lines."

That's an unusually long time, and it's likely due to misconfiguration.

"By debugging and checking the execution time, I noticed that it takes the same time whether my XML has 1600 lines or 150 lines, so the problem is not the XML size."

For a tool that may provide more diagnostics here, try xmllint --valid. It will show, for example, errors for any DTDs that could not be retrieved.

It's very likely that the extra time is due to fetching resources, such as the DTD, needed to perform validation.
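To check where the time actually goes, the parse step and the validation step can be timed separately. The following is only a minimal sketch: the timed() helper and the small stand-in document are hypothetical, and the document uses an internal DTD subset so that nothing external is fetched. With a real file, the step that dominates should hint at whether resource fetching is the cause.

```php
<?php
// Hypothetical helper: run a callable and return [result, seconds elapsed].
function timed(callable $step): array
{
    $start  = microtime(true);
    $result = $step();
    return [$result, microtime(true) - $start];
}

// Stand-in document (an assumption, not the asker's file). The DTD is an
// internal subset, so no external resource has to be retrieved.
$xml = <<<XML
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
]>
<note>hello</note>
XML;

libxml_use_internal_errors(true);
$dom = new DOMDocument();

// Time parsing and validation as two separate steps.
[$loaded, $loadSeconds] = timed(function () use ($dom, $xml) {
    return $dom->loadXML($xml);
});
[$isValid, $validateSeconds] = timed(function () use ($dom) {
    return $dom->validate();
});

printf("load: %.4f s, validate: %.4f s\n", $loadSeconds, $validateSeconds);

libxml_clear_errors();
libxml_use_internal_errors(false);
```

If the timings stay roughly constant regardless of document size, as described in the question, that is consistent with a fixed cost such as a slow fetch rather than with the validation work itself.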

For one of your files, confirm that the URL of the DTD can be retrieved quickly by testing with a tool like curl from the same server. Is it a complex DTD? Does it bring in other files? Especially, make sure that it never refers to resources that would have to be fetched from the web, or with hostnames where DNS resolves slowly.
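From within PHP, one way to see which external resources libxml asks for is libxml_set_external_entity_loader() (available since PHP 5.4). The sketch below is an illustration under stated assumptions: the sample document, the example.com URL, the stub DTD and the $requested log are all hypothetical and do not come from the question. The stub DTD is served from memory so the sketch never touches the network.

```php
<?php
// Hypothetical sample: the DOCTYPE points at a remote-looking URL, the way
// a DOCTYPE declared inside an XML file might.
$xml = <<<XML
<!DOCTYPE article PUBLIC "-//EXAMPLE//DTD Article//EN" "http://example.com/article.dtd">
<article>text</article>
XML;

// Stub DTD (an assumption) returned instead of fetching the real one.
$stubDtd = '<!ELEMENT article (#PCDATA)>';

$requested = []; // every external resource libxml asks for

libxml_set_external_entity_loader(
    function ($publicId, $systemId, $context) use (&$requested, $stubDtd) {
        $requested[] = ['public' => $publicId, 'system' => $systemId];
        // To diagnose a real document, return $systemId here instead, so
        // that libxml opens the resource itself.
        $stream = fopen('php://temp', 'r+');
        fwrite($stream, $stubDtd);
        rewind($stream);
        return $stream;
    }
);

libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadXML($xml);

// Time validation, which is the step that needs the DTD.
$start   = microtime(true);
$isValid = $dom->validate();
$elapsed = microtime(true) - $start;

foreach ($requested as $entry) {
    printf("requested: %s\n", $entry['system']);
}
printf("valid: %s, validate() took %.4f s\n", $isValid ? 'yes' : 'no', $elapsed);

libxml_clear_errors();
libxml_use_internal_errors(false);
libxml_set_external_entity_loader(null); // restore the default loader
```

Any http:// or https:// system identifier that shows up in the log is a candidate for the delay, and can then be checked with curl from the same server as suggested above.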

Sices answered 27/2, 2014 at 4:9 Comment(0)
