DOMDocument: Ignore Duplicate Element IDs

I'm putting some page content (which has been run through Tidy, but doesn't need to be if this is a source of problems) into DOMDocument using DOMDocument::loadHTML.

It's coming up with various errors:

ID x already defined in Entity, line X

Is there any way to make either DOMDocument (or Tidy) ignore or strip out duplicate element IDs, so it will actually create the DOMDocument?

Thanks. :)

Marillin answered 6/1, 2009 at 9:30

A quick search on the subject turns up this (invalid) bug report:

http://bugs.php.net/bug.php?id=46136

The last reply states the following:

You're using HTML 4 rules to load an XHTML document. Either use the load() method to parse as XML or the libxml_use_internal_errors() function to ignore the warnings.

I can't be sure you are encountering this problem for the same reason, since you did not include the HTML page being loaded. In any case, using libxml_use_internal_errors() should at least suppress the warnings.
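A minimal sketch of that approach follows. The `$html` string is a hypothetical stand-in for the real page content, and the variable names are only illustrative. As far as I know, the duplicate-ID message is a recoverable parser warning, so the document should still be built; the sketch simply keeps the warnings out of the output and lets you inspect them.

```php
<?php
// Hypothetical input with a duplicated id, standing in for the real page.
$html = '<html><body><div id="a">one</div><div id="a">two</div></body></html>';

// Collect libxml errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$loaded = $doc->loadHTML($html);

// Grab whatever libxml complained about, then empty its error buffer.
$errors = libxml_get_errors();
libxml_clear_errors();

// Restore the previous error-handling mode.
libxml_use_internal_errors($previous);

// Optional: report the collected messages yourself.
foreach ($errors as $error) {
    echo trim($error->message), ' (line ', $error->line, ")\n";
}

// Both elements should be present in the tree despite the shared id.
$divCount = $doc->getElementsByTagName('div')->length;
```

Restoring the previous mode afterwards keeps the change local to this one load, so other code relying on normal warnings is not affected.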

IDs in HTML documents are required to be unique, so the best solution is still to fix and validate your document, if at all possible.

Aeolipile answered 6/1, 2009 at 9:40

By definition, IDs are unique. If yours are not, you should use classes instead (or names, where applicable).
I doubt you can force XML tools to ignore duplicate IDs, since that would mean making them accept an invalid XML document.

Vogeley answered 6/1, 2009 at 9:40

Use exceptions to handle duplicate IDs, and rename the second ID. Or perhaps combine the elements that share an ID as sub-elements of a single parent that carries the ID.
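One way to sketch the renaming idea is shown below. Note that it does not use exceptions: it loads the markup with warnings suppressed and then walks the tree with DOMXPath. `renameDuplicateIds()` is a hypothetical helper, not part of DOMDocument, and the `-2`, `-3` suffix scheme is an arbitrary choice for illustration.

```php
<?php
// Hypothetical helper: gives every repeated id a unique suffix, in place.
// Returns the list of newly assigned ids.
function renameDuplicateIds(DOMDocument $doc): array
{
    $seen = [];
    $renamed = [];
    $xpath = new DOMXPath($doc);

    foreach ($xpath->query('//*[@id]') as $element) {
        $id = $element->getAttribute('id');
        if (!isset($seen[$id])) {
            // First occurrence keeps its id.
            $seen[$id] = 1;
            continue;
        }
        // Bump the counter until the candidate id is unused.
        do {
            $seen[$id]++;
            $candidate = $id . '-' . $seen[$id];
        } while (isset($seen[$candidate]));

        $seen[$candidate] = 1;
        $element->setAttribute('id', $candidate);
        $renamed[] = $candidate;
    }
    return $renamed;
}

// Usage with a made-up document containing three elements that share an id.
$previous = libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML('<html><body><p id="x">1</p><p id="x">2</p><p id="x">3</p></body></html>');
libxml_clear_errors();
libxml_use_internal_errors($previous);

$renamed = renameDuplicateIds($doc);
```

Keep in mind that any CSS selectors, scripts, or in-page links pointing at the original ID will still only match the first element, so this cleans up the tree rather than fixing the source markup.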

IDs must be unique within an XML document (that is, within the whole tree under its root element).

Jeremiahjeremias answered 6/1, 2009 at 10:49
