I am trying to check whether an xml file contains the necessary xml declaration ("header"), let's say:
<?xml version="1.0" encoding="UTF-8"?>
...rest of xml file...
I am using xml ElementTree for reading and getting info out of the file, but it seems to load a file just fine even if it does not have the header.
What I tried so far is this:
import xml.etree.ElementTree as ET
tree = ET.parse(someXmlFile)
try:
xmlFile = ET.tostring(tree.getroot(), encoding='utf8').decode('utf8')
except:
sys.stderr.write("Wrong xml2 header\n")
exit(31)
if re.match(r"^\s*<\?xml version=\'1\.0\' encoding=\'utf8\'\?>\s+", xmlFile) is None:
sys.stderr.write("Wrong xml1 header\n")
exit(31)
But the ET.tostring() function just "makes up" a header if it is not present in the file.
Is there any way to check for a xml header with ET? Or somehow throw an error while loading the file with ET.parse, if a file does not contain the xml header?