I have a huge XML file (about 1 GB). I want to move some of its elements (entries) to another file with the same header and specifications.
Let's say the original file contains this entry with the tag <to_move>:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE some SYSTEM "some.dtd">
<some>
...
<to_move date="somedate">
<child>some text</child>
...
...
</to_move>
...
</some>
I use lxml.etree.iterparse to iterate through the file, which works fine. When I find an element with the tag <to_move> (let's assume it is stored in the variable element), I do
new_file.write(etree.tostring(element))
But this results in
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE some SYSTEM "some.dtd">
<some>
...
<to_move xmlns:="some" date="somedate"> # <---- Here is the problem. I don't want the namespace.
<child>some text</child>
...
...
</to_move>
...
</some>
So the question is: how do I tell etree.tostring() not to write the xmlns:="some" attribute? Is this possible? I struggled with the API documentation of lxml.etree, but I couldn't find a satisfying answer.
This is what I found for etree.tostring:
tostring(element_or_tree, encoding=None, method="xml",
xml_declaration=None, pretty_print=False, with_tail=True,
standalone=None, doctype=None, exclusive=False, with_comments=True)
Serialize an element to an encoded string representation of its XML tree.
As far as I can tell, none of the parameters of tostring() helps here. Any suggestions or corrections?
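For reference, here is a minimal sketch of the loop described above, so the setup is easy to reproduce. It is not my real code: the in-memory sample, the HEADER constant, the <keep> element and the move_elements helper are made up for illustration, and the real input is a file on disk. The sketch falls back to the standard library's ElementTree if lxml is not installed, on the assumption that iterparse() and tostring() behave alike for this simple case. It does not reproduce or remove the unwanted xmlns attribute; it only shows how the elements are copied.

```python
import io

try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption: the standard library offers the same iterparse()/tostring()
    # calls used below, so the sketch still runs without lxml.
    import xml.etree.ElementTree as etree

# Tiny stand-in for the 1 GB input; the content is made up for illustration.
SOURCE = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    b'<!DOCTYPE some SYSTEM "some.dtd">\n'
    b'<some>\n'
    b'<keep>stays</keep>\n'
    b'<to_move date="somedate">\n'
    b'<child>some text</child>\n'
    b'</to_move>\n'
    b'</some>\n'
)

# Header and root start tag that the new file should share with the original.
HEADER = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    b'<!DOCTYPE some SYSTEM "some.dtd">\n'
    b'<some>\n'
)


def move_elements(source, new_file, tag="to_move"):
    """Copy every <tag> element from source into new_file; return the count."""
    moved = 0
    new_file.write(HEADER)
    # "end" events fire once an element and all of its children are parsed.
    for _event, element in etree.iterparse(source, events=("end",)):
        if element.tag == tag:
            # Same call as in the question; the result includes the tail text.
            new_file.write(etree.tostring(element))
            moved += 1
            element.clear()  # release the subtree that was just copied
    new_file.write(b"</some>\n")
    return moved


new_file = io.BytesIO()
count = move_elements(io.BytesIO(SOURCE), new_file)
result = new_file.getvalue()
```

With the real data, source and new_file would be the two files opened in binary mode instead of the BytesIO objects used here.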