If you landed on this question because you want to find and remove elements with ElementTree while
- using the built-in xml module (not lxml)
- being as flexible as ElementTree.findall (which supports a subset of XPath)
- referencing the elements to be deleted directly, not their parents
- working at any nesting level
- working even if found elements are nested inside other found elements
then this function may help. It builds a map from each element to its parent and uses that map to remove every match.
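import itertools
from xml.etree import ElementTree

def deleteall(root: ElementTree.Element, match, namespaces=None):
    # ElementTree elements keep no reference to their parent, so map child -> parent first.
    parent_by_child = dict(itertools.chain.from_iterable(
        ((child, element) for child in element) for element in root.iter()))

    # findall returns a list, so removing elements while looping over it is safe.
    for element in root.findall(match, namespaces):
        parent_by_child[element].remove(element)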
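As a quick check of the nested case, here is a minimal usage sketch. The sample document and its tag names (`item`, `group`, `keep`) are made up for illustration, and the function is repeated so the snippet runs on its own.

```python
import itertools
from xml.etree import ElementTree


def deleteall(root, match, namespaces=None):
    # Same function as above, repeated so this example is self-contained.
    parent_by_child = dict(itertools.chain.from_iterable(
        ((child, element) for child in element) for element in root.iter()))
    for element in root.findall(match, namespaces):
        parent_by_child[element].remove(element)


# Hypothetical sample: <item> occurs at several depths,
# and one <item> is nested inside another <item>.
root = ElementTree.fromstring(
    "<root>"
    "<item><item/></item>"
    "<group><item/><keep/></group>"
    "<keep/>"
    "</root>"
)

deleteall(root, ".//item")

remaining = [element.tag for element in root.iter()]
print(remaining)  # ['root', 'group', 'keep', 'keep']
```

The inner `<item>` is still removed from its (already detached) parent without an error, because the map was built before anything was deleted.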
Additional checks, as required in the original post, can be done by a Callable passed as an additional argument:
import itertools
from typing import Callable
from xml.etree import ElementTree

def deleteall(
    root: ElementTree.Element,
    match,
    namespaces=None,
    deletion_criteria: Callable[[ElementTree.Element], bool] = lambda x: True
):
    # ElementTree elements keep no reference to their parent, so map child -> parent first.
    parent_by_child = dict(itertools.chain.from_iterable(
        ((child, element) for child in element) for element in root.iter()))

    for element in root.findall(match, namespaces):
        # Remove only the matches that the caller's predicate accepts.
        if deletion_criteria(element):
            parent_by_child[element].remove(element)
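For illustration, a predicate might filter on an attribute. This is only a sketch: the `status` attribute and its values are invented for the example and are not taken from the original post.

```python
import itertools
from typing import Callable
from xml.etree import ElementTree


def deleteall(
    root: ElementTree.Element,
    match,
    namespaces=None,
    deletion_criteria: Callable[[ElementTree.Element], bool] = lambda x: True,
):
    # Same function as the second version above, repeated so this runs on its own.
    parent_by_child = dict(itertools.chain.from_iterable(
        ((child, element) for child in element) for element in root.iter()))
    for element in root.findall(match, namespaces):
        if deletion_criteria(element):
            parent_by_child[element].remove(element)


# Hypothetical sample data with an invented "status" attribute.
root = ElementTree.fromstring(
    '<root>'
    '<item status="obsolete"/>'
    '<group><item status="active"/><item status="obsolete"/></group>'
    '</root>'
)

# Delete only the items flagged as obsolete.
deleteall(
    root,
    ".//item",
    deletion_criteria=lambda element: element.get("status") == "obsolete",
)

statuses = [item.get("status") for item in root.iter("item")]
print(statuses)  # ['active']
```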
Further extensions, such as passing both the element and its parent to the deletion criteria, are possible.
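One way such an extension could look is sketched below. The name `deleteall_with_parent` and the two-argument predicate are assumptions made for this example, not part of the functions above; the sample tags are invented as well.

```python
from typing import Callable
from xml.etree import ElementTree


def deleteall_with_parent(
    root: ElementTree.Element,
    match,
    namespaces=None,
    # Hypothetical signature: the predicate receives (element, parent).
    deletion_criteria: Callable[
        [ElementTree.Element, ElementTree.Element], bool
    ] = lambda element, parent: True,
):
    # Map every child to its parent before anything is removed.
    parent_by_child = {
        child: parent for parent in root.iter() for child in parent
    }
    for element in root.findall(match, namespaces):
        parent = parent_by_child[element]
        if deletion_criteria(element, parent):
            parent.remove(element)


# Hypothetical sample: delete <item> only when it sits directly inside <trash>.
root = ElementTree.fromstring(
    "<root><item/><trash><item/></trash></root>"
)

deleteall_with_parent(
    root,
    ".//item",
    deletion_criteria=lambda element, parent: parent.tag == "trash",
)

remaining = [element.tag for element in root.iter()]
print(remaining)  # ['root', 'item', 'trash']
```

The dict comprehension builds the same child-to-parent map as the `itertools.chain` expression; either form works here.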