Deleting a node in a BK Tree
Asked Answered
L

1

6

I have seen many different implementations of BK Trees in many different languages, and literally none of them seem to include a way to remove nodes from the tree.

Even the original article where BK Trees were first introduced does not provide a meaningful insight about node deletion, as the authors merely suggest to mark the node to be deleted so that it is ignored:

The deletion of a key in Structures 1 [the BK Tree] and 2 follows a process similar to that above, with special consideration for the case in which the key to be deleted is the representative x° [root key]. In this case, the key cannot simply be deleted, as it is essential for the structure information. Instead an extra bit must be used for each key which denotes whether the key actually corresponds to a record or not. The search algorithm is modified correspondingly to ignore keys which do not correspond to records. This involves testing the extra bit in the Update procedure.

While it may be theoretically possible to properly delete a node in a BK Tree, is it possible to do so in linear/sublinear time?

Lundberg answered 14/2, 2017 at 15:55 Comment(0)
B
2

While it may be theoretically possible to properly delete a node in a BK Tree, is it possible to do so in linear/sublinear time?

If you want to physically remove it from a BK-Tree, then I can't think of a way to do this in a linear time for all cases. Consider 2 scenarios, when a node is removed. Please note that I do not account for a time complexity related to calculating the Levenshtein distance because that operation doesn't depend on the number of words, although it requires some processing time too.

Remove non-root node

  1. Find a parent of the node in the tree.
  2. Save node's child nodes.
  3. Nullify parent's reference to the node.
  4. Re-add each child node as if it were a new node.

Here, even if step 1 can be done in O(1), steps 2 and 4 are way more expensive. Inserting a single node is O(h), where h is a height of tree. To make matters worse, this has to be done for each child node of the original node, and so it will be O(k*h), where k is a number of child nodes.

Remove root node

  1. Rebuild the tree from scratch without using the previous root node.

Rebuilding a tree will be at least O(n) in the best case and O(h*n) otherwise.

Alternative solution

That's why it's better not to delete a node physically, but keep it in a tree and just mark it as deleted. This way it will be used, as before, for inserting new nodes, but will be excluded from suggestion results for a misspelled word. This can be done in O(1).

Buckler answered 14/4, 2021 at 22:25 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.