What is the difference between trie and radix trie data structures?
Asked Answered
L

4

136

Are the trie and radix trie data structures the same thing?

If they aren't the same, then what is the meaning of radix trie (AKA Patricia trie)?

Lappet answered 5/2, 2013 at 12:58 Comment(3)
Am I the only one who finds it a bit annoying that the tag is radix-tree rather than radix-trie? There are quite a few questions tagged with it, moreover.Lowndes
@Lowndes Wikipedia titles the radix trie article as Radix tree. Moreover, the term "Radix tree" is widely used in the literature. If anything calling tries "prefix trees" would make more sense to me. After all, they are all tree data structures.Duplet
Also: "What is the meaning of radix trie (AKA Patricia trie)?" this assumes radix trees and PATRICIA trees are one and the same thing, but they are not (e.g. see this answer). PATRICIA trees are trees that you get from running the PATRICIA algorithm (also FYI PATRICIA is an acronym, which stands for "Practical Algorithm To Retrieve Information Coded in Alphanumeric"). The resulting trees can be understood as radix trees with radix = 2, meaning that you traverse the tree by looking up log2(radix)=1 bits of the input string at a time.Duplet
R
171

A radix tree is a compressed version of a trie. In a trie, on each edge you write a single letter, while in a PATRICIA tree (or radix tree) you store whole words.

Now, assume you have the words hello, hat and have. To store them in a trie, it would look like:

    e - l - l - o
  /
h - a - t
      \
       v - e

And you need nine nodes. I have placed the letters in the nodes, but in fact they label the edges.

In a radix tree, you will have:

            *
           /
        (ello)
         /
* - h - * -(a) - * - (t) - *
                 \
                 (ve)
                   \
                    *

and you need only five nodes. In the picture above nodes are the asterisks.

So, overall, a radix tree takes less memory, but it is harder to implement. Otherwise the use case of both is pretty much the same.

Retrorse answered 5/2, 2013 at 13:45 Comment(14)
Thanks...Can you provide me with a good resource to study trie DS ... That would be of great help ...Lappet
I believe only thing I used when I first implemented Trie was the wikipedia article. I am not saying it is perfect but it is good enough.Retrorse
can i say that searching in TRIE is faster than Radix tree? Because in TRIE if you wan to search the next char you need to see the ith index in the child array of the current node but in radix tree you need search for all the child nodes sequentially. See the implementation code.google.com/p/radixtree/source/browse/trunk/RadixTree/src/…Tarah
Actually in a radix tree you can't have more than a single edge starting with the same letter so you can use the same constant indexing.Retrorse
Suppose we have a forth word, hatchet, how would it fit in the upper structures?Disrespectable
@Disrespectable There would probably be a node for the NUL terminating '\0' byte, meaning hat is really h - a - t - \0 and before the \0 there would be another branch for c - h - e - t - \0Plaice
@IvayloStrandjev I would recommend taking it with a grain of salt; either the downvoter couldn't be bothered leaving constructive criticism or they just plain didn't like your answer (e.g. it offended their sensitive souls). It's not a requirement that they leave constructive criticism, after all. Though it is a requirement that you know When should I comment? so you can avoid your comments being flagged for deletion, as technically this kind of comment serves no use for future visitors. Nonetheless, if it helps, I can hazard a guess:Plaice
I think you were downvoted because you seem to be using wikipedia as a reference, wikipedia is only as credible as it's references, and references are often few and far between.Plaice
@Seb in fact I am not using wikipedia as a reference in my answer(though I usually do) I use it in my comment only and in a bit different context.Retrorse
@Tarah Algorithmically Radix is faster than TRIE, that's why its worth doing the compression. Fewer nodes to load and less space are generally better. That said, implementation quality can vary.Singlehanded
@GlennTeitelbaum Have you ever implemented a classical (not wikipedian, rather the way it was originally defined) PATRICIA trie? They come as one node per actual insertion, and insert and find are O(k) worst case time complexity... [edit: though admittedly, they are much more like a graph with cycles pointing back up the tree]Plaice
@ Ivaylo Strandjev - Would really appreciate your feedback on the post stackoverflow.com/questions/40087385/… on Radix Tree. Thnks in advHorst
@IvayloStrandjev, Isn't a radix tree slower when modifying the tree ?Transmissible
Is the example for the radix tree, in this answer, a Patricia Trie?Aronson
P
87

My question is whether Trie data structure and Radix Trie are the same thing?

In short, no. The category Radix Trie describes a particular category of Trie, but that doesn't mean that all tries are radix tries.

If they are[n't] same, then what is the meaning of Radix trie (aka Patricia Trie)?

I assume you meant to write aren't in your question, hence my correction.

Similarly, PATRICIA denotes a specific type of radix trie, but not all radix tries are PATRICIA tries.


What is a trie?

"Trie" describes a tree data structure suitable for use as an associative array, where branches or edges correspond to parts of a key. The definition of parts is rather vague, here, because different implementations of tries use different bit-lengths to correspond to edges. For example, a binary trie has two edges per node that correspond to a 0 or a 1, while a 16-way trie has sixteen edges per node that correspond to four bits (or a hexidecimal digit: 0x0 through to 0xf).

This diagram, retrieved from Wikipedia, seems to depict a trie with (at least) the keys 'A', 'to', 'tea', 'ted', 'ten', 'i', 'in' and 'inn' inserted:

Basic trie

If this trie were to store items for the keys 't' or 'te' there would need to be extra information (the numbers in the diagram) present at each node to distinguish between nullary nodes and nodes with actual values.


What is a radix trie?

"Radix trie" seems to describe a form of trie that condenses common prefix parts, as Ivaylo Strandjev described in his answer. Consider that a 256-way trie which indexes the keys "smile", "smiled", "smiles" and "smiling" using the following static assignments:

root['s']['m']['i']['l']['e']['\0'] = smile_item;
root['s']['m']['i']['l']['e']['d']['\0'] = smiled_item;
root['s']['m']['i']['l']['e']['s']['\0'] = smiles_item;
root['s']['m']['i']['l']['i']['n']['g']['\0'] = smiling_item;

Each subscript accesses an internal node. That means to retrieve smile_item, you must access seven nodes. Eight node accesses correspond to smiled_item and smiles_item, and nine to smiling_item. For these four items, there are fourteen nodes in total. They all have the first four bytes (corresponding to the first four nodes) in common, however. By condensing those four bytes to create a root that corresponds to ['s']['m']['i']['l'], four node accesses have been optimised away. That means less memory and less node accesses, which is a very good indication. The optimisation can be applied recursively to reduce the need to access unnecessary suffix bytes. Eventually, you get to a point where you're only comparing differences between the search key and indexed keys at locations indexed by the trie. This is a radix trie.

root = smil_dummy;
root['e'] = smile_item;
root['e']['d'] = smiled_item;
root['e']['s'] = smiles_item;
root['i'] = smiling_item;

To retrieve items, each node needs a position. With a search key of "smiles" and a root.position of 4, we access root["smiles"[4]], which happens to be root['e']. We store this in a variable called current. current.position is 5, which is the location of the difference between "smiled" and "smiles", so the next access will be root["smiles"[5]]. This brings us to smiles_item, and the end of our string. Our search has terminated, and the item has been retrieved, with just three node accesses instead of eight.


What is a PATRICIA trie?

A PATRICIA trie is a variant of radix tries for which there should only ever be n nodes used to contain n items. In our crudely demonstrated radix trie pseudocode above, there are five nodes in total: root (which is a nullary node; it contains no actual value), root['e'], root['e']['d'], root['e']['s'] and root['i']. In a PATRICIA trie there should only be four. Let's take a look at how these prefixes might differ by looking at them in binary, since PATRICIA is a binary algorithm.

smile:   0111 0011  0110 1101  0110 1001  0110 1100  0110 0101  0000 0000  0000 0000
smiled:  0111 0011  0110 1101  0110 1001  0110 1100  0110 0101  0110 0100  0000 0000
smiles:  0111 0011  0110 1101  0110 1001  0110 1100  0110 0101  0111 0011  0000 0000
smiling: 0111 0011  0110 1101  0110 1001  0110 1100  0110 1001  0110 1110  0110 0111 ...

Let us consider that the nodes are added in the order they are presented above. smile_item is the root of this tree. The difference, bolded to make it slightly easier to spot, is in the last byte of "smile", at bit 36. Up until this point, all of our nodes have the same prefix. smiled_node belongs at smile_node[0]. The difference between "smiled" and "smiles" occurs at bit 43, where "smiles" has a '1' bit, so smiled_node[1] is smiles_node.

Rather than using NULL as branches and/or extra internal information to denote when a search terminates, the branches link back up the tree somewhere, so a search terminates when the offset to test decreases rather than increasing. Here's a simple diagram of such a tree (though PATRICIA really is more of a cyclic graph, than a tree, as you'll see), which was included in Sedgewick's book mentioned below:

Simple PATRICIA diagram

A more complex PATRICIA algorithm involving keys of variant length is possible, though some of the technical properties of PATRICIA are lost in the process (namely that any node contains a common prefix with the node prior to it):

Complex PATRICIA diagram

By branching like this, there are a number of benefits: Every node contains a value. That includes the root. As a result, the length and complexity of the code becomes a lot shorter and probably a bit faster in reality. At least one branch and at most k branches (where k is the number of bits in the search key) are followed to locate an item. The nodes are tiny, because they store only two branches each, which makes them fairly suitable for cache locality optimisation. These properties make PATRICIA my favourite algorithm so far...

I'm going to cut this description short here, in order to reduce the severity of my impending arthritis, but if you want to know more about PATRICIA you can consult books such as "The Art of Computer Programming, Volume 3" by Donald Knuth, or any of the "Algorithms in {your-favourite-language}, parts 1-4" by Sedgewick.

Plaice answered 9/4, 2013 at 15:39 Comment(8)
Would you pls help me understand the significance of the term "Radix"! I understand how, in a natural way, we may try to turn a TRIE into a compact TRIE by allowing multiple symbols/edges coalesce into one edge. However, I'm not able to discern why an un-compacted TRIE (simply a TRIE) cannot be termed as Radix TRIE.Horst
@ Seb - Would really appreciate your feedback on the post #40087885 on Radix Tree. Thnks in adv.Horst
@BuckCherry I'd love to be able to, but please realise as my computer was stolen I wouldn't be able to put the effort into an adequate response.Plaice
> This diagram, retrieved from Wikipedia, seems to depict a trie with (at least) the keys 'A', 'to', 'tea', 'ted', 'ten' and 'inn' inserted: Not to be picky but actually the numbers here are the values, therefore there is also "i" and "in"Grief
@Grief 👍 check again pleasePlaice
@Plaice what happens if I want to insert a key starting with 1011 0011 instead now?Biparty
@Biparty I feel like you stopped reading before I wrote "if you want to know more about PATRICIA you can consult books such as ...". These books go into detail, answering all of the obvious questions with precision so we don't have to duplicate them (potentially erroneously) on the internet, which improves the quality and speed of your education. Nonetheless, the insertion process for PATRICIA begins with a search for the first difference in bit, walking the tree as you search. If the first difference occurs at bit 0, as in your example, then the new node may very well become the root. 🤷‍♂️Plaice
A book by sedgewick is actually what brought me here. I've figured it out in the meantime, but Sedgewick is a pain to read. He says something 30 pages back that actually explains everything but then he never mentions it again, so if I happen to have jumped in the book AFTER that part, I'll be missing some crucial details that maybe things can't be easily understood without.Biparty
H
29

TRIE:
We can have a search scheme where instead of comparing a whole search key with all existing keys (such as a hash scheme), we could also compare each character of the search key. Following this idea, we can build a structure (as shown below) which has three existing keys – “dad”, “dab”, and ”cab”.

         [root]
     ...// | \\...
           |  \
           c   d
           |    \
          [*]    [*]
      ...//|\.  ./|\\...        Fig-I
        a       a
       /       /
     [*]      [*]
 ...//|\..  ../|\\...
    /        /   \
   B        b     d
  /        /       \
 []       []       []

(cab)   (dab)     (dad)

This is essentially an M-ary tree with internal node, represented as [ * ] and leaf node, represented as [ ]. This structure is called a trie. The branching decision at each node can be kept equal to the number of unique symbols of the alphabet, say R. For lower case English alphabets a-z, R=26; for extended ASCII alphabets, R=256 and for binary digits/strings R=2.

Compact TRIE:
Typically, a node in a trie uses an array with size=R and thus causes waste of memory when each node has fewer edges. To circumvent the memory concern, various proposals were made. Based on those variations trie are also named as “compact trie” and “compressed trie”. While a consistent nomenclature is rare, a most common version of a compact trie is formed by grouping all edges when nodes have single edge. Using this concept, the above (Fig-I) trie with keys “dad”, “dab”, and ”cab” can take below form.

         [root]
     ...// | \\...
           |  \
          cab  da
           |    \
          [ ]   [*]                Fig-II
               ./|\\...
                 |  \
                 b   d
                 |    \
                []    []

Note that each of ‘c’, ‘a’, and ‘b’ is sole edge for its corresponding parent node and therefore, they are conglomerated into a single edge “cab”. Similarly, ‘d’ and a’ are merged into single edge labelled as “da”.

Radix Trie:
The term radix, in Mathematics, means a base of a number system, and it essentially indicates the number of unique symbols needed to represent any number in that system. For example, decimal system is radix ten,and binary system is radix two. Using the similar concept, when we’re interested in characterizing a data structure or an algorithm by the number of unique symbols of the underlying representational system, we tag the concept with the term “radix”. For example, “radix sort” for certain sorting algorithm. In the same line of logic, all variants of trie whose characteristics (such as depth, memory need, search miss/hit runtime, etc.) depend on radix of the underlying alphabets, we may call them radix “trie’s”. For example, an un-compacted as well as a compacted trie when uses alphabets a-z, we can call it a radix 26 trie. Any trie that uses only two symbols (traditionally ‘0’ and ‘1’) can be called a radix 2 trie. However, somehow many literatures restricted the use of the term “Radix Trie” only for the compacted trie.

Prelude to PATRICIA Tree/Trie:
It would be interesting to notice that even strings as keys can be represented using binary-alphabets. If we assume ASCII encoding, then a key “dad” can be written in binary form by writing the binary representation of each character in sequence, say as “011001000110000101100100” by writing binary forms of ‘d’, ‘a’, and ‘d’ sequentially. Using this concept, a trie (with Radix Two) can be formed. Below we depict this concept using a simplified assumption that the letters ‘a’,’b’,’c’, and’d’ are from a smaller alphabet instead of ASCII.

Note for Fig-III: As mentioned, to make the depiction easy, let’s assume an alphabet with only 4 letters {a,b,c,d} and their corresponding binary representations are “00”, “01”, “10” and “11” respectively. With this, our string keys “dad”, “dab”, and ”cab” become “110011”, “110001”, and “100001” respectively. The trie for this will be as shown below in Fig-III (bits are read from left to right just like strings are read from left to right).

          [root]
             \1               
              \
              [*]
             0/ \1               
             /   \
           [*]   [*]         
           0/     /               
           /     /0
         [*]    [*]      
        0/      /               
        /      /0
      [*]    [*]
     0/     0/ \1                Fig-III
     /      /   \
    [*]   [*]   [*]
     \1     \1    \1
      \      \     \
      []     []    []
    (cab)   (dab) (dad)
                 

PATRICIA Trie/Tree:
If we compact the above binary trie (Fig-III) using single edge compaction, it would have much less nodes than shown above and yet, the nodes would still be more than the 3, the number of keys it contains. Donald R. Morrison found (in 1968) an innovative way to use binary trie to depict N keys using only N nodes and he named this data structure PATRICIA. His trie structure essentially got rid of single edges (one-way branching); and in doing so, he also got rid of the notion of two kinds of nodes – inner nodes (that don’t depict any key) and leaf nodes (that depict keys). Unlike the compaction logic explained above, his trie uses a different concept where each node includes an indication of how many bits of a key are to be skipped to make the branching decision. Yet another characteristic of his PATRICIA trie is that it doesn’t store the keys – which means such data structure will not be suitable for answering questions like, list all keys that match a given prefix, but is good for finding if a key exists or not in the trie. Nonetheless, the term Patricia Tree or Patricia Trie has, since then, been used in many different but similar senses, such as, to indicate a compact trie [NIST], or to indicate a radix trie with radix two [as indicated in a subtle way in WIKI] and so on.

Trie that may not be a Radix Trie:
Ternary Search Trie (aka Ternary Search Tree) often abbreviated as TST is a data structure (proposed by J. Bentley and R. Sedgewick) which looks very similar to a trie with three-way branching. For such tree, each node has a characteristic alphabet ‘x’ so that branching decision is driven by whether a character of a key is less than, equal to or greater than ‘x’. Due to this fixed 3-way branching feature, it provides a memory-efficient alternative for trie, especially when R (radix) is very large such as for Unicode alphabets. Interestingly, the TST, unlike (R-way) trie, doesn’t have its characteristics influenced by R. For example, search miss for TST is ln(N) as opposed logR(N) for R-way Trie. Memory requirements of TST, unlike R-way trie is NOT a function of R as well. So we should be careful to call a TST a radix-trie. I, personally, don’t think we should call it a radix-trie since none (as far as I know) of its characteristics are influenced by the radix,R, of its underlying alphabets.

Horst answered 12/11, 2016 at 20:50 Comment(10)
As someone who has implemented PATRICIA according to Morrison, Sedgewick and Knuth I can tell you the algorithm you've described here (which I also attempted to describe in my answer) is still very suitable for answering questions like list all keys that match a given prefix. P.S. Great to see someone else on the ball re: that other question :) I like that explanation.Plaice
Re "will not be suitable for answering questions like, list all keys that match a given prefix", seriously?Transmissible
@Transmissible Sure! Classic PATRICIA stores an integer, which you can use as an index for an array. Into the array you put the string. Into the trie you put the 0-based array index for the string. Make the search & compare & bit extraction functions operate upon the string corresponding to the integer rather than the integer, and if your insert function is based on the others (as it should be, since there's a lot of repeated logic there) and you'll be well on your way. You could also use uintptr_t as your integer, since that type seems to be typically expected (though not required) to exist.Plaice
You state "many literatures restricted the use of the term “Radix Trie” only for the compacted trie.". Actually, I can't find any other reference than wikipedia. Did you find any others?Happiness
@ wds - You may be right, as I don't really remember what are the resources I referred when I wrote this. A quick googling gets me links like mathcs.emory.edu/~cheung/Courses/323/Syllabus/Text/trie02.html or tutorialsdiary.com/radix-trie-patricia-trie-or-compressed-trie which essentially point to or (most likely) derived from/influenced by wiki. If I find any other reliable/scholarly resource I'll post here.Horst
@Plaice what happens if given the tree above I want to insert 0011.... which does NOT share the prefix of the other strings?Biparty
@Biparty given that tree is a "prelude to PATRICIA", it's not my example and thus I think you should ask OP, rather than myself. I'll answer a question you posted under my answer, but please reserve this space for comments directed to this answer.Plaice
s/OP/the author/. Ask the author. Or a textbook. Please don't ask me any more of these questions, because it seems like you're not doing much research prior to asking questions.Plaice
@Plaice that's false. I find stackoverflow answers to typically be lacking so it's one of my last resorts. Pat trees are pretty arcane so it's not an easy topic to research. Wikipedia barely mentions them, the original paper is behind a paywall, Sedgewick's book for C algos goes in little detail, Cormen's algorithms book doesn't even mention them. Regardless, I did figure the whole thing out since, so thanks for the help.Biparty
@Biparty the original paper seems to be behind a paywall. I assure you that it's freely available elsewhere. I paid for it, but then found it freely published by the same filename on Google 🤷‍♂️. Nowadays it's even easier to find; it appears as my second search result. Sedgewick describes it just as well as the original, in my opinion. There's also an accurate description of it in Knuths TAOCP.Plaice
G
10

In tries, most of the nodes don’t store keys and are just hops on a path between a key and the ones that extend it. Most of these hops are necessary, but when we store long words, they tend to produce long chains of internal nodes, each with just one child. This is the main reason tries need too much space, sometimes more than BSTs.

enter image description here

Radix tries (aka radix trees, aka Patricia trees) are based on the idea that we can somehow compress the path, for example after "intermediate t node", we could have "hem" in one node, or "idote" in one node.

Here is a graph to compare trie vs radix trie:

enter image description here

The original trie has 9 nodes and 8 edges, and if we assume 9 bytes for an edge, with a 4-byte overhead per node, this means

       9 * 4 + 8 * 9 = 108 bytes.

The compressed trie on the right has 6 nodes and 5 edges but in this case each edge carries a string, not just a character; however, we can simplify the operation by accounting for edge references and string labels separately. This way, we would still count 9 bytes per edge (because we would include the string terminator byte in the edge cost), but we could add the sum of string lengths as a third term in the final expression; the total number of bytes needed is given by

    6 * 4 + 5 * 9 + 8 * 1 = 77 bytes.

For this simple trie, the compressed version requires 30% less memory.

Reference

Galluses answered 24/10, 2021 at 15:31 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.