Trie Implementation With Map
Asked Answered
U

5

7

i was working solving a problem today. But i got stucked. I know how a trie works but the problem is that i know how to implement it with static arrays and classes. Today surfing on the web i read that there is a way to implement tries using stl::map. I tried today but i still dont know how to insert elements on int. this struchture.

Edit1: i am trying to solve this problem :spoj.com/problems/TAP2012D I want to know how to add the words in to the trie with edit2: I know how a map works, i just dont know how a trie with a map works. I want someone that knows about tries.

Here is what ive done so far

const int ALPH_SIZE = 26;
using namespace std;

struct trie{
    map<char,int> M;
    int x,y;
    trie();
};

trie T[1000000];


trie::trie()
{
    x=y=0;
}
int maximo;


void addtrie(string palabra)
{
    int tam=palabra.size();
    int pos=0;
    for(int i=0;i<tam;i++)
    {
        if(T[pos].M.find(palabra[i])==T[pos].M.end())
        {
            T[pos].M[palabra[i]]=new trie();
            T[pos].M[palabra[i]]=
        }

    }

}
Unmoor answered 15/2, 2013 at 2:18 Comment(8)
your real issue is? also trie T[1000000]; may stack overflowJasisa
@Jasisa i dont know how to add elements. I mean the add function, i want to add elements on itUnmoor
you mean add elements to M ?Jasisa
@Jasisa i am doing this problem spoj.com/problems/TAP2012D I want to add words in to the Trie (the tree). Ill edit that.Unmoor
There is a lot of publicly accessible documentation on std::map, e.g. en.cppreference.com/w/cpp/container/mapCounterpoison
If I'm reading this correctly, the real issue is that OP has read that you can implement tries with maps, but doesn't have the slightest clue how those two fit together. It doesn't seem like a specific problem about maps, or adding, or anything.Tarlatan
you are right , i dont know how to implement a trie with a map struchture, thats my issue. @Counterpoison i know how to use a map. I dont know how to use it with a trie, thats my problemUnmoor
Well, each node in the trie must include a mapping from characters to destination nodes, right? So you can either have some sort of map<char,node*> in each node, or you can have something along the lines of map<pair<node*,char>,node*> once for the complete trie, which maps (node,character)-combinations to destination nodes, right? I'd start with that as a concept and then think about who should own the pointers, and optimization of data types.Counterpoison
H
8

A trie node stores a map of existing out characters and a flag indicating if the node corresponds to a word in the trie.

struct Node
{   map<char, Node*> a;
    bool flag;

    Node() { flag = false; }
};

Now insertion is similar to what you'd do with a static array, except that you are using a map here.

void insert(Node *x, string s)
{   for(int i = 0; i < s.size(); i++)
    {   if(x->a.count(s[i]) == 0)
        /* no outgoing edge with label = s[i] so make one */
        {   x->a[ s[i] ] = new Node;
        }
        x = x->a[ s[i] ];
    }
    x->flag = true; /* new word */
}
Hear answered 21/1, 2015 at 20:32 Comment(0)
H
1

Using an unordered_map is better in my opinion.

    struct TrieNode {
        char c;
        unordered_map<char, TrieNode*>links;
        bool end;
    };

    TrieNode* insert(TrieNode* root, string word) {
        TrieNode* current  = root;

        for (auto it: word) {
           if (current->links.find(it) == current->links.end()) {
           TrieNode* node = new TrieNode(); // possible memory leak?
           node->c = it;
           node->links = {};
           node->end = false;

           current->links[it] = node;
           }

        current = current->links[it];
       }

    current->end = true;
    return root;
    };

Ofcourse, there could be an issue of having memory leaks with the TrieNodes that you create with the new operator. Maybe some sort of a tree traversal (DFS based) to visit all the nodes in a bottom up fashion and deleting them, could help one avoid memory leaks.

Hiatt answered 18/11, 2015 at 15:21 Comment(1)
While generally unordered_map is faster, max depth is only 5 for the 26 letters, so I'm not sure. And a map gives the ability to walk in sorted order. Also, the question really boxes in the use of std::map. There is also no need to store char c in TrieNode its stored in the parent as the map key.Griffey
G
0

The issue with using a map is that you lose locality. And aside from narrowing the possible keys, no advantage is gained by having an actual alphabet.

Parsing one character at a time will hop around memory. If the string is parsed three characters at a time, then there will be much higher locality, both for the sequence loading for a given string and the chance of cheaply loading its neighbors as they remain in cache.

This also takes advantage of some later language features, most of which easily be accomplished by loading the char<256> manually. Likely there are better ways to do this as well.

#include <map>
#include <array>
#include <string>
#include <cstddef>
#include <cstdint>
#include <iostream>

// MAP
template<char ... Args>
class mkmap;

// Walk the tuple
template<char count, char tuple_sz, char tuple_cnt, char grab, char ... alpha>
class mkmap<count, tuple_sz, tuple_cnt, grab, alpha...> {
public:
    static constexpr void map(std::array<char, 256>& map) {
        map[grab] = count;
        mkmap<count, tuple_sz, tuple_cnt - 1, alpha...>::map(map);
    }
};

// Next tuple
template<char count, char tuple_sz, char grab, char ... alpha>
class mkmap<count, tuple_sz, 1, grab, alpha...> {
public:
    static constexpr void map(std::array<char, 256>& map) {
        map[grab] = count;;
        mkmap<count + 1, tuple_sz, tuple_sz, alpha...>::map(map);
    }
};

// End recursion
template<char count, char tuple_sz, char tuple_cnt>
class mkmap<count, tuple_sz, tuple_cnt> {
public:
    static constexpr void map(std::array<char, 256>& map) {
    }
};

template<int tuple_sz, char ... alpha>
class cvtmap {
public:
    constexpr cvtmap() : map{} {
        mkmap<1, tuple_sz, tuple_sz, alpha...>::map(map);
    }
    constexpr char operator[](char input) const {
        return map[input];
    }
    std::array<char, 256> map;
};

// UNMAP
template<char ... Args>
class mkunmap;

// Walk the tuple
template<char count, char tuple_sz, char tuple_cnt, char grab, char ... alpha>
class mkunmap<count, tuple_sz, tuple_cnt, grab, alpha...> {
public:
    static constexpr void map(std::array<std::array<char, tuple_sz>, 256>& map) {
        map[count][tuple_sz - tuple_cnt] = grab;
        mkunmap<count, tuple_sz, tuple_cnt - 1, alpha...>::map(map);
    }
};

// Next tuple
template<char count, char tuple_sz, char grab, char ... alpha>
class mkunmap<count, tuple_sz, 1, grab, alpha...> {
public:
    static constexpr void map(std::array<std::array<char, tuple_sz>, 256>& map) {
        map[count][tuple_sz - 1] = grab;;
        mkunmap<count + 1, tuple_sz, tuple_sz, alpha...>::map(map);
    }
};

// End recursion
template<char count, char tuple_sz, char tuple_cnt>
class mkunmap<count, tuple_sz, tuple_cnt> {
public:
    static constexpr void map(std::array<std::array<char, tuple_sz>, 256>& map) {
    }
};

template<int tuple_sz, char ... alpha>
class cvtunmap {
public:
    constexpr cvtunmap() : map{} {
        mkunmap<1, tuple_sz, tuple_sz, alpha...>::map(map);
    }
    constexpr std::array<char, tuple_sz> operator[](char input) const {
        return map[input];
    }
    std::array<std::array<char, tuple_sz>, 256> map;
};

template<int tuple_sz, char ... alpha>
class cvt
{
public:
    enum consts : char { SENTINAL = 0 };

    static constexpr int size() { return sizeof...(alpha) / tuple_sz + 1; }
    cvt(char c) : a{ map[c] } {
    }
    char to_char()
    {
        return unmap[a][0];
    }
    unsigned short value() const {
        return a;
    }
private:
    char a;
    static const cvtmap<tuple_sz, alpha...> map;
    static const cvtunmap<tuple_sz, alpha...> unmap;
};

template<int tuple_sz, char ... alpha>
const cvtmap<tuple_sz, alpha...> cvt<tuple_sz, alpha...>::map;

template<int tuple_sz, char ... alpha>
const cvtunmap<tuple_sz, alpha...> cvt<tuple_sz, alpha...>::unmap;

using ASCII_ignore_case = cvt <2,
    'a', 'A', 'b', 'B', 'c', 'C', 'd', 'D', 'e', 'E', 'f', 'F', 'g', 'G', 'h', 'H', 'i', 'I', 'j', 'J', 'k', 'K', 'l', 'L', 'm', 'M', 'n', 'N', 'o', 'O', 'p', 'P', 'q', 'Q', 'r', 'R', 's', 'S', 't', 'T', 'u', 'U', 'v', 'V', 'w', 'W', 'x', 'X', 'y', 'Y', 'z', 'Z'
>;

template <class alphabet>
class Node {
public:
    enum consts { SHIFT = 32 };
    static short make_key(alphabet a, alphabet b, alphabet c) {
        // max is Z (26) * 27 * 27 == 18954 which fits under SHRT_MAX (32767)
        return
            a.value() * SHIFT * SHIFT
            + b.value() * SHIFT
            + c.value();
    }
    static std::array<char, 3> to_string(short x) {
        char a = (x / (SHIFT * SHIFT)) & 0xFF;
        x -= a * SHIFT * SHIFT;
        char b = (x / SHIFT) &0xFF;
        x -= b * SHIFT;
        char c = x &0xFF;
        return { a,b,c };
    }
    Node* add(short key) {
        if (idx.contains(key)) {
            return idx[key];
        }
        Node* ret = new Node;
        idx[key] = ret;
        return ret;
    }
    static Node* sentinal() {
        static Node fixed;
        return &fixed;
    }
    void add_final(short key) {
        if (!idx.contains(key)) {
            idx[key] = sentinal();
        }
    }
    const Node* get(short key) const {
        auto it = idx.find(key);
        if (it != idx.end()) { // avoid creating nodes
            return it->second;
        }
        return 0;
    }
    bool is_final(short key) const {
        auto it = idx.find(key);
        if (it != idx.end()) { // avoid creating nodes
            return it->second == sentinal();
        }

        return false;
    }
    ~Node() = default;
private:
    std::map <short, Node*> idx;
};

template <class alphabet>
class TriTrie {
public:
    void add(std::string& str) {
        std::string::iterator i = str.begin();
        const std::string::iterator e = str.end();
        Node<alphabet>* where = &top;
        for (;;) {
            std::size_t len = e - i;
            alphabet a = alphabet::SENTINAL;
            alphabet b = alphabet::SENTINAL;
            alphabet c = alphabet::SENTINAL;
            switch (len)
            {
            default: [[likely]] {
                a = alphabet(*(i++));
                b = alphabet(*(i++));
                c = alphabet(*(i++));

                short key = Node<alphabet>::make_key(a,b,c);
                where = where->add(key);
            }
                   break;
            case 3:
                c = alphabet(*(i + 2));
                [[fallthrough]];
            case 2:
                b = alphabet(*(i + 1));
                [[fallthrough]];
            case 1:
                a = alphabet(*i);
                [[fallthrough]];
            case 0: {
                short key = Node<alphabet>::make_key(a, b, c);
                where->add_final(key);
                return;
            }
            }
        }
    }
    bool contains(std::string& str) const {
        std::string::iterator i = str.begin();
        const std::string::iterator e = str.end();
        const Node<alphabet>* where = &top;
        while (where) {
            std::size_t len = e - i;
            alphabet a = alphabet::SENTINAL;
            alphabet b = alphabet::SENTINAL;
            alphabet c = alphabet::SENTINAL;
            switch (len) {
                default: [[likely]] {
                    a = alphabet(*(i++));
                    b = alphabet(*(i++));
                    c = alphabet(*(i++));

                    short key = Node<alphabet>::make_key(a,b,c);
                    where = where->get(key);
                }
                    break;
                case 3:
                    c = alphabet(*(i + 2));
                    [[fallthrough]];
                case 2:
                    b = alphabet(*(i + 1));
                    [[fallthrough]];
                case 1:
                    a = alphabet(*i);
                    [[fallthrough]];
                case 0: {
                    short key = Node<alphabet>::make_key(a, b, c);
                    return where->is_final(key);
                }
            }
        }
        return false;
    }
private:
    Node<alphabet> top;
};

using ASCII_TriTrie = TriTrie<ASCII_ignore_case>;


int main()
{
    ASCII_TriTrie tt;

    for (std::string s : {
        "hello", "goodbye", "big", "little", "hi", "hit", "hitch", "him"
    }) {
        std::cout << s << ":" << std::endl;
        if (tt.contains(s)) {
            return -1;
        }
        tt.add(s);
        if (!tt.contains(s)) {
            return -2;
        }
    }

    return 0;
}
Griffey answered 18/10, 2021 at 2:53 Comment(0)
A
0

from my opinion,I think that you can also plus a check before adding the elements.

    StrCheck(str); // check string is empty

    if (root == nullptr) // check root is nullptr
    {
        root = new Node<T>;
    }

void StrCheck(string str)
 {  
    if (str.empty())
    {
        throw exception("str is empty!");
    }
  }
Amil answered 5/3, 2022 at 14:15 Comment(1)
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.Mejia
C
-5

The correct way to insert elements into a stl::map should be done the following way

std::map<char,int> M;


  M.insert ( std::pair<char,int>('aaa',1) );
  M.insert ( std::pair<char,int>('bbb',2) );
Centriole answered 15/2, 2013 at 2:45 Comment(3)
Thanks for your answer, but im trying to implement a trie using a map.Unmoor
Actually, M[palabra[i]] = ... is perfectly fine for inserting elements in the map.Counterpoison
' denotes char literal. You can't use it to denote strings. It's a syntax error to write 'aaa' or 'bbb'Tramp

© 2022 - 2024 — McMap. All rights reserved.