Lua: How to look up in a table where the keys are tables (or objects)
Asked Answered
S

7

11

I want to store a lua table where the keys are other lua tables. I know that this is possible BUT I want to be able to do look ups in the table using copies of those tables. Specifically, I want to be able to do:

t = {}
key = { a = "a" }
t[key] = 4
key2 = { a = "a" }

and then I want to be able to look up:

t[key2]

and get 4.

I know that I can turn key into a string and put it into table t. I've also thought about writing a custom hash function or doing this by nesting tables. Is there a best way for me to get this type of functionality? What other options do I have?

Selfrising answered 8/2, 2012 at 21:20 Comment(0)
T
10

In Lua, two tables created separately are considered "different". But if you create a table once, you can assign it to any variables you want, and when you compare them, Lua will tell you that they are equal. In other words:

t = {}
key = { a = "a" }
t[key] = 4
key2 = key
...
t[key2] -- returns 4

So, that's the simple, clean way of doing what you want. Store key somewhere, so you can retrieve the 4 back by using it. This is also very fast.

If you really don't want to do that ... well, there is a way. But it is kindof inefficient and ugly.

The first part is making a function that compares two separate tables. It should return true if two tables are "equivalent", and false if they are not. Let's call it equivalent. It should work like this:

equivalent({a=1},{a=1})          -- true
equivalent({a=1,b=2}, {a=1})     -- false
equivalent({a={b=1}}, {a={b=2}}) -- false

The function must be recursive, to handle tables that contain tables themselves. It also must not be fooled if one of the tables "contains" the other, but has more elements. I came out with this implementation; probably there are better ones out there.

local function equivalent(a,b)
  if type(a) ~= 'table' then return a == b end

  local counta, countb = 0, 0

  for k,va in pairs(a) do
    if not equivalent(va, b[k]) then return false end
    counta = counta + 1
  end

  for _,_ in pairs(b) do countb = countb + 1 end

  return counta == countb
end

I'm not going to explain that function here. I hope it is clear enough what it does.

The other part of the puzzle consist on making t use the equivalent function when comparing keys. This can be done with careful metatable manipulation, and an extra "storage" table.

We basically transform t into an impostor. When our code tells it to store a value under a key, it doesn't save it in itself; instead it gives it to the extra table (we'll call that store). When the code asks t for a value, it searches for it in store, but using the equivalent function to get it.

This is the code:

local function equivalent(a,b)
... -- same code as before
end

local store = {} -- this is the table that stores the values

t = setmetatable({}, {
  __newindex = store,
  __index = function(tbl, key)
    for k,v in pairs(store) do
      if equivalent(k,key) then return v end
    end
  end
})

Usage example:

t[{a = 1}] = 4

print(t[{a = 1}]) -- 4
print(t[{a = 1, b = 2}]) -- nil
Turkish answered 8/2, 2012 at 22:41 Comment(0)
B
5

kikito's answer is good, but has some flaws:

  • If you perform t[{a=1}] = true two times, store will contain two tables (leaking memory for the lifetime of the hash table)
  • Modifying the value once you've already stored it doesn't work, nor can you remove it. Attempting to change it will result in the retrieval potentailly returning any value you've assigned to that key in the past.
  • Access performance is O(n) (n being the number of stored entries and assuming that lua's value retrieval from a table is O(1)); combined with the first point, performance of this hash table will degrade with use

(Also note that kikito's "equivalent" function will cause an infinite loop if any table has a circular reference.)

If you never need to change/remove any information in the table, then kikito's answer will suffice as it stands. Otherwise, the metatable must be changed so that the __newindex makes sure that the table doesn't already exist:

t = setmetatable({}, {
    __newindex = function(tbl, key, value)
        for k,v in pairs(store) do
            if equivalent(k,key) then
                tbl[k] = value
                return
            end
        end
        store[key] = value
    end,
    __index = function(tbl, key)
        for k,v in pairs(store) do
            if equivalent(k, key) then return v end
        end
    end
})

As you've suggested, a completely different option is to write a custom hashing function. Here's a HashTable that can make use of that:

local function HashTable(Hash, Equals)
    --Hash is an optional function that takes in any key and returns a key that lua can use (string or number). If you return false/nil, it will be assumed that you don't know how to hash that value.
    --    If Hash is not provided, table-keys should have a GetHash function or a .Hash field
    --Equals is an optional function that takes two keys and specify whether they are equal or not. This will be used when the same hash is returned from two keys.
    --    If Equals is not provided, items should have a Equals function; items are in this case assumed to not be equal if they are different types.
    local items = {} --Dict<hash, Dict<key, value>>
    local function GetHash(item)
        return Hash and Hash(item) or type(item) == "table" and (item.GetHash and item:GetHash() or item.Hash) or item
    end
    local function GetEquals(item1, item2)
        if Equals then return Equals(item1, item2) end
        local t1, t2 = type(item1), type(item2)
        if t1 ~= t2 then return false end
        if t1 == "table" and item1.Equals then
            return item1:Equals(item2)
        elseif t2 == "table" and item2.Equals then
            return item2:Equals(item1)
        end
        return false
    end
    return setmetatable({}, {
        __newindex = function(_, key, value)
            local hash = GetHash(key)
            local dict = items[hash]
            if not dict then
                if value ~= nil then --Only generate a table if it will be non-empty after assignment
                    items[hash] = {[key] = value}
                end
                return
            end
            for k, v in pairs(dict) do
                if GetEquals(key, k) then --Found the entry; update it
                    dict[k] = value
                    if value == nil then --check to see if dict is empty
                        if next(dict) == nil then
                            items[hash] = nil
                        end
                    end
                    return
                end
            end
            --This is a unique entry
            dict[key] = value
        end,
        __index = function(_, key)
            local hash = GetHash(key)
            local dict = items[hash]
            if not dict then return nil end
            for k, v in pairs(dict) do
                if GetEquals(key, k) then
                    return v
                end
            end
        end
    })
end

Usage example:

local h = HashTable(
    function(t) return t.a or 0 end, --Hash
    function(t1, t2) return t1.a == t2.a end) --Equals
h[{a=1}] = 1
print(h[{a=1}]) -- 1
h[{a=1}] = 2
print(h[{a=1}]) -- 2
print(h[{a=1,b=2}]) -- 2 because Hash/Equals only look at 'a'

Naturally, you'll want to get better Hash/Equals functions.

So long as the hashes of your keys rarely collide, this performance of this class should be O(1).

(Note: I'd have put the top half of this answer as a comment to kikito, but I don't have the reputation at this time to do so.)

Butadiene answered 10/2, 2017 at 4:59 Comment(0)
S
2

This is not possible in Lua. If you use tables as keys, the key is that specific "instance" of the table. Even if you make a different table with the same contents, the instance is different, therefore it is a different key.

If you want to do something like this, you can create a kind of hash function, which traverses the table to serve as a key (maybe even recursively if needed) and construct a string representation of the table content. It does not need to be human-readable, as long as it is different for different content and equal for tables with the same content. Apart from using pairs() to traverse the table, you would also need to insert the keys into a table and sort them using table.sort(), because pairs() returns them in an arbitrary order, and you want the same string for "equal" tables.

Once you have constructed such string, you can use it as a key:

function hash(t) ... end
t = {}
key1 = { a = "a", b = "b" }
t[hash(key1)] = 4
key2 = { a = "a", b = "b" }
print(t[hash(key2)]) -- should print "4" if the hash function works correctly

In my opinion, this all is too complicated for the simple task of indexing, and you may want to re-think your wish for indexing using copies of tables. Why would you want such functionality?

Update

If you only need to work with phrases, I think that concatenating them is easier than creating such generic hash function. If you need it for sequences of phrases, you won't actually need to iterate through the tables and sort the keys, just collect the main information from each phrase. You would still need to use a helper function, which can create a suitable key for you:

function pkey(...)
    local n, args = select('#', ...), { ... }
    for i=1,n do args[i] = args[i].value end -- extract your info here
    return table.concat(args, ' ') -- space or other separator, such as ':'          
end
tab[pkey(phrase1, phrase2, phrase3)] = "value"
Subsidence answered 8/2, 2012 at 22:28 Comment(3)
Thanks for the response. The reason I wanted this was for a NLP task. I extract phrases which I store as lua tables (each token in the phrase as a value that is mapped to an index using table.insert) and I want to count frequency of phrases. I know that there are other ways to do what I want (e.g. concatenate the phrase and use that concatenated string as a key) but it would require extra implementation steps and it wouldn't be quite as clean. I'm pretty sure you can do what I want in Java and, being new to lua, I'm trying to see if there is an analogSelfrising
Such a hash function is hard to write because the order that tables are traversed depends on how it was created and so tables with the same entries may have different traversals.Unguent
That is why I said about collecting the keys into a table and sorting it to ensure consistent key order.Subsidence
C
1

I don't know a lot about Language Processing, and about the goal you want to reach with your program, but what about collecting token like this : use a nested table structure such has the index table store only tables indexed by first phrase token, then each subtables contains value indexed by second phrase token ... etc ... until you reach a phrase final token, will index an number value corresponding to he occurence of the phrase.

Maybe it will be more clear with a exemple, if you have the two following phrase :

  • I like banana.
  • I like hot chick.

Your index would have the following structure :

index["I"] = {
    ["like"] = {
        ["banana"] = 1,
        ["hot"] = {
            ["chick"] = 1
        }
    }    
}

In that way you can count frenquencies with a single traversal step, and count occurences at the same time you indexing, but as i said before, it depends of what is your goal, and it will imply to re - split you phrase so as to find occurences through your index.

Crawl answered 16/2, 2012 at 20:45 Comment(3)
what I'm really looking for is this: if I have a = { "I", "Like", "Banana" } and b = { "I", "Like", "Banana" } and I write t[a] = "zoo", I would like a constant time scheme where t[b] == "zoo".Selfrising
as it was said before it is impossible to do it, you will have to do at some point a manual comparaison by iterating over the table value.Crawl
But what if he also has the phrase "I like hot" in addition to "I like hot chick"? Where would he store its "= 1"?Dignadignified
L
0

I'm not sure you can do this. You can define equality for tables using the metatable, but there's no way to define a hash function, and I doubt defining equality alone will do what you need. You could obviously define equality, then iterate over the table using pairs() and comparing the keys yourself, but that will turn what should be O(1) lookup into O(n).

Lovellalovelock answered 8/2, 2012 at 21:27 Comment(0)
R
0

kikito's answer has the beginnings of a solution but, as chess123mate's answer notes, it's write-only (among other flaws). This solution doesn't fix all of them, but it's a start. (It's also very, very slow.)

local function equivalent(stack)
    local a, b = stack.a, stack.b

    if type(a) ~= 'table' then return a == b end
    if a == b then return true end

    local top = stack.next
    while top ~= nil do
        if stack.a == a and stack.b == b then return true end
        top = stack.next
    end

    local counta, countb = 0, 0

    for k,va in pairs(a) do
        if not equivalent({a = va, b = b[k], next = stack}) then return false end
        counta = counta + 1
    end

    for _,_ in pairs(b) do countb = countb + 1 end

    return counta == countb
end

t = setmetatable({}, {
    __newindex = function(tbl, key, value)
        for k, _ in pairs(tbl) do
            if equivalent({a = k, b = key}) then rawset(tbl, k, value); return end
        end
        rawset(tbl, key, value)
    end,
    __index = function(tbl, key)
        for k, v in pairs(tbl) do
            if equivalent({a = k, b = key}) then return v end
        end
    end
})

Lua Playground link (or GitHub Gist link for copying and pasting into the playground, if your browser hates me for that last one).

Regulate answered 22/2, 2021 at 18:17 Comment(0)
C
0

I think the simplest way to do something like this would be to have a tableKey factory which returns readonly tables for the inputted table.

You would want to assign a __key (or similar) value to use as the keys. Something like:

_cachedTableKeys = {}
function tableKey (t)
  if(not t.__key) t.__key = calculateKey(t)
  local existing = _cachedTableKeys[t.__key]
  if(existing) return existing
  existing = readOnly(t) -- https://www.lua.org/pil/13.4.5.html
  _cachedTableKeys[t.__key] = existing
  return existing 
end

This will ensure that the table being used as the key is readonly and that it's id will always match if it is equal.

Note: the calculateKey function would perform a hash function recursively on the table keys. You want to use a large enough hash (like 16 bytes or so) so that collision is near impossible. You could also implement some kind of collision algorithm when looking up the table by key.

Cuenca answered 13/7, 2023 at 17:59 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.