Lua table hash indexing is faster than array indexing in my speed test. Why?
Asked Answered
C

1

6

I am doing some tests to see where I can improve the performance of my lua code.

I was reading this document: https://www.lua.org/gems/sample.pdf and I thought using integers as table indices should be considerably faster since it uses the array part of tables and does not require hashing.

So I've written this test program:

    print('local x=0 local y=0 local z=0')
    local x=0 local y=0 local z=0
    t0 = os.clock()
    for i=1,1e7 do
        x = 1
        y = 2
        z = 3
    end
    print(os.clock()-t0 .. "\n")


    print("tab = {1,2,3}")
    tab = {1,2,3}
    t0 = os.clock()
    for i=1,1e7 do
        tab[1] = 1
        tab[2] = 2
        tab[3] = 3
    end
    print(os.clock()-t0 .. "\n")


    print("tab = {[1]=1,[2]=2,[3]=3}")
    tab = {[1]=1,[2]=2,[3]=3}
    t0 = os.clock()
    for i=1,1e7 do
        tab[1] = 1
        tab[2] = 2
        tab[3] = 3
    end
    print(os.clock()-t0 .. "\n")


    print("tab = {a=1,b=2,c=3}")
    tab = {a=1,b=2,c=3}
    t0 = os.clock()
    for i=1,1e7 do
        tab.a = 1
        tab.b = 2
        tab.c = 3
    end
    print(os.clock()-t0 .. "\n")


    print('tab = {["bli"]=1,["bla"]=2,["blu"]=3}')
    tab = {["bli"]=1,["bla"]=2,["blu"]=3}
    t0 = os.clock()
    for i=1,1e7 do
        tab["bli"] = 1
        tab["bla"] = 2
        tab["blu"] = 3
    end
    print(os.clock()-t0 .. "\n")


    print("tab = {verylongfieldname=1,anotherevenlongerfieldname=2,superincrediblylongfieldname=3}")
    tab = {verylongfieldname=1,anotherevenlongerfieldname=2,superincrediblylongfieldname=3}
    t0 = os.clock()
    for i=1,1e7 do
        tab.verylongfieldname = 1
        tab.anotherevenlongerfieldname = 2
        tab.superincrediblylongfieldname = 3
    end
    print(os.clock()-t0 .. "\n")


    print('local f = function(p1, p2, p3)')
    local f = function(p1, p2, p3)
        x = p1
        y = p2
        z = p3
        return x,y,z
    end

    local a=0
    local b=0
    local c=0
    t0 = os.clock()
    for i=1,1e7 do
        a,b,c = f(1,2,3)
    end
    print(os.clock()-t0 .. "\n")


    print('local g = function(params)')
    local g = function(params)
        x = params.p1
        y = params.p2
        z = params.p3
        return {x,y,z}
    end

    t0 = os.clock()
    for i=1,1e7 do
        t = g{p1=1, p2=2, p3=3}
    end
    print(os.clock()-t0 .. "\n")

I've ordered the blocks by what I expected to be increasing time consumption. (I wasn't sure about the function calls, that was just a test.) But here are the surprising results:

    local x=0 local y=0 local z=0
    0.093613

    tab = {1,2,3}
    0.678514

    tab = {[1]=1,[2]=2,[3]=3}
    0.83678

    tab = {a=1,b=2,c=3}
    0.62888

    tab = {["bli"]=1,["bla"]=2,["blu"]=3}
    0.733916

    tab = {verylongfieldname=1,anotherevenlongerfieldname=2,superincrediblylongfieldname=3}
    0.536726

    local f = function(p1, p2, p3)
    0.475592

    local g = function(params)
    3.576475

And even the long field names that should cause the longest hashing process are faster than array accessing with integers. Am I doing something wrong?

Craigie answered 25/12, 2018 at 9:49 Comment(4)
Which version of Lua are you using?Knepper
I'm using Lua 5.2.4 without JITCraigie
Awhile ago I discovered that {[1]=1,[2]=2,[3]=3} actually puts the entries in the hash part, so I would expect it to take longer than {1, 2, 3} or t[1], t[2], t[3] = 1, 2, 3, which put the entries in the array part.Megalith
I've read that as well. It's written in the document that I linked. "t[1], t[2], t[3] = 1, 2, 3" is also an option, I'll include that in the test, thanks. But I still don't get the rest...Craigie
G
4

The 6th page(actual page 20) of the document you linked explains what you are seeing.

If you write something like {[1] = true, [2] = true, [3] = true}, however, Lua is not smart enough to detect that the given expressions (literal numbers, in this case) describe array indices, so it creates a table with four slots in its hash part, wasting memory and CPU time.

You can only gain a major benefit of the array part when you assign a table using no keys.

table = {1,2,3}

If you are reading/writing to a table or array that already exists you will not see a large deviation in processing time.

The example in the document includes the creation of the table in the for loop

for i = 1, 1000000 do
    local a = {true, true, true}
    a[1] = 1; a[2] = 2; a[3] = 3
end

Results with all local variables inside the loops. Edit: Lengthened long string to 40 bytes as pointed out by siffiejoe

local x=0 local y=0 local z=0
0.18

tab = {1,2,3}
3.089

tab = {[1]=1,[2]=2,[3]=3}
4.59

tab = {a=1,b=2,c=3}
3.79

tab = {["bli"]=1,["bla"]=2,["blu"]=3}
3.967

tab = {verylongfieldnameverylongfieldnameverylongfieldname=1,anotherevenlongerfieldnameanotherevenlongerfieldname=2,superincrediblylongfieldnamesuperincrediblylongfieldname=3}
4.013

local f = function(p1, p2, p3)
1.238

local g = function(params)
6.325

Additionally lua preforms the hashes differently for different key types.

The source code can be viewed here 5.2.4 ltable.c, this contains the code I will be discussing.

The mainposition function handles that decision making on which hash to preform

/*
** returns the `main' position of an element in a table (that is, the index
** of its hash value)
*/
static Node *mainposition (const Table *t, const TValue *key) {
  switch (ttype(key)) {
    case LUA_TNUMBER:
      return hashnum(t, nvalue(key));
    case LUA_TLNGSTR: {
      TString *s = rawtsvalue(key);
      if (s->tsv.extra == 0) {  /* no hash? */
        s->tsv.hash = luaS_hash(getstr(s), s->tsv.len, s->tsv.hash);
        s->tsv.extra = 1;  /* now it has its hash */
      }
      return hashstr(t, rawtsvalue(key));
    }
    case LUA_TSHRSTR:
      return hashstr(t, rawtsvalue(key));
    case LUA_TBOOLEAN:
      return hashboolean(t, bvalue(key));
    case LUA_TLIGHTUSERDATA:
      return hashpointer(t, pvalue(key));
    case LUA_TLCF:
      return hashpointer(t, fvalue(key));
    default:
      return hashpointer(t, gcvalue(key));
  }
}

When the key is a Lua_Number we call hashnum

/*
** hash for lua_Numbers
*/
static Node *hashnum (const Table *t, lua_Number n) {
  int i;
  luai_hashnum(i, n);
  if (i < 0) {
    if (cast(unsigned int, i) == 0u - i)  /* use unsigned to avoid overflows */
      i = 0;  /* handle INT_MIN */
    i = -i;  /* must be a positive value */
  }
  return hashmod(t, i);
}

Here are the other hash implementations for the other types:

#define hashpow2(t,n)           (gnode(t, lmod((n), sizenode(t))))

#define hashstr(t,str)          hashpow2(t, (str)->tsv.hash)
#define hashboolean(t,p)        hashpow2(t, p)


/*
** for some types, it is better to avoid modulus by power of 2, as
** they tend to have many 2 factors.
*/
#define hashmod(t,n)    (gnode(t, ((n) % ((sizenode(t)-1)|1))))


#define hashpointer(t,p)        hashmod(t, IntPoint(p))

These hashes resolve down to 2 paths hashpow2 and hashmod. LUA_TNUMBER use hashnum > hashmod and LUA_TSHRSTR use hashstr > hashpow2

Glaciate answered 26/12, 2018 at 19:4 Comment(2)
Thank you for your elaborate answer! I have seen the part in the document you referred to. The things that I am wondering about are, why is table access using integers in the hash part so much slower than everything else? Why is table access using strings in the hash part faster than using integers in the hash part and even than using integers in the array part? Why does the access get even faster the longer the strings are? Shouldn't hashing take considerably more time than pure array indexing? And be more effort the longer the strings are?Craigie
@Craigie The time needed for hashing isn't taken into account in your micro-benchmarks at all. Even your "long" strings are less than 40 bytes (LUAI_MAXSHORTLEN), so Lua treats them as short strings. Short strings are hashed when they are created, which happens at compile/load time in your case, because all your keys are constants in your code. Even real long strings (> 40 bytes) would be hashed once at first index operation, which is before your measurement loop, but at least you should see some delay for the actual string comparison.Coney

© 2022 - 2024 — McMap. All rights reserved.