Lua's hybrid array and hash table; does it exist anywhere else?

Asked 24/1, 2010 at 2:21 Answered 27/7, 2016 at 4:11

Solved arrays data-structures hash lua lua-table

Lua's implementation of tables keep its elements in two parts: an array part and a hash part.

Does such a thing exist in any other languages?

Take a look at section 4, Tables, in The Implementation of Lua 5.0.

Estabrook answered 24/1, 2010 at 2:21 Comment(0)

This idea was original with Roberto Ierusalimschy and the rest of the Lua team. I heard Roberto give a talk about it at the MIT Lightweight Languages workshop in 2003, and in this talk he discussed prior work and argued convincingly that the idea was new. I don't know if other languages have copied it since.

The original Awk has a somewhat more restricted language model than Lua; either a number or a string can be used as a key in an array, but arrays themselves are not first-class values: an array must have a name, and an array cannot be used as a key in the array.

Regarding the implementation, I have checked the sources for the original Awk as maintained by Brian Kernighan, and the implementation of Awk uses a hash table, not Lua's hybrid array/table structure. The distinction is important because in Lua, when a table is used with consecutive integer keys, the space overhead is the same as for a C array. This is not true for original Awk.

I have not bothered to investigate all later implementations of awk, e.g., Gnu Awk, mawk, and so on.

Eurypterid answered 26/1, 2010 at 1:58 Comment(0)

EDIT: This doesn't answer the question, which was about the implementation.

AWK also did it.

It's interesing how some languages conflate operations that are different in others:

List indexing - a[10]
Associative indexing - a['foo']
Object field access - a.foo
Function/Method calls - a('foo') / a.foo()

Very incomplete examples:

Perl is the rare language where sequential/associative indexing have separate syntax - a[10] / a{'foo'}. AFAIK, object fields map to one of the other operations, depending on which the implementor of the class felt like using.
In Python, all 4 are distinct; sequential/associative indexing use same syntax but separate data types are optimized for them.
In Ruby, object fields are methods with no arguments - a.foo.
In JavaScript, object fields a.foo are syntax sugar for associative indexing a['foo'].
In Lua and AWK, associative arrays are also used for sequential indexing - a[10].
In Arc, sequential and associative indexing looks like function calls - (a 10) / (a "foo"), and I think a.foo is syntax sugar for this too(?).

Emulsifier answered 24/1, 2010 at 2:21 Comment(3)

Fortress and Clojure also treat maps as functions of their keys and arrays as functions of their indices. Which, after all, they are. – Amylolysis 25/1, 2010 at 4:30

The question is about the implementation, not the language model. The original awk, still maintained by Brian Kernighan, uses a hash table. – Eurypterid 26/1, 2010 at 1:53

You are correct, I completely missed the mark! Can't downvote myself, so +1 to your answer. – Emulsifier 27/1, 2010 at 17:23

The closest thing I can think of is Javascript - you create an array with new Array(), and then proceed to index either by number or by string value. It could well be for performance reasons some Javascript implementations choose to do so using two arrays, for the reasons noted in the Lua documentation you linked to.

Fincher answered 24/1, 2010 at 2:33 Comment(0)

ArrayWithHash is a fast implementation of array-hashtable hybrid in C++.

Since C++ is a statically typed language, only integer keys are allowed in ArrayWithHash (no way to insert string or pointer key). In other words, it is something like an array with hash table backup for large indices. Also it uses different hash table implementation which is less memory-efficient than Lua table implementation.

Scrofulous answered 27/7, 2016 at 4:11 Comment(0)

Recommended topics

Hot tags