How are Lua tables handled in memory?
Asked Answered
B

2

8

How does lua handle a table's growth?

Is it equivalent to the ArrayList in Java? I.e. one that needs continuous memory space, and as it grows bigger than the already allocated space, the internal array is copied to another memory space.

Is there a clever way to led with that?

My question is, how is a table stored in the memory? I'm not asking how to implement arrays in Lua.

Bondon answered 28/4, 2015 at 19:37 Comment(3)
We implement arrays in Lua simply by indexing tables with integers. Therefore, arrays do not have a fixed size, but grow as we need. lua.org/pil/11.1.htmlMorelock
@RobertHarvey as I have edited my question, my doubt is how Lua handles this is "memory vision".Bondon
"array" is an "under the hood" term. A Lua table is a set of key-value pairs where neither the key nor the value can be nil. If a table has positive integer keys that are contiguous and start at 1, the table is said to "have a sequence." The length operator # only applies to sequences (or "empty sequences") and ipairs ends before the first missing positive integer key. The implementation possibilities follow from that.Partlet
S
10

(Assuming you're referring to recent versions of Lua; describing the behavior of 5.3 which should be (nearly?) the same for 5.0-5.2.)

Under the hood, a table contains an array and a hash part. Both (independently) grow and shrink in power-of-two steps, and both may be absent if they aren't needed.

Most key-value pairs will be stored in the hash part. However, all positive integer keys (starting from 1) are candidates for storing in the array part. The array part stores only the values and doesn't store the keys (because they are equivalent to an element's position in the array). Up to half of the allocated space is allowed to be empty (i.e. contain nils – either as gaps or as trailing free slots). (Array candidates that would leave too many empty slots will instead be put into the hash part. If the array part is full but there's leftover space in the hash part, any entries will just go to the hash part.)

For both array and hash part, insertions can trigger a resize, either up to the next larger power of two or down to any smaller power of two if sufficiently many entries have been removed previously. (Actually triggering a down-resize is non-trivial: rehash is the only place where a table is resized (and both parts are resized at the same time), and it is only called from luaH_newkey if there wasn't enough space in either of the two parts1.)

For more information, you can look at chapter 4 of The Implementation of Lua 5.0, or inspect the source: Basically everything of relevance happens in ltable.c, interesting starting points for reading are rehash (in ltable.c) (the resizing function), and the main interpreter loop luaV_execute (in lvm.c) or more specifically luaV_settable (also there) (what happens when storing a key-value pair in a table).


1As an example, in order to shrink a table that contained a large array part and no hash, you'd have to clear all array entries and then add an entry to the hash part (i.e. using a non-integer key, the value may be anything including nil), to end up with a table that contains no array part and a 1-element hash part.
If both parts contained entries, you'd have to first clear the hash part, then add enough entries to the array part to fill both array and hash combined (to trigger a resize which will leave you with a table with a large array part and no hash), and subsequently clear the array as above.2 (First clearing the array and then the hash won't work because after clearing both parts you'll have no array and a huge hash part, and you cannot trigger a resize because any entries will just go to the hash.)

2Actually, it's much easier to just throw away the table and make a new one. To ensure that a table will be shrunk, you'd need to know the actual allocated capacity (which is not the current number of entries, and which Lua won't tell you, at least not directly), then get all the steps and all the sizes just right – mix up the order of the steps or fail to trigger the resize and you'll end up with a huge table that may even perform slower if you're using it as an array… (Array candidates stored in the hash also store their keys, for half the amount of useful data in e.g. a cache line.)

Sori answered 28/4, 2015 at 21:21 Comment(4)
I believe only insertions can trigger a resize. This is why it is allowed to clear existing fields during an iteration using next(), but it is undefined if you add fields.Novah
First, thanks for your help, but sorry, i'm dummy in C, I have read that file Itable.c but i'm didn't understand some things: what happens if a key hash colide? if the hash part grows (or shrink) is it copied to another part of memory (to preserve continuous allocation)? I can't find a detailed explaination about thatBondon
@JohnnyWiller: The hashing is (briefly) explained in chapter 4 of the implementation paper (see page 8, last paragraph of that chapter) and more information can be found in the paper referenced there. As for resizing: Did you read the implementation paper? It's described there (page 7, 2nd paragraph) – yes copied, yes contiguous.Sori
@Novah Right, completely skipped thinking on that part – thanks for noticing! Should be fixed now, and after more reading & testing I'm pretty sure that what I'm describing now is really what happens.Sori
S
1

Since Lua 5.0, tables are an hybrid of hash table and array. From The Implementation of Lua 5.0:

New algorithm for optimizing tables used as arrays:

Unlike other scripting languages, Lua does not offer an array type. Instead, Lua programmers use regular tables with integer indices to implement arrays. Lua 5.0 uses a new algorithm that detects whether tables are being used as arrays and automatically stores the values associated to numeric indices in an actual array, instead of adding them to the hash table. This algorithm is discussed in Section 4.

Prior versions had only the hash table.

Septuplet answered 28/4, 2015 at 19:53 Comment(3)
@JohnnyWiller What about it? The GC scans both the array part and the hash part of the table.Gerkman
@ColonelThirtyTwo yes, I know that. I was asking about if table hash part grows, is it copied to another part of the memory, to preserve continuous allocation (like java do)?Bondon
@JohnnyWiller After computing the new sizes, Lua creates the new parts and re-inserts the elements from the old parts into the new ones So I guess they remain contiguous after growth. Read Section 4 of the file I linked, they explain it better than I can :-)Septuplet

© 2022 - 2024 — McMap. All rights reserved.