Are Javascript arrays sparse?

That is, if I use the current time as an index into the array:

array[new Date().getTime()] = value;

will the interpreter instantiate all the elements from 0 to now? Do different browsers do it differently?

I remember there used to be a bug in the AIX kernel, which would create pseudo-ttys on request, but if you did, say, "echo > /dev/pty10000000000" it would create /dev/pty0, /dev/pty1, .... and then fall over dead. It was fun at trade shows, but I don't want this to happen to my customers.

Persia asked 2/10, 2009 at 17:9 Comment(2)
A possible downside to doing this is the difficulty of debugging in Firebug. A log statement on the array will only list the first 1000 elements, which will all be "undefined". Also, array.length will tell you your array has n elements in it, even though n-1 are just "ghost" undefined values.Librarianship
Debugging is now OK in Chrome -- here's an example of console output: [empty × 9564, Object, empty × 105, Object, empty × 10, Object, empty × 12, Object, empty × 9, Object, empty × 21, Object, empty × 9, Object]Improvised

How exactly JavaScript arrays are implemented differs from browser to browser, but they generally fall back to a sparse implementation - most likely the same one used for property access of regular objects - if using an actual array would be inefficient.

You'll have to ask someone with more knowledge about specific implementations to answer what exactly triggers the shift from dense to sparse, but your example should be perfectly safe. If you want to get a dense array, you should call the constructor with an explicit length argument and hope you'll actually get one.

See this answer for a more detailed description by olliej.
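
For instance, setting a single large index shouldn't force the engine to allocate everything below it; the length jumps, but only one element is actually stored. A minimal sketch (exact behaviour is engine-specific; note that a timestamp like Date.now() exceeds the maximum array index of 2^32 - 2, so it would be stored as an ordinary property and wouldn't even affect length):

var a = [];
a[1000000] = "value";                 // one assignment at a large index

console.log(a.length);                // 1000001 -- one more than the index
console.log(Object.keys(a).length);   // 1 -- only one element is actually stored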

Monadelphous answered 2/10, 2009 at 17:21 Comment(1)
I don't think you actually get a dense array if you say something like foo = new Array(10000). However, this is supposed to work: foo = Array.apply(null, {length: 10});.Danelaw

Yes, they are. They are actually hash tables internally, so you can use not only large integers but also strings, floats, or other objects. All keys get converted to strings via toString() before being added to the hash. You can confirm this with some test code:

<script>
  var array = [];
  array[0] = "zero";
  array[new Date().getTime()] = "now";
  array[3.14] = "pi";

  for (var i in array) {
      alert("array["+i+"] = " + array[i] + ", typeof("+i+") == " + typeof(i));
  }
</script>

Displays:

array[0] = zero, typeof(0) == string
array[1254503972355] = now, typeof(1254503972355) == string
array[3.14] = pi, typeof(3.14) == string

Notice how I used for...in syntax, which only gives you the indices that are actually defined. If you use the more common for (var i = 0; i < array.length; ++i) style of iteration then you will obviously have problems with non-standard array indices.
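
If you'd rather not rely on for...in (which, as the comments below note, also picks up anything added to Array.prototype), Object.keys() returns only the array's own defined keys; a minimal sketch:

var array = [];
array[0] = "zero";
array[new Date().getTime()] = "now";

// Object.keys() returns the array's own enumerable keys as strings,
// skipping holes and inherited properties.
Object.keys(array).forEach(function (key) {
    console.log("array[" + key + "] = " + array[key]);
});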

Payne answered 2/10, 2009 at 17:17 Comment(11)
Are arrays just regular JS objects that store index/values via the normal "properties" mechanism? That's what your code seems to suggest. I would guess then that a JS array just has some additional prototype methods, as icing on the object cake?Grindelia
most JS implementations store numerically indexed properties in an actual array if possible; that's behind-the-scenes magic, though: from a language standpoint, arrays are regular objects with a magic length propertyMonadelphous
There is also magic in the for...in syntax which hides things like length so you only iterate over the array indices and not an array's properties or methods. One common problem people have is adding things to Object.prototype or Array.prototype which then breaks all for...in loops as these added properties/methods are not hidden.Payne
@John: length is only invisible in for..in loops because it has the DontEnum flag set; in ES5, the property attribute is called enumerable and can be explicitly set via Object.defineProperty()Monadelphous
Your example simply proves that JavaScript arrays are not sparse.Giff
All object keys in JavaScript are always String; anything else you put in the subscript gets toString()-ed. Combine this with the integer imprecision of large Number and it means if you set a[9999999999999999]=1, a[10000000000000000] will be 1 (and many more surprising behaviours). Using non-integers as keys is very unwise, and arbitrary objects are right out.Power
Then shalt thou only use Strings as object keys, no more, no less. String shall be the type thou shalt use, and the type of the key shall be String. Integer shalt thou not use, neither use thou non-integers, excepting that thou then proceed to cast to String. Arbitrary objects are right out.Rina
Array indexes must be integers. array[3.14] = pi works because Array inherits from Object. Example: var x=[];x[.1] = 5; Then x still has a length of 0.Petty
Wow. var y = {}; var x = []; x[y] = 5; Then x["[object Object]"] is 5.Petty
@Mike: Wow? JS objects are maps with string keys. JS calls toString() on anything you put between the brackets, that's totally expected. Try arr[0] = "hello"; arr["0"] = "bye"; console.log(arr[0]) and you'll get "bye".Sexcentenary
This is woefully outdated wrt. actual implementations. A lot of optimization can be done (and is allowed) within the ECMAScript specification - the case shown simply proves that such implementations need to fall back to support such use-cases. (Then I suppose there is the whole additional-question of if an array is sparse if it contains a not-set index..)Closefitting

You could avoid the issue by using the JavaScript construct designed for this sort of thing: a plain object. Treat it as a dictionary, and the "for ... in ..." syntax will still let you grab all the keys.

var sparse = {}; // not []
sparse["whatever"] = "something";
Schiedam answered 2/10, 2009 at 17:29 Comment(0)

JavaScript objects are sparse, and arrays are just specialized objects with an auto-maintained length property (which is actually one larger than the largest index, not the number of defined elements) and some additional methods. You are safe either way; use an array if you need its extra features, and an object otherwise.
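
A quick illustration of that length behaviour (just a sketch):

var a = [];
a[9] = "x";

console.log(a.length);               // 10 -- one larger than the largest index
console.log(Object.keys(a).length);  // 1  -- only one element is defined
console.log(5 in a);                 // false -- intermediate indices are holes, not stored values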

Puff answered 2/10, 2009 at 17:27 Comment(1)
that's from a language standpoint; implementations actually use real arrays to store dense numeric propertiesMonadelphous

The answer, as is usually true with JavaScript, is "it's a bit weirder...."

Memory usage is not defined and any implementation is allowed to be stupid. In theory, const a = []; a[1000000]=0; could burn megabytes of memory, as could const a = [];. In practice, even Microsoft avoids those implementations.

As Justin Love points out, the length attribute is one more than the highest index set. BUT it's only updated if the index is a non-negative integer.

So, the array is sparse. BUT spreading (as in Math.max(...a)), "for ... of", and plain indexed loops walk the entire range of integer indices from 0 to length - 1, yielding 'undefined' for the holes (reduce() is an exception: it skips holes, though it still visits indices explicitly set to undefined). BUT 'for ... in' loops do as you might expect, visiting only the defined keys.

Here's an example using Node.js:

"use strict";
const print = console.log;

let a = [0, 10];
// a[2] and a[3] skipped
a[4] = 40;
a[5] = undefined;  // which counts towards setting the length
a[31.4] = 'ten pi';  // doesn't count towards setting the length
a['pi'] = 3.14;
print(`a.length= :${a.length}:, a = :${a}:`);
print(`Math.max(...a) = :${Math.max(...a)}: because of 'undefined values'`);
for (let v of a) print(`v of a; v=:${v}:`);
for (let i in a) print(`i in a; i=:${i}: a[i]=${a[i]}`);

giving:

a.length= :6:, a = :0,10,,,40,:
Math.max(...a) = :NaN: because of 'undefined values'
v of a; v=:0:
v of a; v=:10:
v of a; v=:undefined:
v of a; v=:undefined:
v of a; v=:40:
v of a; v=:undefined:
i in a; i=:0: a[i]=0
i in a; i=:1: a[i]=10
i in a; i=:4: a[i]=40
i in a; i=:5: a[i]=undefined
i in a; i=:31.4: a[i]=ten pi
i in a; i=:pi: a[i]=3.14

But. There are more corner cases with Arrays not yet mentioned.

Flu answered 29/5, 2019 at 1:52 Comment(0)

Sparseness (or denseness) can be confirmed empirically for Node.js with the non-standard process.memoryUsage().

Sometimes node is clever enough to keep the array sparse:

Welcome to Node.js v12.15.0.
Type ".help" for more information.
> console.log(`The script is using approximately ${Math.round(process.memoryUsage().heapUsed / 1024 / 1024 * 100) / 100} MB`)
The script is using approximately 3.07 MB
undefined
> array = []
[]
> array[2**24] = 2**24
16777216
> array
[ <16777216 empty items>, 16777216 ]
> console.log(`The script is using approximately ${Math.round(process.memoryUsage().heapUsed / 1024 / 1024 * 100) / 100} MB`)
The script is using approximately 2.8 MB
undefined

Sometimes node chooses to make it dense (this behavior might well be optimized in future):

> otherArray = Array(2**24)
[ <16777216 empty items> ]
> console.log(`The script is using approximately ${Math.round(process.memoryUsage().heapUsed / 1024 / 1024 * 100) / 100} MB`)
The script is using approximately 130.57 MB
undefined

Then sparse again:

> yetAnotherArray = Array(2**32-1)
[ <4294967295 empty items> ]
> console.log(`The script is using approximately ${Math.round(process.memoryUsage().heapUsed / 1024 / 1024 * 100) / 100} MB`)
The script is using approximately 130.68 MB
undefined

So to get a feel for the original AIX kernel bug, a dense array perhaps needs to be forced with a range-like construction:

> denseArray = [...Array(2**24).keys()]
[
   0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11,
  12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,
  24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
  36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
  48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59,
  60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71,
  72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83,
  84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
  96, 97, 98, 99,
  ... 16777116 more items
]
> console.log(`The script is using approximately ${Math.round(process.memoryUsage().heapUsed / 1024 / 1024 * 100) / 100} MB`);
The script is using approximately 819.94 MB
undefined

Because why not make it fall over?

> tooDenseArray = [...Array(2**32-1).keys()]

<--- Last few GCs --->

[60109:0x1028ca000]   171407 ms: Scavenge 1072.7 (1090.0) -> 1056.7 (1090.0) MB, 0.2 / 0.0 ms  (average mu = 0.968, current mu = 0.832) allocation failure 
[60109:0x1028ca000]   171420 ms: Scavenge 1072.7 (1090.0) -> 1056.7 (1090.0) MB, 0.2 / 0.0 ms  (average mu = 0.968, current mu = 0.832) allocation failure 
[60109:0x1028ca000]   171434 ms: Scavenge 1072.7 (1090.0) -> 1056.7 (1090.0) MB, 0.2 / 0.0 ms  (average mu = 0.968, current mu = 0.832) allocation failure 


<--- JS stacktrace --->

==== JS stack trace =========================================

    0: ExitFrame [pc: 0x100931399]
    1: StubFrame [pc: 0x1008ee227]
    2: StubFrame [pc: 0x100996051]
Security context: 0x1043830808a1 <JSObject>
    3: /* anonymous */ [0x1043830b6919] [repl:1] [bytecode=0x1043830b6841 offset=28](this=0x104306fc2261 <JSGlobal Object>)
    4: InternalFrame [pc: 0x1008aefdd]
    5: EntryFrame [pc: 0x1008aedb8]
    6: builtin exit frame: runInThisContext(this=0x104387b8cac1 <ContextifyScript map = 0x1043...

FATAL ERROR: invalid array length Allocation failed - JavaScript heap out of memory

Writing Node.js report to file: report.20200220.220620.60109.0.001.json
Node.js report completed
 1: 0x10007f4b9 node::Abort() [/Users/pzrq/.nvm/versions/node/v12.15.0/bin/node]
 2: 0x10007f63d node::OnFatalError(char const*, char const*) [/Users/pzrq/.nvm/versions/node/v12.15.0/bin/node]
 3: 0x100176a27 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/Users/pzrq/.nvm/versions/node/v12.15.0/bin/node]
 4: 0x1001769c3 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/Users/pzrq/.nvm/versions/node/v12.15.0/bin/node]
 5: 0x1002fab75 v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/Users/pzrq/.nvm/versions/node/v12.15.0/bin/node]
 6: 0x1005f3e9b v8::internal::Runtime_FatalProcessOutOfMemoryInvalidArrayLength(int, unsigned long*, v8::internal::Isolate*) [/Users/pzrq/.nvm/versions/node/v12.15.0/bin/node]
 7: 0x100931399 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit [/Users/pzrq/.nvm/versions/node/v12.15.0/bin/node]
 8: 0x1008ee227 Builtins_IterableToList [/Users/pzrq/.nvm/versions/node/v12.15.0/bin/node]
Abort trap: 6
Hannie answered 20/2, 2020 at 11:16 Comment(1)
Nice, and I’m kind of amazed my ten year old question is still relevant!Persia

They can be, but they don't always have to be, and they can perform better when they're not.

Here's a discussion about how to test for index sparseness in an array instance: https://benmccormick.org/2018/06/19/code-golf-sparse-arrays/

The code golf (fewest characters) winner is:

let isSparse = a => !!a.reduce(x=>x-1,a.length)

Basically it walks the array's populated entries (reduce() skips holes), decrementing an accumulator that starts at the length, and returns the !!-coerced boolean of the resulting number: if the accumulator is decremented all the way to zero, every index is populated and the array is not sparse. Charles Merriam's caveats above should be considered as well; this code doesn't address them, but they apply to hashed string entries, which can happen when assigning elements with arr[var] = (something) where var wasn't an integer.
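
A quick usage sketch (illustrative only; it relies on reduce() skipping holes):

let isSparse = a => !!a.reduce(x => x - 1, a.length);

console.log(isSparse([1, 2, 3]));  // false -- every index populated
console.log(isSparse([1, , 3]));   // true  -- hole at index 1

let holey = [];
holey[100] = "x";
console.log(isSparse(holey));      // true  -- length 101, only one entry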

One reason to care about index sparseness is its effect on performance, which can differ between script engines; there's a great discussion about array creation/initialization here: What’s the difference between "Array()" and "[]" while declaring a JavaScript array?

A recent answer to that post has a link to this deep dive into how V8 tries to optimize arrays by tagging them to avoid (re-)testing for characteristics like sparseness: https://v8.dev/blog/elements-kinds. The blog post is from Sept '17 and the material is subject to some change, but the breakdown to implications for day-to-day development is useful and clear.

Ictus answered 20/12, 2019 at 15:54 Comment(0)
