Is a JavaScript array index a string or an integer?
F

5

21

I had a generic question about JavaScript arrays. Are array indices in JavaScript internally handled as strings?

I read somewhere that because arrays are objects in JavaScript, the index is actually a string. I am a bit confused about this, and would be glad for any explanation.

Forgiving answered 18/12, 2014 at 1:13 Comment(0)
T
6

That is correct:

> var a = ['a','b','c']
undefined
> a
[ 'a', 'b', 'c' ]
> a[0]
'a'
> a['0']
'a'
> a['4'] = 'e'
'e'
> a[3] = 'd'
'd'
> a
[ 'a', 'b', 'c', 'd', 'e' ]
Tandem answered 18/12, 2014 at 1:21 Comment(3)
for (var i in a) console.log(typeof i) shows 'string' for all indexes.Farthing
Yes, but [ 'a', 'b', 'c' ].map((_, i) => typeof i) returns [ 'number', 'number', 'number' ].Decorum
@Decorum Array.prototype.map must be passing its own arguments to the callback. According to MDN: The for...in statement iterates over all enumerable properties of an object that are keyed by strings (ignoring ones keyed by Symbols), including inherited enumerable properties.Overweary
S
21

Formally, all property names are strings. That means that array-like numeric property names really aren't any different from any other property names.

If you check step 6 in the relevant part of the spec, you'll see that the value of a property accessor expression is always converted to a string before the property is looked up. That process is followed (formally) regardless of whether the object is an array instance or another sort of object. (The runtime only has to behave as if that is what's happening.)

Now, internally, the JavaScript runtime is free to implement array functionality any way it wants.

Edit: I had the idea of playing with Number.prototype.toString to demonstrate that a number-to-string conversion happens, but it turns out that the spec explicitly describes that specific type conversion as taking place via an internal process, and not by an implicit cast followed by a call to .toString() (which is probably a good thing for performance reasons).
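A minimal sketch of what this looks like from the outside, assuming a modern JavaScript engine such as Node.js (the variable names are illustrative, not from the spec). It shows that a number and its canonical string form name the same property, and that wrapping Number.prototype.toString does not observe the key conversion:

```javascript
// Illustrative names only; nothing here comes from the spec text itself.
const letters = ['a', 'b', 'c'];

// A number and its canonical string form address the same property.
const sameSlot = letters[1] === letters['1'];

// '01' is a different string from '1', so it names a different (missing) property.
const nonCanonical = letters['01'];

// Enumerating the keys shows them as strings.
const keyTypes = Object.keys(letters).map((key) => typeof key);

// Wrap Number.prototype.toString to see whether property access calls it.
const originalToString = Number.prototype.toString;
let toStringCalls = 0;
Number.prototype.toString = function (...args) {
  toStringCalls += 1;
  return originalToString.apply(this, args);
};
const viaNumber = letters[2]; // the key conversion happens inside the engine
Number.prototype.toString = originalToString; // always restore the built-in

console.log(sameSlot, nonCanonical, keyTypes, viaNumber, toStringCalls);
```

The counter stays at zero because the conversion is an internal operation and not a call to the user-visible method.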

Shelba answered 18/12, 2014 at 1:15 Comment(15)
Curiosity killed the cat: Could you provide some reference to that please? I recall that positive integers below 2^32 were integers, everything else a string hashlookup (just talking about array though)..Sycophancy
Yeah, seen that, that was fastSycophancy
@Sycophancy I look through the spec a lot so it's in my browser's URL completion list right at the top :)Shelba
@Shelba In that case, an integer index should be handled as a string, as it is a property of the array, which is a special type of JS object.Forgiving
@Forgiving right - numeric values used as property references via the [ ] operator are converted to strings, or at least the spec says that the conversion step has to happen. You gave me an idea, so I'll extend the answer.Shelba
Hmm, I read your link. Being baffled, I remember A property name P (in the form of a String value) is an array index if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2^32-1. from the spec 15.4.. Also, #12766922 Did this change?Sycophancy
@Sycophancy I think the deal is that those are the only ones that count as "real" indexes - in other words, the only ones that affect the value of .length.Shelba
@Forgiving well what I wanted to do was play with Number.toString, but it turns out that in the special case of number to string conversion, there's an internal process and no call to the actual .toString function is made. That's something I didn't know.Shelba
'real' indexes are what are 'real' array's (to javascript). Everything else are thus properties added to the object (that links to the array). At least, that was my understanding.Sycophancy
@Sycophancy right -- if you think about it, really the only special thing about arrays in JavaScript are the somewhat magical things that happen to the .length property.Shelba
Yes, that is my point. That 'magic' is what makes the difference between an object and an array (in JavaScript terms). In other words, when .length or new Array(4294967300) stops working. However nitpicky, there is a difference, or what am I missing? Also, what would array.push return after we overflow 32bit unsigned?Sycophancy
@GitaarLAB: Try it (warning: it's not pretty): var arr = []; arr[4294967296] = 42; arr.push(43); console.dir(arr);Chunky
@Felix: I've got too much in open memory right now, I genuinely fear this would crash at least my browser.. I'd be curious though what it does on your computer :). Also, the question asks about array indices, whilst this answer talks about object properties. Under the same spec linked to in 15.4 the spec clearly states: A property name P (in the form of a String value) is an **array index** if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2^32−1. Otherwise we are talking about object properties.Sycophancy
@GitaarLAB: Now. After arr[4294967294] = 42;, arr.length correctly shows 4294967295. However, calling arr.push(21); throws a RangeError: Invalid array length. arr[arr.length] = 21 works, but doesn't change length.Chunky
@FelixKling: +1. Thank you for testing (and confirming my beliefs)! Meanwhile I've been typing my own answer, hopefully with the rest of the answers future visitors get a complete picture.Sycophancy
S
7

Yes, technically array indexes are strings, but as Flanagan elegantly put it in his 'Definitive guide': "It is helpful to clearly distinguish an array index from an object property name. All indexes are property names, but only property names that are integers between 0 and 2^32-1 are indexes."

Usually you should not care what the browser (or, more generally, the 'script host') does internally as long as the outcome conforms to a predictable and (usually/hopefully) specified result. In fact, JavaScript (or rather ECMA-262, the ECMAScript specification) is only described in terms of the conceptual steps that are needed. That (intentionally) leaves room for script hosts (and browsers) to come up with clever, smaller and faster ways to implement that specified behavior.

In fact, modern browsers use a number of different algorithms for different types of arrays internally: it matters what they contain, how big they are, whether they are in order, whether they are fixed and optimizable at (JIT) compile time, and whether they are sparse or dense (yes, it often pays to do new Array(length_val) instead of ninja []).

When learning JavaScript, it might help to think of arrays as just a special kind of object. But they do not always behave the way one might expect, for example:

var a = [];
a['4294967294'] = "I'm not the only one.."; // 2^32-2 is the largest valid array index: a.length becomes 4294967295
a['4294967295'] = "Yes you are..";          // 2^32-1 is not an array index: plain property, length unchanged
alert(a.length);  // 4294967295

although it is easy and pretty transparent to the uninformed programmer to have an array (with indexes) and attach properties to the array object.

The best answer (I think) is from the specification (15.4) itself:

Array Objects

Array objects give special treatment to a certain class of property names. A property name P (in the form of a String value) is an array index if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2^32−1. A property whose property name is an array index is also called an element. Every Array object has a length property whose value is always a nonnegative integer less than 2^32. The value of the length property is numerically greater than the name of every property whose name is an array index; whenever a property of an Array object is created or changed, other properties are adjusted as necessary to maintain this invariant. Specifically, whenever a property is added whose name is an array index, the length property is changed, if necessary, to be one more than the numeric value of that array index; and whenever the length property is changed, every property whose name is an array index whose value is not smaller than the new length is automatically deleted. This constraint applies only to own properties of an Array object and is unaffected by length or array index properties that may be inherited from its prototypes.

An object, O, is said to be sparse if the following algorithm returns true:

  1. Let len be the result of calling the [[Get]] internal method of O with argument "length".

  2. For each integer i in the range 0≤i<ToUint32(len)

    a. Let elem be the result of calling the [[GetOwnProperty]] internal method of O with argument ToString(i).

    b. If elem is undefined, return true.

  3. Return false.
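The sparseness algorithm quoted above can be sketched in plain JavaScript. This is an approximation under stated assumptions: isSparse is a hypothetical helper name, `>>> 0` stands in for ToUint32, and hasOwnProperty stands in for the internal [[GetOwnProperty]] check.

```javascript
// Hypothetical helper mirroring the quoted steps; not an API of the language.
function isSparse(obj) {
  // Steps 1 and 2: read "length" and clamp it the way ToUint32 would.
  const len = obj.length >>> 0;
  for (let i = 0; i < len; i++) {
    // Step 2a/2b: a missing own property named ToString(i) means a hole.
    if (!Object.prototype.hasOwnProperty.call(obj, String(i))) {
      return true;
    }
  }
  // Step 3: every index below length is present.
  return false;
}

const dense = [1, 2, 3];
const withHole = [1, , 3];       // the middle element was never created
const explicitUndefined = [undefined]; // the property exists, its value is undefined

console.log(isSparse(dense), isSparse(withHole), isSparse(explicitUndefined));
```

Note that the loop visits every index below length, so this sketch is only practical for small arrays.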

Effectively, the ECMA-262 spec just guarantees the JavaScript programmer unambiguous array references, regardless of whether you get/set arr['42'] or arr[42], for indexes up to 2^32-2.
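The "array index" rule from the spec text can be written down directly. The following is a sketch, not engine code: isArrayIndex and toUint32 are hypothetical helper names, and `>>> 0` is used as the usual JavaScript idiom for ToUint32.

```javascript
// Hypothetical helpers; the spec defines the rule, not these functions.
function toUint32(value) {
  return value >>> 0; // unsigned right shift by zero performs ToUint32
}

function isArrayIndex(key) {
  if (typeof key !== 'string') {
    return false; // the rule is stated for property names in String form
  }
  const n = toUint32(key);
  // ToString(ToUint32(P)) must equal P, and ToUint32(P) must not be 2^32-1.
  return String(n) === key && n !== 4294967295;
}

const samples = ['0', '42', '4294967294', '4294967295', '01', '-1', '1.5', 'prop'];
const results = samples.map((key) => [key, isArrayIndex(key)]);
console.log(results);
```

Only the first three samples pass: '4294967295' is excluded explicitly, and the others do not survive the round trip through ToUint32 and back to a string.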

The main difference is, for example, the (auto-updating) array.length, array.push and other array sugar like array.concat. While, yes, JavaScript also lets one loop over the properties one has set on an object, a plain object has no length that tells us how many we have set (without counting them). And yes, to the best of my knowledge, modern browsers (especially Chrome, with what they call, but don't exactly specify, 'small integers') are wicked fast with true (pre-initialized) small-int arrays.

Also see for example this related question.

Edit: as per @Felix Kling's test (from his comment above):

After arr[4294967294] = 42;, arr.length correctly shows 4294967295. However, calling arr.push(21); throws a RangeError: Invalid array length. arr[arr.length] = 21 works, but doesn't change length.

The explanation for this (predictable and intended) behavior should be clear after this answer.
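For anyone who wants to reproduce that test without a browser console, here is a small sketch, assuming a current engine such as Node.js (the variable names are illustrative). The array stays sparse, so no large allocation should be needed:

```javascript
const big = [];
big[4294967294] = 42;            // 2^32-2, the largest valid array index
const lengthAtMax = big.length;  // one more than the largest index

let pushError = null;
try {
  big.push(21);                  // would need length 2^32, which is not allowed
} catch (err) {
  pushError = err;
}

big[big.length] = 21;            // key "4294967295" is a plain property, not an index
const lengthAfter = big.length;  // unchanged by the assignment above

console.log(lengthAtMax, pushError && pushError.name, lengthAfter);
```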

Edit2:

Now, someone gave the comment:

for (var i in a) console.log(typeof i) shows 'string' for all indexes.

Since for in is the (unordered I must add) property iterator in JavaScript, it is kind of obvious it returns a string (I'd be pretty darned if it didn't).

From MDN:

for..in should not be used to iterate over an Array where index order is important.

Array indexes are just enumerable properties with integer names and are otherwise identical to general Object properties. There is no guarantee that for...in will return the indexes in any particular order and it will return all enumerable properties, including those with non–integer names and those that are inherited.

Because the order of iteration is implementation dependent, iterating over an array may not visit elements in a consistent order. Therefore it is better to use a for loop with a numeric index (or Array.forEach or the for...of loop) when iterating over arrays where the order of access is important.

So.. what have we learned? If order is important to us (often is with arrays), then we need this quirky array in JavaScript, and having a 'length' is rather useful for looping in numerical order.

Now think of the alternative: Give your objects an id/order, but then you'd need to loop over your objects for every next id/order (property) once again...
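To make the comparison concrete, here is a short sketch (illustrative names, assuming a modern engine) that contrasts for...in with the index-based iteration helpers. The property iterator hands out string keys and includes non-index properties, while forEach and keys() pass numbers and cover only indexes below length:

```javascript
const items = ['a', 'b', 'c'];
items.note = 'extra';            // an ordinary property, not an element

// for...in: every enumerable property name, always as a string.
const forInKeys = [];
for (const key in items) {
  forInKeys.push(key);
}
const forInTypes = forInKeys.map((key) => typeof key);

// forEach: only indexes below length, passed to the callback as numbers.
const forEachTypes = [];
items.forEach((value, index) => {
  forEachTypes.push(typeof index);
});

// keys(): an iterator over numeric indexes.
const iteratorKeys = [...items.keys()];

console.log(forInKeys, forInTypes, forEachTypes, iteratorKeys);
```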

Edit 3:

Someone answered along the lines of:

var a = ['a','b','c'];
a['4'] = 'e';
a[3] = 'd';
alert(a); // returns a,b,c,d,e

Now, using the explanation in my answer: what happened is that '4' is coercible to the integer 4, which is in the range [0, 4294967294], making it a valid array index (its property is also called an element). Since var a is an array ([]), element 4 gets added as an array element, not as an ordinary property (which is what would have happened if var a were an object ({})).

An example to further outline the difference between array and object:

var a = ['a','b','c'];
a['prop']='d';
alert(a);

see how it returns a,b,c with no 'd' to be seen.
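The two examples above can be combined into one runnable sketch that uses console.log instead of alert (the names list and bag are illustrative). It shows that index-like keys update length on an array, that a non-index key does not, and that the same key on a plain object has no such effect:

```javascript
const list = ['a', 'b', 'c'];
list['4'] = 'e';                 // '4' is an array index: length grows to 5
list[3] = 'd';                   // fills the gap at index 3
list['prop'] = 'x';              // not an array index: length is untouched

const listLength = list.length;
const listJoined = list.join(',');   // join only walks indexes below length

const bag = {};
bag['4'] = 'e';                  // on a plain object this is just a property
const bagLength = bag.length;    // plain objects have no automatic length
const bagKeys = Object.keys(bag);

console.log(listLength, listJoined, list.prop, bagLength, bagKeys);
```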

Edit 4:

You commented: "In that case, an integer index should be handled as a string, as it is a property of the array, which is a special type of JavaScript object." That is wrong in terms of terminology because (strings representing) integer indexes (in the range [0, 4294967294]) create array indexes or elements, not properties.

It's better to say: both an actual integer and a string representing an integer (both in the range [0, 4294967294]) are valid array indexes (and should conceptually be regarded as integers) and create/change array elements (the only 'things'/values that get returned when you do arr.join() or arr.concat(), for example).

Everything else creates/changes a property (and should conceptually be regarded as a string). What the browser really does usually shouldn't interest you, noting that the simpler and more clearly specified your code is, the better the chance the browser has to recognize: 'oh, let's optimize this to an actual array under the hood'.

Sycophancy answered 18/12, 2014 at 2:19 Comment(1)
No, and I'm not the only one who says so. From Dr. Axel Rauschmayer's blog: "array indices in JavaScript are actually strings. Naturally, engines perform optimizations under the hood so that, internally, that is not true. But it is how the spec defines them" and "Pretend array indices are numbers. That's what usually happens under the hood and the general direction in which ECMAScript is moving." Effectively the ECMAScript 262 spec just ensures to the user unambiguous array-references regardless of getting/setting '9' or 9 up to 32bit UnsignedSycophancy
E
3

Let's see:

[1]["0"] === 1 // true

Oh, but that's not conclusive, since the runtime could be coercing "0" to +"0" and +"0" === 0.

[1][false] === undefined // true

Now, +false === 0, so no, the runtime isn't coercing the value to a number.

var arr = [];
arr.false = "foobar";
arr[false] === "foobar" // true

So actually, the runtime is coercing the value to a string. So yep, it's a hash table lookup (externally).

Esoterica answered 18/12, 2014 at 1:24 Comment(3)
This is completely new to me. I used to think a JS array index was like indices of arrays in other languages.Forgiving
Keep in mind that internally the runtime is likely to represent the array as a traditional array to boost the performance. But to the user, an array is just an object.Esoterica
"[...] as a traditional array" — except that you could have a very large number of gaps or very large indices, so a map is probably going to be working better in all cases (and also the index can be a negative decimal number like "-3.3"). But testing with those, it looks like Firefox doesn't show elements other than integer keyed elements as part of the array. Interesting...Acuna
K
2

In JavaScript there are two types of arrays: standard arrays and associative arrays (or objects with properties)

  • [ ] - standard array - 0 based integer indexes only
  • { } - associative array - JavaScript objects where keys can be any strings

So ...

var arr = [ 0, 1, 2, 3 ];

... is defined as a standard array where indexes can only be integers. When you do arr["something"], since "something" (which is what you use as the index) is not an integer, you are basically defining a property on the arr object (arrays are themselves objects in JavaScript). But you are not adding an element to the standard array.
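A quick way to see that distinction, sketched with illustrative names and standard built-ins only: the extra property is stored on the array object, but length and JSON serialization ignore it, while a string that looks like an integer still reaches the element.

```javascript
const arr = [0, 1, 2, 3];
arr['something'] = 'extra';      // a property on the array object, not an element

const lengthAfter = arr.length;            // still counts only the elements
const serialized = JSON.stringify(arr);    // serializes elements, skips the property
const ownKeys = Object.keys(arr);          // lists both kinds of key, as strings
const viaString = arr['2'];                // an integer-like string reaches the element

console.log(lengthAfter, serialized, ownKeys, viaString, Array.isArray(arr));
```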

Kentledge answered 18/12, 2014 at 1:17 Comment(3)
JavaScript objects behave in many ways like "associative arrays", but they're really not the same thing and the spec never uses that terminology.Shelba
I just adjusted the use of that terminology.Kentledge
It's probably more accurate to portray Arrays as a type of Object rather than the other way around.Farthing

© 2022 - 2024 — McMap. All rights reserved.