Why Javascript ===/== string equality sometimes has constant time complexity and sometimes has linear time complexity?
Asked Answered
B

3

10

After I found that the common/latest Javascript implementations are using String Interning for perfomance boost (Do common JavaScript implementations use string interning?), I thought === for strings would get the constant O(1) time. So I gave a wrong answer to this question:

JavaScript string equality performance comparison

Since according to the OP of that question it is O(N), doubling the string input doubles the time the equality needs. He didn't provide any jsPerf so more investigation is needed,

So my scenario using string interning would be:

var str1 = "stringwithmillionchars"; //stored in address 51242

var str2 = "stringwithmillionchars"; //stored in address 12313

The "stringwithmillionchars" would be stored once let's say in address 201012 of memory and both str1 and str2 would be "pointing" to this address 201012. This address could then be determined with some kind of hashing to map to specific locations in memory.

So when doing

"stringwithmillionchars" === "stringwithmillionchars"

would look like

getContentOfAddress(51242)===getContentOfAddress(12313)

or 201012 === 201012

which would take O(1)/constant time

JSPerfs/Performance updates:

JSPerf seems to show constant time even if the string is 16 times longer?? Please have a look:

http://jsperf.com/eqaulity-is-constant-time

Probably the strings are too small on the above: This probably show linear time (thanks to sergioFC) the strings are built with a loop. I tried without functions - still linear time / I changed it a bit http://jsfiddle.net/f8yf3c7d/3/ .

According to https://www.dropbox.com/s/8ty3hev1b109qjj/compare.html?dl=0 (12MB file that sergioFC made) when you have a string and you already have assigned the value in quotes no matter how big the t1 and t2 are (e.g 5930496 chars), it is taking it 0-1ms/instant time.

It seems that when you build a string using a for loop or a function then the string is not interned. So interning happens only when you directly assign a string with quotes like var str = "test";

Basso answered 24/10, 2014 at 14:14 Comment(15)
I think its because === operator compares memory addresses only when comparing objects (similar to Java). But "something" its not an object, its type is the builtin string. The same as comparing numbers, var a=2; var b=2;, if you do a===b you are not comparing objects nor memory adresses.Sillsby
Thanks! Could you think of a reason why strings are not objects internally so equality would be blazing fast?Basso
Some languages (such as Java) provides both posibilities: primitive types (int) and wrapper objects (Integer); I don't know why javascript doesn't have something similar.Sillsby
I know you can do var str = new String("test"); but I don't know the implications there either..Basso
Even doing so typeof str would be 'string', not object.Sillsby
So you are saying that my strings are very small?Basso
I have removed the fiddle in order to not crush more browsers. I think they are too small. Important: i have just testd that a comparaison of two equals strings with 5930496chars constructed using var s1='...';var s2='...'; tooks 0ms, while the comparaison of the same string constructed char by char tooks 20ms.Sillsby
Interesting! why is that? Do you have a link?Basso
Don't know, I'm not familiar with interning. The strings are so long that the file is 12MB. I'm gonna upload it to Dropbox and update this comment with the link.Sillsby
I cannot even see what you are doing in the function - is it a single line? I can't see the console.logs or the end of the function :PBasso
In the last lines of the file, after the vars declaration there is a comparation and the button that calls the function.Sillsby
Can you change the console.logs a bit? I get 0 - true 5930496 - 5930496 and I don't see any msecs or what is the size for the first comparison - I assume you are doing 3 vs 5930496 char comparisons?Basso
0ms. True=> strings are equal. The other log is the lenght of s1 and s2 which is not really useful..Sillsby
I got it is only one.. it seems the jsFiddle overhead is because of calling funtions and passing string / copying around.. I will update it ..Basso
I was trying through ChromeBug - I opened it in Sublime.. this is interesting - so having it as a constant it is 0ms otherwise it takes considerable amount of time for ===Basso
B
12

Based on all the Performance Tests (see original post) for strings a and b the operation a === b takes:

  • constant time O(1) if the strings are interned. From the examples it seems that interning only happens with directly assigned strings like var str = "test"; and not if you build it with concatenation using for-loops or functions.

  • linear time O(N) since in all the other cases the length of the two strings is compared first. If it is equal then we have character by character comparison. Else of course they are not equal. N is the length of the string.

Basso answered 24/10, 2014 at 19:19 Comment(0)
A
3

According to the ECMAScript 5.1 Specification's Strict Equal Comparison algorithm, even if the type of Objects being compared is String, all the characters are checked to see if they are equal.

  1. If Type(x) is String, then return true if x and y are exactly the same sequence of characters (same length and same characters in corresponding positions); otherwise, return false.

Interning is strictly an implementation thingy, to boost performance. The language standard doesn't impose any rules in that regard. So, its up to the implementers of the specification to intern strings or not.

Atmosphere answered 24/10, 2014 at 14:28 Comment(13)
Thanks. Could you think of a reason why this is a better choice than handling strings as objects two? Is there such an overhead?Basso
I fail to see how this forces the engine to actually make the full string comparison. If two variables point to the same string kept somewhere in memory, then they will satisfy this requirement.Thoroughbred
@MichailMichailidis Interning is strictly an implementation thingy, to boost performance. The language standard doesn't impose any rules in that regard.Atmosphere
Well we are talking about the latest/common implementations- Is it actually being used? Can someone more familiar with jsperf make one please? Case1: "test"==="test", Case2: "testtest"==="testtest" (double the length) Case3: strings are different but same length and Case 4: different and same length.Basso
@MichailMichailidis Why don't you do one yourself? It's not hard an it's your question.Thoroughbred
@MichailMichailidis Simply with jsperf we cannot conclude that they use interning or not. We need to understand the implementation.Atmosphere
Downvoter, I would love to know what is wrong with this answer.Atmosphere
@Atmosphere That was me and I retracted it. I still think your reasoning is wrong but your answer definitely don't deserve a downvote. I am sorry for this, this was a mistake quickly corrected. What's wrong - see my comment above.Thoroughbred
@Thoroughbred Looks like I answered you as well, in the comment next to that.Atmosphere
@Atmosphere No, sorry. I did not refer to interning at all. Your point is that "the implementations still have to check all the characters" and I don't agree that it follows from the section you quoted. You can achieve this "same length/same chars" without full comparison.Thoroughbred
@Thoroughbred You are correct. I changed it a little bit now so that the second paragraph and the first one will not contradict each other.Atmosphere
@Atmosphere Looks better, thank you for taking my point into consideration.Thoroughbred
@Thoroughbred I made one - I removed the function wrappersBasso
T
1

First of all, it would be nice to see a JSPerf test which demonstrates the claim that doubling the string size doubles the execution time.

Next, let's take that as granted. Here's my (unproven, unchecked, and probably unrelated to reality) theory.

Compairing two memory addresses is fast, no matter how much data is references. But you have to INTERN this strings first. If you have in your code

var a = "1234";
var b = "1234";

Then the engine first has to understand that these two strings are the same and can point to the same address. So at least once these strings has to be compared fully. So basically here are the following options:

  • The engine compares and interns strings directly when parsing the code. In this case equals strings should get the same address.
  • The engine may say "these strings are two big, I don't want to intern them" and has two copies.
  • The engine may intern these strings later.

In the two latter cases string comparison will influence the test results. In the last case - even if the strings are finally interned.

But as I wrote, a wild theory, for theory's sage. I'd first like to see some JSPerf.

Thoroughbred answered 24/10, 2014 at 14:34 Comment(1)
I agree with you - The original poster of the other question said it and I took it for granted - I will soon prepare a JSPerf to elaborate more on the numbers if he doesn't.Basso

© 2022 - 2024 — McMap. All rights reserved.