Fundamentally, because the specification says so:
string value
primitive value that is a finite ordered sequence of zero or more 16-bit unsigned integer values
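As a small illustration of that definition (a sketch; the variable names are made up for this example), a string's length counts 16-bit code units rather than characters, so a code point outside the Basic Multilingual Plane occupies two of them:

```javascript
const plain = "hey";
const emoji = "\u{1F600}"; // a single code point above U+FFFF

// Collect the 16-bit unsigned integer values that make up each string
const plainUnits = Array.from({ length: plain.length }, (_, i) => plain.charCodeAt(i));
const emojiUnits = Array.from({ length: emoji.length }, (_, i) => emoji.charCodeAt(i));

console.log(plain.length, plainUnits); // 3, with units 104, 101, 121
console.log(emoji.length, emojiUnits); // 2, with units 55357 (0xD83D), 56832 (0xDE00)
```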
The specification also defines that there are String objects, as distinct from primitive strings. (Similarly, there are primitive number, boolean, and symbol types, and Number, Boolean, and Symbol objects.)
Primitive strings follow all the rules of other primitives. At a language level, they're treated exactly the way primitive numbers and booleans are. For all intents and purposes, they are primitive values. But as you say, it would be insane for a = b to literally make a copy of the string in b and put that copy in a. Implementations don't have to do that because primitive string values are immutable (just like primitive number values). You can't change any characters in a string; you can only create a new string. If strings were mutable, the implementation would have to make a copy when you did a = b (but if they were mutable the spec would be written differently).
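To make the immutability point concrete, here's a minimal sketch (tryToMutate is a hypothetical helper written just for this example). It shows that "changing" a string produces a new one, and that writing to an index never alters the original:

```javascript
const b = "hello";
let a = b; // no observable copy is needed: the value can never change

// "Changing" a string always builds a new string; `b` is unaffected
a = a.toUpperCase();
console.log(a); // "HELLO"
console.log(b); // "hello"

// Writing to an index doesn't alter the string. In loose mode the
// assignment is silently ignored; in strict mode it throws a TypeError.
function tryToMutate(str) {
    "use strict";
    try {
        str[0] = "J";
        return "no error";
    } catch (e) {
        return e instanceof TypeError ? "TypeError" : "other error";
    }
}
const outcome = tryToMutate(b);
console.log(outcome); // "TypeError"
console.log(b);       // "hello"
```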
Note that primitive strings and String objects really are different things:
const s = "hey";
const o = new String("hey");
// Here, the string that `s` refers to is temporarily
// converted to a String object so we can perform an
// object operation on it (setting a property). This is
// loose mode; in strict mode the assignment throws a TypeError.
s.foo = "bar";
// But that temporary object is never stored anywhere,
// `s` still just contains the primitive, so getting
// the property won't find it:
console.log(s.foo); // undefined
// `o` is a String object, which means it can have properties
o.foo = "bar";
console.log(o.foo); // "bar"
So why have primitive strings? You'd have to ask Brendan Eich (and he's reasonably responsive on Twitter), but I suspect it was so that the definition of the equality operators (==, ===, !=, and !==) didn't have to either be something that could be overloaded by an object type for its own purposes, or special-cased for strings.
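Whatever the original motivation was, the practical effect on those operators is easy to see. In this sketch (variable names are just for illustration), primitives compare by value while String objects compare by identity like any other object:

```javascript
const p1 = "hey";
const p2 = "he" + "y";
const o1 = new String("hey");
const o2 = new String("hey");

console.log(p1 === p2); // true: primitive strings compare by value
console.log(o1 === o2); // false: two distinct objects
console.log(o1 == o2);  // false: still two distinct objects
console.log(p1 == o1);  // true: loose equality converts the object to a primitive
console.log(p1 === o1); // false: different types (string vs. object)
```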
So why have String objects? Having String objects (and Number objects, and Boolean objects, and Symbol objects), along with rules saying when a temporary object version of a primitive is created, makes it possible to define methods on primitives. When you do:
console.log("example".toUpperCase());
in specification terms, a String object is created (by the GetValue operation) and then the property toUpperCase is looked up on that object and (in the above) called. Primitive strings therefore get their toUpperCase (and other standard methods) from String.prototype and Object.prototype. But the temporary object that gets created is not accessible to code except in some edge cases,¹ and JavaScript engines can avoid literally creating the object outside of those edge cases. The advantage to that is that new methods can be added to String.prototype and used on primitive strings.
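You can observe where those methods come from without adding anything to the prototype. A quick sketch (the variable name is arbitrary):

```javascript
const text = "example";

// Property lookups on the primitive resolve through String.prototype...
const viaStringProto = text.toUpperCase === String.prototype.toUpperCase;
// ...and, one step further up the chain, through Object.prototype
const viaObjectProto = text.hasOwnProperty === Object.prototype.hasOwnProperty;

console.log(viaStringProto); // true
console.log(viaObjectProto); // true
console.log(Object.getPrototypeOf(text) === String.prototype);             // true
console.log(Object.getPrototypeOf(String.prototype) === Object.prototype); // true
```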
¹ "What edge cases?" I hear you ask. The most common one I can think of is when you've added your own method to String.prototype (or similar) in loose mode code:
Object.defineProperty(String.prototype, "example", {
    value() {
        console.log(`typeof this: ${typeof this}`);
        console.log(`this instanceof String: ${this instanceof String}`);
    },
    writable: true,
    configurable: true
});
"foo".example();
// typeof this: object
// this instanceof String: true
There, the JavaScript engine was forced to create the String object because this can't be a primitive in loose mode.
Strict mode makes it possible to avoid creating the object, because in strict mode this isn't required to be an object type, it can be a primitive (in this case, a primitive string):
"use strict";
Object.defineProperty(String.prototype, "example", {
    value() {
        console.log(`typeof this: ${typeof this}`);
        console.log(`this instanceof String: ${this instanceof String}`);
    },
    writable: true,
    configurable: true
});
"foo".example();
// typeof this: string
// this instanceof String: false