I have run into what is, to me, some serious weirdness with string behavior in Firefox when using the .normalize()
Unicode normalization function.
Here is a demo; view the console in Firefox to see the problem.
Suppose I have a button with an id of "NFKC":
<button id="NFKC">NFKC</button>
Get a reference to that, easy enough:
document.querySelector('#NFKC')
// <button id="NFKC">
Now, this button has an id of NFKC, and we can get at that string as follows:
document.body.querySelector('#NFKC').id
// "NFKC"
Stick that string in a variable:
var s1 = document.body.querySelector('#NFKC').id
By way of comparison, assign the very same string to a variable directly:
var s2 = 'NFKC'
So of course:
s1 === s2
// true
And:
s1 == s2
// true
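To rule out an invisible difference (a stray zero-width character, say), the two strings can also be compared code unit by code unit. This is only a minimal sketch: codeUnits is a hypothetical helper, and the s1 here is a stand-in literal, since in the browser it would come from the DOM as shown above.

```javascript
// Hypothetical helper: list the UTF-16 code units of a string as hex.
function codeUnits(str) {
  var units = [];
  for (var i = 0; i < str.length; i++) {
    units.push(str.charCodeAt(i).toString(16).padStart(4, '0'));
  }
  return units.join(' ');
}

// Assumption: no DOM available here, so s1 is a stand-in for
// document.body.querySelector('#NFKC').id
var s1 = 'NFKC';
var s2 = 'NFKC';

console.log(codeUnits(s1));                   // 004e 0046 004b 0043
console.log(codeUnits(s1) === codeUnits(s2)); // true
```

If the two lists match, the strings really do hold the same characters, and any difference in behavior has to come from somewhere other than their contents.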
Now’s the part where my head explodes.
To normalize a string, you pass one of NFC, NFD, NFKC, or NFKD to .normalize(), like this:
'á'.normalize('NFKC')
// "á"
Of course, depending on the normalization form you choose, you get different code points:
'á'.normalize('NFC').length == 1
// true
'á'.normalize('NFD').length == 2
// true
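The length difference is easier to see when the code points themselves are listed. A small sketch, where codePoints is a hypothetical helper and the input is written as an escape so the result does not depend on how this page happens to encode 'á':

```javascript
// Hypothetical helper: list the code points of a string as U+XXXX.
function codePoints(str) {
  return Array.from(str, function (ch) {
    return 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
  }).join(' ');
}

var aAcute = '\u00E1'; // 'á' as a single precomposed code point

console.log(codePoints(aAcute.normalize('NFC')));  // U+00E1
console.log(codePoints(aAcute.normalize('NFD')));  // U+0061 U+0301
console.log(codePoints(aAcute.normalize('NFKC'))); // U+00E1
console.log(codePoints(aAcute.normalize('NFKD'))); // U+0061 U+0301
```

The composed forms keep the single precomposed letter; the decomposed forms split it into a plain 'a' followed by a combining acute accent.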
But whatever. The point is, pass one of the four strings corresponding to normalization forms to .normalize(), and you'll get a normalized string back.
Since we know that s1 (the string we retrieved from the DOM) and s2 are THE SAME STRING (s1 === s2 is true), then obviously we can use either to normalize a string:
'á'.normalize(s2)
// "á"
// well yeah, because s2 IS 'NFKC'.
Naturally, s1 will behave exactly the same way, right?
'á'.normalize(s1)
// RangeError: form must be one of 'NFC', 'NFD', 'NFKC', or 'NFKD'
Nope.
So the question is: why does it appear that s1 is not equal to s2 as far as .normalize() is concerned, when s1 === s2 is true?
This doesn’t happen in Chrome, the only other browser I’ve tested so far.
UPDATE
This was a bug in Firefox and has been fixed.
… s1 isn't === s2. I will make the question more obvious. – Bin
… 'á'.normalize(s1)), in FF36. – Floret
… String.prototype.normalize for Firefox. Looks like the problem is related to however formStr is set. – Floret
… var s2 = 'NFKC'.split('').join(''); and var s2 = 'NFKCabc'.replace('abc',''); . But this doesn't: var s2 = 'N'+'F'+'K'+'C'; . Weird. – Mcgill
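The string constructions mentioned in the comments can be collected into one probe that reports which of them .normalize() accepts as a form argument. This is only a sketch: probeForm and the candidates table are hypothetical names, the DOM case is left as a comment because it needs the demo page, and on an engine without the bug every row should report ok as true.

```javascript
// Hypothetical helper: try a form string and report success or the error.
function probeForm(form) {
  try {
    return { ok: true, result: '\u00E1'.normalize(form) };
  } catch (e) {
    return { ok: false, result: e.name + ': ' + e.message };
  }
}

// Different ways of building a string that === 'NFKC'.
var candidates = {
  literal: 'NFKC',
  concat: 'N' + 'F' + 'K' + 'C',
  splitJoin: 'NFKC'.split('').join(''),
  replace: 'NFKCabc'.replace('abc', '')
  // In the browser, also try: domId: document.body.querySelector('#NFKC').id
};

Object.keys(candidates).forEach(function (name) {
  var form = candidates[name];
  var outcome = probeForm(form);
  // Columns: how the string was built, whether it === 'NFKC', whether
  // .normalize() accepted it, and the normalized string or error text.
  console.log(name, form === 'NFKC', outcome.ok, outcome.result);
});
```

Any row where the second column is true but the third is false reproduces the problem: a string that compares equal to 'NFKC' but is rejected as a form.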