Why does parseInt('dsff66',16) return 13?

Today I stumbled on a strange (in my opinion) case in JavaScript. I passed a non-hexadecimal string to the parseInt function with a radix of 16 and... I got a result. I would expect the function to throw some kind of exception or at least return NaN, but it parsed the string successfully and returned an integer.

My call was:

var parsed = parseInt('dsff66', 16); // note the 's' in the first argument
document.write(parsed);

and the result was: 13.

I noticed that it "stops" parsing at the first character that doesn't belong to the numeral system specified in the 2nd argument, so calling parseInt('fg', 16) gives 15 as a result.

In my opinion, it should return NaN. Can anyone explain to me why it doesn't? Why would anyone want this function to behave like this (return an integer even if it isn't the precise representation of the string passed)?

Dinghy answered 24/9, 2014 at 12:50 Comment(16)
Note that if you do something like parseInt("2foo",10) you get 2. This is as expected. It parses up to the first non-numeric character. - Rainie
"In my opinion, it should return NaN. Can anyone explain to me why it doesn't?" Because the spec says it must not. And unfortunately, my opinion or yours do not matter to the spec. - Auscultation
"If parseInt encounters a character that is not a numeral in the specified radix, it ignores it and all succeeding characters and returns the integer value parsed up to that point." Source: developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/… - Rainie
NaN is reserved for returns where the number being parsed is not a number in any radix. - Chatterjee
@MattBurland yes, I know that. The question was: why :) - Dinghy
@FrédéricHamidi yeah, that's not really the answer for me. I realize that people writing the specification are much smarter than me, and that's why I want to know: why did they decide it should behave this way? - Dinghy
@MichalLeszczyk: And the answer is because that's what the standard says. - Rainie
@JGrice: No, NaN is if it can't be parsed in the radix specified. For example parseInt('9',8); is NaN. - Rainie
"why did they decide it should behave this way." Since we can't read minds, that makes this question opinion based. - Rainie
Read the spec at ecma-international.org/ecma-262/5.1/#sec-15.1.2.2 (it tells you step by step what happens). - Adieu
@Michal, I don't know of any public rationale regarding this choice. It may simply be "because strtol() behaves this way". - Auscultation
@FrédéricHamidi and that is something that makes sense for me. Since they wanted it to be based on a function that is a native one in C/C++ (since most compilers are written in these languages), they decided it should behave exactly the same. I did not think of that. Thank you :) I'd like to mark this as an answer, if it wasn't a comment. And since some here mentioned that this is opinion-based - is there a place (on any of StackExchange sites) where this question would "fit" better? - Dinghy
@Michal, I don't think it would fit better anywhere on Stack Exchange, since we cannot reliably answer inquiries about the motivations behind design decisions. It is indeed only a matter of opinion and speculation. - Auscultation
@FrédéricHamidi - because this use case (parsing stuff like 2$ or 200 cents) is exactly what parseInt was designed for. If you want numeric conversion use Number instead. - Marijo
@FrédéricHamidi "And unfortunately, my opinion or yours do not matter to the spec" Unless of course you happen to be on the committee that writes the spec. - Induct
"people writing the specification are much smarter than me" - There are plenty of terrible hiccups in specifications, especially the js one. No need to feel humble. - Armour
F
22

Why would anyone want this function to behave like this (return an integer even if it isn't the precise representation of the string passed)?

Because most of the time (by far) you're working with base 10 numbers, and in that case JS can just cast - not parse - the string to a number. (edit: Apparently not just base-10; see update below.)

Since JS is dynamically typed, some strings work just fine as numbers without any work on your part. For instance:

 "21" / 3;   // => 7
 "12.4" / 4; // => 3.1

No need for parseInt there, because "21" and "12.4" are essentially numbers already. If, however, the string was "12.4xyz", then you would indeed get NaN when dividing, since that is decidedly not a number and can't be implicitly cast or coerced to one.

You can also explicitly "cast" a string to number with Number(someString). While it too is mainly meant for base 10 (but see the update below), it will indeed return NaN for invalid strings.

So because JS already has implicit and explicit type casting/conversion/coercion, parseInt's role isn't to be yet another type casting function.

parseInt's role is instead to be, well, a parsing function. A function that tries its best to make sense of its input, returning what it can. It's for when you have a string you can't just cast because it's not quite perfectly numeric. (And, like JS's basic syntax, it's reminiscent of C, as apsillers' answer explained nicely.)

And since it's a parser, not a casting function, it's got the additional feature of being able to handle other bases than 10.
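
To make the cast-versus-parse distinction concrete, here is a small sketch. compareConversions is a made-up helper name, and the sample strings are chosen for illustration rather than taken from the question:

// Hypothetical helper: run the same string through both kinds of conversion.
function compareConversions(s) {
  return {
    cast: Number(s),           // the whole string must be numeric, else NaN
    parsed10: parseInt(s, 10), // longest leading run of base-10 digits
    parsed16: parseInt(s, 16)  // longest leading run of base-16 digits
  };
}

console.log(compareConversions("42px"));    // cast: NaN, parsed10: 42,  parsed16: 66
console.log(compareConversions("12.4xyz")); // cast: NaN, parsed10: 12,  parsed16: 18
console.log(compareConversions("ff"));      // cast: NaN, parsed10: NaN, parsed16: 255

Number() is all-or-nothing, while parseInt() keeps whatever valid prefix it finds and only gives NaN when there is no valid leading digit at all.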

Now, you might ask why there isn't a strict casting function that handles non-base-10 numbers, and would complain like you want, but... hey, there just isn't. JS's designers just decided that parseInt would suffice, because, again, 0x63 percent of the time, you're dealing with base 10.

Closest you can get to "casting" is probably something horribly hacky like:

var hexString = "dsff66";
var number = eval("0x" + hexString); // attempt to interpret as a hexadecimal literal

which'll throw a SyntaxError because 0xdsff66 isn't a valid hex literal.

Update: As Lekensteyn points out in the comments, JS appears to properly cast 0x-prefixed hexadecimal strings too. I didn't know this, but indeed this seems to work:

1 * "0xd0ff66"; // => 13696870
1 * "0xdsff66"; // => NaN

which makes it the simplest way to cast a hex string to a number - and get NaN if it can't be properly represented.

The same behavior applies to Number(), e.g. Number("0xd0ff66") returns an integer, and Number("0xdsff66") returns NaN.

(/update)

Alternatively, you can check the string beforehand and return NaN if needed:

function hexToNumber(string) {
  if( !/^(0x)?[0-9a-f]+$/i.test(string) ) return Number.NaN;
  return parseInt(string, 16);
}
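
The same pre-check idea can be extended to any radix. This is only a sketch: parseIntStrict is a hypothetical name, and unlike hexToNumber above it assumes no "0x" prefix is allowed while an optional leading sign is:

// Hypothetical helper: like parseInt(str, radix), but returns NaN unless the
// whole string (after an optional sign) consists of valid digits for that radix.
function parseIntStrict(str, radix) {
  if (typeof str !== "string" || !(radix >= 2 && radix <= 36)) return NaN;
  // The first `radix` characters of this string are the valid digits.
  var digits = "0123456789abcdefghijklmnopqrstuvwxyz".slice(0, radix);
  var pattern = new RegExp("^[+-]?[" + digits + "]+$", "i");
  return pattern.test(str) ? parseInt(str, radix) : NaN;
}

console.log(parseIntStrict("d0ff66", 16)); // 13696870
console.log(parseIntStrict("dsff66", 16)); // NaN
console.log(parseIntStrict("17days", 10)); // NaN
console.log(parseIntStrict("-101", 2));    // -5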
Fencing answered 24/9, 2014 at 16:34 Comment(9)
Ew, eval. By "casting" the string to a number, you can have a similar effect. 1 * "0xdsff66" gives NaN in at least Firefox. Checking whether this conforms to the spec is an exercise for the reader. - Selftaught
@Selftaught Yeah, ew. I said it was horrible :) But interesting that 0x-prefixed strings are cast too! Didn't know that - works fine here in Chrome too. - Fencing
Well, that is a nice explanation :) Thank you both. - Dinghy
I'm confused by your example "21" => 7. You've mentioned a divide by 3, why the divide/where did it come from? - Stoner
@Stoner It's just an example to illustrate a string being cast to a number. Even though "21" is a string, it's also treated as a number when needed - such as when you try to divide it or multiply it. I could also have used "6" * 3; // => 18 as an example, or "9" / 2; // => 4.5. - Fencing
This is so roflol: window.alert("6"+"3") vs window.alert("6"*"3"). What did the creators of JS "programming language" smoke? Must be good stuff... - Gypsy
@Flambino: Sorry I gotcha - I thought you meant that "21" converted to a number comes out as 7 for some reason! Maybe I need more caffeine? - Stoner
@Alexander: Yeah, JavaScript might be confusing sometimes :D - Dinghy
@Gypsy You can always head over to wtfjs.com if you want a heavy dose of JS/DOM weirdness. There is unfortunately some weird stuff in JS. Thankfully, there's also some really, really neat stuff. - Fencing
T
43

parseInt reads input until it encounters an invalid character, and then uses whatever valid input it read prior to that invalid character. Consider:

parseInt("17days", 10);

This will use the input 17 and omit everything after the invalid d.

From the ECMAScript specification:

If [input string] S contains any character that is not a radix-R digit, then let Z [the string to be integer-ified] be the substring of S consisting of all characters before the first such character; otherwise, let Z be S.

In your example, s is an invalid base-16 character, so parseInt uses only the leading d.
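
The quoted step can be sketched roughly as follows. validPrefix is a hypothetical helper that covers only this one step; the sign, whitespace and "0x" handling that the specification performs earlier are left out:

// Sketch of the quoted step only: find Z, the longest prefix of S that
// consists of radix-R digits.
function validPrefix(S, R) {
  var lower = "0123456789abcdefghijklmnopqrstuvwxyz".slice(0, R);
  var digits = lower + lower.toUpperCase(); // digits are case-insensitive
  var i = 0;
  while (i < S.length && digits.indexOf(S.charAt(i)) !== -1) {
    i++;
  }
  return S.slice(0, i);
}

console.log(validPrefix("dsff66", 16)); // "d"
console.log(validPrefix("17days", 10)); // "17"
console.log(validPrefix("fg", 16));     // "f"

Only Z is then converted to a number, so parseInt("dsff66", 16) is effectively parseInt("d", 16), which is 13. When Z is empty, the result is NaN.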

As for why this behavior was included: there's no way to know for sure, but this is quite likely an attempt to reproduce the behavior of strtol (string to long) from C's standard library. From the strtol(3) man page:

...the string is converted to a long int value in the obvious manner, stopping at the first character which is not a valid digit in the given base.

This connection is further supported (to some degree) by the fact that both parseInt and strtol are specified to ignore leading whitespace, and they can both accept a leading 0x for hexadecimal values.

Triumph answered 24/9, 2014 at 12:59 Comment(0)
C
13

In this particular case, parseInt() interprets the letters "A" to "F" as hexadecimal digits and converts them to their numeric values. That means d gives 13.

What parseInt() does

  • parseInt("string", radix) interprets the digits and letters in the string as a number written in the given radix (hexadecimal when the radix is 16).

  • parseInt() only parses from the beginning of the string up to the first character that is not a valid digit in that radix.

  • If parseInt() can't find any valid digit at the beginning of the string, it returns NaN.

  • If the radix is not defined, the radix is 10, unless the string begins with "0x".

  • If the string begins with "0x" and the radix is undefined, 0 or 16, the string is parsed as hexadecimal and the prefix is skipped.

  • If the radix is 0, it is treated as not defined, so the radix is 10.

  • If the radix is 1, parseInt() returns NaN.

  • If the radix is 2, parseInt() only parses "0" and "1".

  • If the radix is 3, parseInt() only parses "0", "1" and "2". And so on.

  • Leading zeros do not change the result: "0" returns 0 and "01" returns 1.

  • If the radix is 11, parseInt() only parses the digits "0" to "9" and the letter "A".

  • If the radix is 12, parseInt() only parses the digits "0" to "9" and the letters "A" and "B", and so on.

  • The maximum radix is 36, which allows the digits "0" to "9" and the letters "A" to "Z".

  • If more than one character is parsed, each character's contribution depends on its position, even when the characters are identical. E.g. in parseInt("AA", 11) the first "A" contributes 10 × 11 = 110 and the second "A" contributes 10.

  • Different radices return different numbers for the same string.
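
The positional-value point in the list above can be checked by hand. A minimal sketch, assuming the input contains only valid digits for the radix; digitValue and positionalValue are hypothetical helper names:

// Hypothetical helper: numeric value (0-35) of a single digit character.
function digitValue(ch) {
  return "0123456789abcdefghijklmnopqrstuvwxyz".indexOf(ch.toLowerCase());
}

// Evaluate a string of valid digits by hand: each step shifts the running
// value one position to the left (times radix) and adds the next digit.
function positionalValue(str, radix) {
  var value = 0;
  for (var i = 0; i < str.length; i++) {
    value = value * radix + digitValue(str.charAt(i));
  }
  return value;
}

console.log(positionalValue("AA", 11));  // 10*11 + 10 = 120
console.log(positionalValue("AAA", 12)); // 10*144 + 10*12 + 10 = 1570
console.log(positionalValue("d", 16));   // 13

For strings made only of valid digits, this gives the same result as parseInt(str, radix).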

See it in action

document.body.innerHTML = "<b>What parseInt() does</b><br>" + 
                          "parseInt('9') = " + parseInt('9') + "<br>" +
                          "parseInt('0129ABZ', 0) = " + parseInt('0129ABZ', 0) + "<br>" +
                          "parseInt('0', 1) = " + parseInt('0', 1) + "<br>" +
                          "parseInt('0', 2) = " + parseInt('0', 2) + "<br>" +
                          "parseInt('10', 2) = " + parseInt('10', 2) + "<br>" +
                          "parseInt('01', 2) = " + parseInt('01', 2) + "<br>" +
                          "parseInt('1', 2) = " + parseInt('1', 2) + "<br>" +
                          "parseInt('A', 10) = " + parseInt('A', 10) + "<br>" +
                          "parseInt('A', 11) = " + parseInt('A', 11) + "<br>" +
                          "parseInt('Z', 36) = " + parseInt('Z', 36) + "<br><br>" +
                          "<b>The value:</b><br>" +
                          "parseInt('A', 11) = " + parseInt('A', 11) + "<br>" +
                          "parseInt('A', 12) = " + parseInt('A', 12) + "<br>" +
                          "parseInt('A', 13) = " + parseInt('A', 13) + "<br>" +
                          "parseInt('AA', 11) = " + parseInt('AA', 11) + " = 100 + 20" + "<br>" +
                          "parseInt('AA', 12) = " + parseInt('AA', 12) + " = 100 + 30" + "<br>" +
                          "parseInt('AA', 13) = " + parseInt('AA', 13) + " = 100 + 40" + "<br>" +
                          "parseInt('AAA', 11) = " + parseInt('AAA', 11) + " = 1000 + 300 + 30" + "<br>" +
                          "parseInt('AAA', 12) = " + parseInt('AAA', 12) + " = 1000 + 500 + 70" + "<br>" +
                          "parseInt('AAA', 13) = " + parseInt('AAA', 13) + " = 1000 + 700 + 130" + "<br>" +
                          "parseInt('AAA', 14) = " + parseInt('AAA', 14) + " = 1000 + 900 + 210" + "<br>" +
                          "parseInt('AAA', 15) = " + parseInt('AAA', 15) + " = 1000 + 1100 + 310";
Chiro answered 24/9, 2014 at 13:25 Comment(1)
"If the radix does not defined, the radix is 10." - even though it's generally true, older browsers may default to a radix of 8 for strings with a leading "0". See parseInt at MDN. - Dinghy
R
6

For radices above 10, the letters of the alphabet indicate numerals greater than 9. For example, for hexadecimal numbers (base 16), A through F are used.

In your string dsff66, d is a valid hexadecimal digit (even though the string as a whole is not valid hex) and is equivalent to the number 13. Parsing stops after it because the next character, s, is not a hexadecimal digit, hence the result.

Redman answered 24/9, 2014 at 13:15 Comment(0)
