Why does parseInt('dsff66',16) return 13?

27

Today I stumbled on a strange (in my opinion) case in JavaScript. I passed a non-hexadecimal string to the parseInt function with a base of 16 and... I got a result. I expected the function to throw some kind of exception or at least return NaN, but it parsed the string successfully and returned an int.

My call was:

var parsed = parseInt('dsff66', 16); // note the 's' in the first argument
document.write(parsed);

and the result was: 13.

I noticed that it "stops" parsing with the first character that doesn't belong to the numeral system specified in the 2nd argument, so calling parseInt('fg',16) I would get 15 as a result.

In my opinion, it should return NaN. Can anyone explain to me why it doesn't? Why would anyone want this function to behave like this (return an integer even if it isn't the precise representation of the string passed)?

Dinghy answered 24/9, 2014 at 12:50 Comment(16)

Note that if you do something like parseInt("2foo",10) you get 2. This is as expected. It parses up to the first non-numeric character – Rainie 24/9, 2014 at 12:52

"In my opinion, it should return NaN. Can anyone explain to me why it doesn't?" Because the spec says it must not. And unfortunately, my opinion or yours do not matter to the spec. – Auscultation 24/9, 2014 at 12:52

If parseInt encounters a character that is not a numeral in the specified radix, it ignores it and all succeeding characters and returns the integer value parsed up to that point.

source: developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/… – Rainie 24/9, 2014 at 12:53

NaN is reserved for returns where the number being parsed is not a number in any radix. – Chatterjee 24/9, 2014 at 12:54

@MattBurland yes, I know that. The question was: why :) – Dinghy 24/9, 2014 at 12:55

@FrédéricHamidi yeah, that's not really the answer for me. I realize that the people writing the specification are much smarter than me, and that's why I want to know: why did they decide it should behave this way? – Dinghy 24/9, 2014 at 12:55

@MichalLeszczyk: And the answer is because that's what the standard says. – Rainie 24/9, 2014 at 12:55

@JGrice: No, NaN is if it can't be parsed in the radix specified. For example parseInt('9',8); is NaN. – Rainie 24/9, 2014 at 12:56

"why did they decide it should behave this way." Since we can't read minds, that makes this question opinion-based. – Rainie 24/9, 2014 at 12:56

Read the spec at ecma-international.org/ecma-262/5.1/#sec-15.1.2.2. It tells you step by step what happens. – Adieu 24/9, 2014 at 12:58

@Michal, I don't know of any public rationale regarding this choice. It may simply be "because strtol() behaves this way". – Auscultation 24/9, 2014 at 12:59

@FrédéricHamidi and that is something that makes sense for me. Since they wanted it to be based on a function that is a native one in C/C++ (since most compilers are written in these languages), they decided it should behave exactly the same. I did not think of that. Thank you :) I'd like to mark this as an answer, if it wasn't a comment. And since some here mentioned that this is opinion-based - is there a place (on any of StackExchange sites) where this question would "fit" better? – Dinghy 24/9, 2014 at 13:11

@Michal, I don't think it would fit better anywhere on Stack Exchange, since we cannot reliably answer inquiries about the motivations behind design decisions. It is indeed only a matter of opinion and speculation. – Auscultation 24/9, 2014 at 13:14

@FrédéricHamidi - because this use case (parsing stuff like 2$ or 200 cents) is exactly what parseInt was designed for. If you want numeric conversion use Number instead. – Marijo 24/9, 2014 at 17:49

@FrédéricHamidi "And unfortunately, my opinion or yours do not matter to the spec." Unless of course you happen to be on the committee that writes the spec. – Induct 25/9, 2014 at 11:56

"people writing the specification are much smarter than me" - There are plenty of terrible hiccups in specifications, especially the js one. No need to feel humble – Armour 18/10, 2023 at 14:46

22

Why would anyone want this function to behave like this (return an integer even if it isn't the precise representation of the string passed)?

Because most of the time (by far) you're working with base 10 numbers, and in that case JS can just cast - not parse - the string to a number. (edit: Apparently not just base-10; see update below.)

Since JS is dynamically typed, some strings work just fine as numbers without any work on your part. For instance:

 "21" / 3;   // => 7
 "12.4" / 4; // => 3.1

No need for parseInt there, because "21" and "12.4" are essentially numbers already. If, however, the string was "12.4xyz", then you would indeed get NaN when dividing, since that is decidedly not a number and can't be implicitly cast or coerced to one.
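To make the contrast concrete, here is a small sketch of implicit coercion next to prefix parsing. The variable names are only for illustration, and the parseFloat line is an extra I've added for comparison:

```javascript
// Implicit coercion converts the whole string, or fails with NaN.
const coercedClean = "12.4" / 4;     // 3.1
const coercedDirty = "12.4xyz" / 4;  // NaN

// parseInt and parseFloat read a leading numeric prefix and stop
// at the first character they cannot use.
const parsedInt = parseInt("12.4xyz", 10);  // 12 (stops at the ".")
const parsedFloat = parseFloat("12.4xyz");  // 12.4 (stops at the "x")

console.log(coercedClean, coercedDirty, parsedInt, parsedFloat);
```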

You can also explicitly "cast" a string to number with Number(someString). ~~While it too only supports base 10,~~ it will indeed return NaN for invalid strings.
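A quick sketch of Number() next to parseInt on the same inputs (the sample strings and variable names are mine). One caveat worth knowing if you rely on Number() for validation: an empty string converts to 0 rather than NaN.

```javascript
// Number() converts the whole string or gives NaN.
const strictGood = Number("21");        // 21
const strictBad = Number("17days");     // NaN

// parseInt keeps the valid leading part.
const lenient = parseInt("17days", 10); // 17

// The two also disagree on the empty string.
const emptyStrict = Number("");         // 0
const emptyLenient = parseInt("", 10);  // NaN

console.log(strictGood, strictBad, lenient, emptyStrict, emptyLenient);
```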

So because JS already has implicit and explicit type casting/conversion/coercion, parseInt's role isn't to be yet another type casting function.

parseInt's role is instead to be, well, a parsing function. A function that tries its best to make sense of its input, returning what it can. It's for when you have a string you can't just cast because it's not quite perfectly numeric. (And, like JS's basic syntax, it's reminiscent of C, as apsillers' answer explained nicely.)

And since it's a parser, not a casting function, it's got the additional feature of being able to handle other bases than 10.

Now, you might ask why there isn't a strict casting function that handles non-base-10 numbers, and would complain like you want, but... hey, there just isn't. JS's designers just decided that parseInt would suffice, because, again, 0x63 percent of the time, you're dealing with base 10.

~~Closest you can get to "casting" is probably something horribly hacky like:~~

var hexString = "dsff66";
var number = eval("0x" + hexString); // attempt to interpret as a hexadecimal literal

which'll throw a SyntaxError because 0xdsff66 isn't a valid hex literal.

Update: As Lekensteyn points out in the comments, JS appears to properly cast 0x-prefixed hexadecimal strings too. I didn't know this, but indeed this seems to work:

1 * "0xd0ff66"; // => 13696870
1 * "0xdsff66"; // => NaN

which makes it the simplest way to cast a hex string to a number - and get NaN if it can't be properly represented.

The same behavior applies to Number(), e.g. Number("0xd0ff66") returns an integer, and Number("0xdsff66") returns NaN.

(/update)

Alternatively, you can check the string beforehand and return NaN if needed:

function hexToNumber(string) {
  if( !/^(0x)?[0-9a-f]+$/i.test(string) ) return Number.NaN;
  return parseInt(string, 16);
}
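For reference, this is how that helper behaves on a few inputs. The function is repeated so the snippet runs on its own; the sample strings and variable names are just illustrations I picked:

```javascript
// Same helper as above: validate first, then parse as base 16.
function hexToNumber(string) {
  if (!/^(0x)?[0-9a-f]+$/i.test(string)) return Number.NaN;
  return parseInt(string, 16);
}

const plainHex = hexToNumber("d0ff66");      // 13696870
const prefixedHex = hexToNumber("0xD0FF66"); // 13696870 (prefix and case are accepted)
const brokenHex = hexToNumber("dsff66");     // NaN (the "s" fails the check)
const emptyHex = hexToNumber("");            // NaN (at least one digit is required)

console.log(plainHex, prefixedHex, brokenHex, emptyHex);
```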

Fencing answered 24/9, 2014 at 16:34 Comment(9)

Ew, eval. By "casting" the string to a number, you can have a similar effect. 1 * "0xdsff66" gives NaN in at least Firefox. Checking whether this conforms to the spec is an exercise for the reader. – Selftaught 24/9, 2014 at 20:37

@Selftaught Yeah, ew. I said it was horrible :) But interesting that 0x-prefixed strings are cast too! Didn't know that - works fine here in Chrome too. – Fencing 24/9, 2014 at 21:3

Well, that is a nice explanation :) Thank you both. – Dinghy 25/9, 2014 at 6:13

I'm confused by your example "21" => 7. You've mentioned a divide by 3; why the divide, and where did it come from? – Stoner 25/9, 2014 at 7:54

@Stoner It's just an example to illustrate a string being cast to a number. Even though "21" is a string, it's also treated as number when needed - such as when you try to divide it or multiply it. I could also have used "6" * 3; // => 18 as an example, or "9" / 2; // => 4.5. – Fencing 25/9, 2014 at 8:12

This is so roflol: window.alert("6"+"3") vs window.alert("6"*"3"). What did the creators of JS "programming language" smoke? Must be good stuff... – Gypsy 25/9, 2014 at 9:15

@Flambino: Sorry I gotcha - I thought you meant that "21" converted to a number comes out as 7 for some reason! Maybe I need more caffeine? – Stoner 25/9, 2014 at 9:24

@Alexander: Yeah, JavaScript might be confusing sometimes :D – Dinghy 25/9, 2014 at 9:27

@Gypsy You can always head over to wtfjs.com if you want a heavy dose of JS/DOM weirdness. There is unfortunately some weird stuff in JS. Thankfully, there's also some really, really neat stuff. – Fencing 25/9, 2014 at 9:35

43

parseInt reads input until it encounters an invalid character, and then uses whatever valid input it read prior to that invalid character. Consider:

parseInt("17days", 10);

This will use the input 17 and omit everything after the invalid d.

From the ECMAScript specification:

If [input string] S contains any character that is not a radix-R digit, then let Z [the string to be integer-ified] be the substring of S consisting of all characters before the first such character; otherwise, let Z be S.

In your example, s is an invalid base-16 character, so parseInt uses only the leading d, which is 13 in decimal.
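The quoted step can be sketched roughly like this. It is a simplified illustration, not the full spec algorithm: parsePrefix is a hypothetical name, and the sketch assumes a radix between 2 and 36 and ignores signs, leading whitespace, and the 0x prefix.

```javascript
// Simplified sketch of the digit-scanning step only (hypothetical helper).
// Assumes 2 <= radix <= 36; no sign, whitespace, or "0x" handling.
function parsePrefix(input, radix) {
  // The characters that count as digits in this radix.
  const digits = "0123456789abcdefghijklmnopqrstuvwxyz".slice(0, radix);
  let value = 0;
  let consumed = 0;
  for (const ch of input.toLowerCase()) {
    const digit = digits.indexOf(ch);
    if (digit === -1) break;        // first non-radix character ends Z
    value = value * radix + digit;  // accumulate the digits read so far
    consumed += 1;
  }
  // An empty Z (no valid leading digit) yields NaN.
  return consumed === 0 ? NaN : value;
}

console.log(parsePrefix("dsff66", 16)); // 13
console.log(parsePrefix("17days", 10)); // 17
console.log(parsePrefix("9", 8));       // NaN
```

For these inputs the sketch agrees with the built-in parseInt, which is the point: NaN only shows up when not even the first character is a valid digit.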

As for why this behavior was included: there's no way to know for sure, but this is quite likely an attempt to reproduce the behavior of strtol (string to long) from C's standard library. From the strtol(3) man page:

...the string is converted to a long int value in the obvious manner, stopping at the first character which is not a valid digit in the given base.

This connection is further supported (to some degree) by the fact that both parseInt and strtol are specified to ignore leading whitespace, and they can both accept a leading 0x for hexadecimal values.
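A few calls that show the whitespace and 0x handling on the JavaScript side (the sample strings and variable names are my own):

```javascript
// Leading whitespace is skipped before parsing starts.
const withSpaces = parseInt("   42abc", 10); // 42

// A leading "0x" is accepted when the radix is 16...
const hexPrefixed = parseInt("0x1F", 16);    // 31

// ...and it also selects base 16 when no radix is given.
const hexAutoDetected = parseInt("0x1F");    // 31

// A bare prefix with no digits after it leaves nothing to parse.
const prefixOnly = parseInt("0x", 16);       // NaN

console.log(withSpaces, hexPrefixed, hexAutoDetected, prefixOnly);
```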

Triumph answered 24/9, 2014 at 12:59 Comment(0)
