Are strtol, strtod unsafe?
C

4

18

It seems that strtol() and strtod() effectively allow (and force) you to cast away constness in a string:

#include <stdlib.h>
#include <stdio.h>

int main() {
  const char *foo = "Hello, world!";
  char *bar;
  strtol(foo, &bar, 10); // or strtod(foo, &bar);
  printf("%d\n", foo == bar); // prints "1"! they're equal
  *bar = 'X'; // undefined behavior: writes to a string literal (typically a segmentation fault)
  return 0;
}

Above, I did not perform any casts myself. However, strtol() basically cast my const char * into a char * for me, without any warnings or anything. (In fact, it wouldn't allow you to type bar as a const char *, and so forces the unsafe change in type.) Isn't that really dangerous?

Crispen answered 14/6, 2009 at 20:36 Comment(0)
S
14

I would guess it's because the alternative was worse. Suppose the prototype were changed to add const:

long int strtol(const char *nptr, const char **endptr, int base);

Now, suppose we want to parse a non-constant string:

char str[] = "12345xyz";  // non-const
char *endptr;
long result = strtol(str, &endptr, 10);
*endptr = '_';
printf("%s\n", str);  // expected output: 12345_yz

But what happens when we try to compile this code? A compiler error! It's rather non-intuitive, but you can't implicitly convert a char ** to a const char **. See the C++ FAQ Lite for a detailed explanation of why. It's technically talking about C++ there, but the arguments are equally valid for C. In C/C++, you're only allowed to implicitly convert from "pointer to type" to "pointer to const type" at the highest level: the conversion you can perform is from char ** to char * const *, or equivalently from "pointer to (pointer to char)" to "pointer to (const pointer to char)".
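To make the rule concrete, here is a rough sketch (the names span_to and demo are made up for illustration). The commented-out lines show the loophole that would open if char ** converted implicitly to const char **; the compiled code uses the conversion that is allowed, char ** to char * const *.

```c
#include <assert.h>
#include <stddef.h>

/* Takes "pointer to const pointer to char": the function can read the
 * caller's pointer but cannot redirect it. A char ** converts to this
 * parameter type implicitly. */
static size_t span_to(const char *start, char *const *endp)
{
    assert(endp != NULL && *endp >= start);
    return (size_t)(*endp - start);
}

static size_t demo(void)
{
    char buf[] = "12345xyz";
    char *end = buf + 5;

    /* What the rule prevents (diagnosed by the compiler unless you cast):
     *   const char c = 'x';
     *   char *p;
     *   const char **pp = &p;   // char ** -> const char ** is not allowed
     *   *pp = &c;               // would make p point at a const object
     *   *p = 'y';               // ...and this would then write to it
     */
    return span_to(buf, &end);   /* char ** -> char *const * is fine */
}
```

Here demo() returns the number of characters between buf and end, which is 5.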

Since I would guess that parsing a non-constant string is far more likely than parsing a constant string, I would go on to postulate that const-incorrectness for the unlikely case is preferable to making the common case a compiler error.
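If you do need to parse a constant string, one way to contain the const-incorrectness is a small wrapper. This is only a sketch: strtol_const is a hypothetical name, not a standard function, and it assumes the caller only needs to read from the end position.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the standard library: calls strtol()
 * but exposes the end position only as a pointer to const char. */
static long strtol_const(const char *s, const char **end, int base)
{
    char *tmp;                  /* strtol() insists on a char ** here */
    assert(s != NULL);
    long value = strtol(s, &tmp, base);
    if (end != NULL)
        *end = tmp;             /* char * -> const char * is implicit and safe */
    return value;
}
```

With this, a caller holding a const char * gets a const char * back, so an accidental write through the end pointer is caught at compile time instead of crashing at run time.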

Stunning answered 14/6, 2009 at 21:2 Comment(2)
But C++ doesn't prevent you from overloading the function: you could have both long int strtol(char *nptr, char **endptr, int base); and long int strtol(const char *nptr, const char **endptr, int base);, which fixes your compile error. Indeed, the standard does this for other such functions, like strchr and strstr. - Calvo
You could refer to the C FAQ web site What's the difference between const char *p, char const *p, and char * const p? and, more particularly, Why can't I pass a char ** to a function which expects a const char **? instead of the C++ FAQ, though I'm not wholly convinced the explanations are easy to understand. - Overland
P
7

Yes, and other functions have the same "const-laundering" issue (for instance strchr, strstr, all that lot).

For precisely this reason C++ adds overloads (21.4:4): the function signature strchr(const char*, int) is replaced by the two declarations:

const char* strchr(const char* s, int c);
      char* strchr(      char* s, int c);

But of course in C you can't have both const-correct versions with the same name, so you get the const-incorrect compromise.
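That said, C11 and later can approximate the overload with a _Generic macro. A minimal sketch, assuming a C11 compiler; find_char, find_char_const and find_char_mut are made-up names, not library functions:

```c
#include <assert.h>
#include <string.h>

/* Const in, const out. */
static const char *find_char_const(const char *s, int c)
{
    assert(s != NULL);
    return strchr(s, c);
}

/* Mutable in, mutable out. */
static char *find_char_mut(char *s, int c)
{
    assert(s != NULL);
    return strchr(s, c);
}

/* Picks one of the two helpers from the static type of s. */
#define find_char(s, c) _Generic((s), \
        const char *: find_char_const, \
        char *: find_char_mut)((s), (c))
```

Calling find_char on a const char * yields a const char *, so assigning the result to a plain char * draws a diagnostic, while a char * argument still gives back a writable pointer.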

C++ doesn't mention similar overloads for strtol and strtod, and indeed my compiler (GCC) doesn't have them. I don't know why not: the fact that you can't implicitly convert char** to const char** (together with the absence of overloading) explains it for C, but I don't quite see what would be wrong with a C++ overload:

long strtol(const char*, const char**, int);
Polymeric answered 15/6, 2009 at 0:53 Comment(1)
stlport provides this overload (and some others). - Annelieseannelise
O
1

The 'const char *' for the first argument means that strtol() won't modify the string.

What you do with the returned pointer is your business.
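In most code the end pointer is only read, for example to check how much of the input was consumed. A sketch of that usual pattern (parse_whole_long is a hypothetical helper name):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: returns true only if all of s is a single
 * base-10 integer that fits in a long. The end pointer is only read. */
static bool parse_whole_long(const char *s, long *out)
{
    char *end;
    assert(s != NULL && out != NULL);
    errno = 0;
    long value = strtol(s, &end, 10);
    if (end == s)
        return false;           /* no digits were consumed */
    if (*end != '\0')
        return false;           /* trailing characters after the number */
    if (errno == ERANGE)
        return false;           /* value out of range for long */
    *out = value;
    return true;
}
```

Used this way, nothing ever writes through the pointer, so the missing const on it does no harm.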

Yes, it could be regarded as a type safety violation; C++ would probably do things differently (though, as far as I can tell, ISO/IEC 14882:1998 defines <cstdlib> with the same signature as in C).

Overland answered 14/6, 2009 at 20:41 Comment(1)
C++ does define strtol (and everything else in cstdlib) with the same signature as C, but not everything in cstring and cwchar. - Polymeric
P
1

I have a compiler that provides, when compiling in C++ mode:

extern "C" {
long int strtol(const char *nptr, const char **endptr, int base);
long int strtol(char *nptr, char **endptr, int base);
}

Obviously these both resolve to the same link-time symbol.

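EDIT: according to the C++ standard, this header should not compile, since at most one function with a given name can have C language linkage, so functions declared extern "C" cannot be overloaded. I'm guessing the compiler simply didn't check for this. The declarations did in fact appear like this in the system header files.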

Pedanticism answered 15/6, 2009 at 1:0 Comment(0)
