strtok - char array versus char pointer [duplicate]
A

6

16

Possible Duplicate:
strtok wont accept: char *str

When using the strtok function, using a char * instead of a char [] results in a segmentation fault.

This runs properly:

char string[] = "hello world";
char *result = strtok(string, " ");

This causes a segmentation fault:

char *string = "hello world";
char *result = strtok(string, " ");

Can anyone explain what causes this difference in behaviour?

Armet answered 3/11, 2010 at 18:39 Comment(0)
I
36
char string[] = "hello world";

This line initializes string to be a big-enough array of characters (in this case char[12]). It copies those characters into your local array as though you had written out

char string[] = { 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '\0' };

The other line:

char* string = "hello world";

does not initialize a local array; it only initializes a local pointer. The compiler is allowed to make it point to an array that you are not allowed to change, as though the code were

const char literal_string[] = "hello world";
char* string = (char*) literal_string;

The reason C allows this without a cast is mainly to let ancient code continue compiling. You should pretend that the type of a string literal in your source code is const char[], which converts to const char*, and you should never convert it to char*.
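As a minimal sketch of the difference described above (the helper names local_array_size and array_copy_is_writable are made up for illustration), the array form gets its own 12-byte copy that may be written to, while the literal is best reached only through a const pointer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: size of a local array initialized from the literal. */
size_t local_array_size(void)
{
    char string[] = "hello world";   /* char[12]: 11 characters plus '\0' */
    return sizeof string;
}

/* Hypothetical helper: writes to the local copy and checks that the
   literal itself is a separate, untouched object. Returns 1 on success. */
int array_copy_is_writable(void)
{
    char string[] = "hello world";        /* private, modifiable copy */
    const char *literal = "hello world";  /* const: writes through it are rejected */

    assert(sizeof string == strlen(literal) + 1);
    string[0] = 'J';                      /* fine: modifies only the array */
    return string[0] == 'J' && strcmp(literal, "hello world") == 0;
}
```

Declaring the pointer as const char * means an accidental write such as literal[0] = 'J' is diagnosed at compile time instead of failing at run time.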

Individual answered 3/11, 2010 at 18:48 Comment(1)
Many good answers, but I found this the clearest example of the fundamental issue.Armet
B
16

In the second example:

char *string = "hello world";
char *result = strtok(string, " ");

the pointer string is pointing to a string literal, which cannot be modified (as strtok() would like to do).

You could do something along the lines of:

char *string = strdup("hello world");
char *result = strtok(string, " ");

so that string is pointing to a modifiable copy of the literal.
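Since strdup comes from POSIX rather than the C standard of that era, here is a hedged sketch of the same idea using only malloc and memcpy. The names copy_string and count_tokens are hypothetical, and the copy is freed once tokenizing is done:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s, or NULL if allocation
   fails. The caller owns the result and must free() it. */
char *copy_string(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s) + 1;        /* include the terminating '\0' */
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Hypothetical helper: counts space-separated tokens in text.
   strtok only ever sees the private copy, so text may be a literal.
   Returns -1 if the copy could not be allocated. */
int count_tokens(const char *text)
{
    char *buf = copy_string(text);
    if (buf == NULL)
        return -1;

    int count = 0;
    for (char *tok = strtok(buf, " "); tok != NULL; tok = strtok(NULL, " "))
        count++;

    free(buf);                       /* release the copy when finished */
    return count;
}
```

With these helpers, count_tokens("hello world") tokenizes a modifiable copy and leaves the literal alone.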

Breve answered 3/11, 2010 at 18:43 Comment(1)
I'm going to resist the urge to -1, but I really don't like this answer. I think it leads newbie coders to the idiom of throwing around strdup to solve segfaults rather than learning how to manage memory (and especially strings). But I'm not sure what a better answer would be without saying "just use the array or dynamically allocate memory for your string". By the way, strdup is not standard C, but of course it's easy enough to implement on systems which don't have it.Pinto
K
5

strtok modifies the string you pass to it (or tries to anyway). In your first code, you're passing the address of an array that's been initialized to a particular value -- but since it's a normal array of char, modifying it is allowed.

In the second code, you're passing the address of a string literal. Attempting to modify a string literal gives undefined behavior.
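To make the modification visible, here is a small sketch (the function name strtok_overwrites_delimiter is made up for illustration) that checks what strtok leaves behind in a writable array:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if strtok replaced the space at
   index 5 with a '\0' terminator, which is how it ends the first token. */
int strtok_overwrites_delimiter(void)
{
    char string[] = "hello world";        /* writable array, so this is allowed */
    char *first = strtok(string, " ");

    assert(first == string);              /* first token starts at the array start */
    return string[5] == '\0' && strcmp(string, "hello") == 0;
}
```

The same write, aimed at a string literal, is what triggers the undefined behavior.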

Kostroma answered 3/11, 2010 at 18:45 Comment(0)
T
3

In the second case (char *), the string is usually placed in read-only memory. String constants are best declared as const char *; if you used that type to declare the variable, the compiler would warn you when you tried to modify it. For historical reasons, you're allowed to use string constants to initialize variables of type char * even though they can't be modified. (Some compilers let you turn this historic license off, e.g. with gcc's -Wwrite-strings.)
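If the string has to stay const, one alternative is to skip strtok entirely and scan with strspn and strcspn, which never write to their arguments. This is not something the answer above proposes; the helper names next_token and count_tokens_readonly are hypothetical, and this is only a sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the next token at or after *cursor.
   Returns a pointer to the token and stores its length in *len, or
   returns NULL when no tokens remain. The string is never modified,
   so tokens are NOT '\0'-terminated; use the length. */
const char *next_token(const char **cursor, const char *delims, size_t *len)
{
    assert(cursor != NULL && *cursor != NULL && len != NULL);

    const char *start = *cursor + strspn(*cursor, delims); /* skip delimiters */
    if (*start == '\0') {
        *cursor = start;
        return NULL;
    }
    *len = strcspn(start, delims);    /* length up to next delimiter or end */
    *cursor = start + *len;
    return start;
}

/* Hypothetical helper: counts space-separated tokens without copying. */
int count_tokens_readonly(const char *text)
{
    const char *cursor = text;
    size_t len = 0;
    int count = 0;

    while (next_token(&cursor, " ", &len) != NULL)
        count++;
    return count;
}
```

The trade-off is that callers get a pointer plus a length rather than a terminated string, so printing a token needs something like printf("%.*s", (int)len, tok).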

Tusche answered 3/11, 2010 at 18:43 Comment(5)
It is also worth mentioning that, in the first case, there is an implicit copy of the string literal into the char array. Which is why you don't have the same issue.Pituri
const char * actually also results in a segfault if attempted to be used with strtok, but at least it gives a compilation warning. But noted about the modification being the issue.Armet
Yeah, I should have been less telegraphic. Answer edited.Tusche
Using const char * should give a compiler error, not a warning, if you pass it to strtok. The C language does not have implicit conversions which remove qualifiers; you need an explicit cast.Pinto
"[construct] discards qualifiers from pointer target type" really is just a warning with gcc 4.2, even with -pedantic. It doesn't appear to be a special case for char * either.Tusche
T
0

The first case creates a (non const) char array that is big enough to hold the string and initializes it with the contents of the string. The second case creates a char pointer and initializes it to point at the string literal, which is probably stored in read only memory.

Since strtok wants to modify the memory pointed at by the argument you pass it, the latter case causes undefined behavior (you're passing in a pointer that points at a (const) string literal), so it's unsurprising that it crashes.

Triste answered 3/11, 2010 at 18:43 Comment(0)
Z
0

Because the second one declares a pointer (that can change) to a constant string.

Depending on your compiler, platform, OS, and memory map, the "hello world" string may be stored as a constant (in an embedded system, it may be stored in ROM), and trying to modify it will cause that error.

Zoroastrianism answered 3/11, 2010 at 18:44 Comment(0)
