C strtok() split string into tokens but keep old data unaltered

I have the following code:

#include <stdio.h>
#include <string.h>

int main (void) {
    char str[] = "John|Doe|Melbourne|6270|AU";

    char fname[32], lname[32], city[32], zip[32], country[32];
    char *oldstr = str;

    strcpy(fname, strtok(str, "|"));
    strcpy(lname, strtok(NULL, "|"));
    strcpy(city, strtok(NULL, "|"));
    strcpy(zip, strtok(NULL, "|"));
    strcpy(country, strtok(NULL, "|"));

    printf("Firstname: %s\n", fname);
    printf("Lastname: %s\n", lname);
    printf("City: %s\n", city);
    printf("Zip: %s\n", zip);
    printf("Country: %s\n", country);
    printf("STR: %s\n", str);
    printf("OLDSTR: %s\n", oldstr);

    return 0;
}

Execution output:

$ ./str
Firstname: John
Lastname: Doe
City: Melbourne
Zip: 6270
Country: AU
STR: John
OLDSTR: John

Why can't I keep the old data in either str or oldstr? What am I doing wrong, and how can I keep the data unaltered?

Jarrell answered 14/6, 2013 at 9:8 Comment(4)
xtmtrx, in my answer here I have written code that shows how strtok() works (it modifies the string in the same address space); I think you should have a look:Sayce
So here is the source code of strtok()Sayce
Either make a copy of str before you call strtok or don't use strtok and use a pair of pointers to bracket and copy each token, or a combination of strcspn and strspn to do the same thing. With either of the other methods you can tokenize a string-literal because the original isn't modified, but strtok modifies the original by replacing the separator with nul-characters.Acetylide
Does this answer your question? Why is strtok changing its input like this?Bitterling
S
33

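Each call to strtok() (both the first strtok(str, "|") and the later strtok(NULL, "|") calls) finds the next delimiter and overwrites it with a nul character (\0), so it modifies the string in place.

Your str becomes:

str: John\0Doe\0Melbourne\06270\0AU    (each \0 is a single nul byte)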
                 
  Str array in memory 
+------------------------------------------------------------------------------------------------+
|'J'|'o'|'h'|'n'|0|'D'|'o'|'e'|0|'M'|'e'|'l'|'b'|'o'|'u'|'r'|'n'|'e'|0|'6'|'2'|'7'|'0'|0|'A'|'U'|0|
+------------------------------------------------------------------------------------------------+
                 ^  replace | with \0  (ASCII value is 0)

The diagram matters because the character '0' and the value 0 are different: the digits of 6270 are characters, written in single quotes in the diagram, whereas the \0 that replaces each delimiter is shown as the number 0.

When you print str using %s, printf prints characters up to the first \0, that is, John.

To keep your original str unchanged, you should first copy str into a tempstr variable and then pass tempstr to strtok():

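char str[] = "John|Doe|Melbourne|6270|AU";
/* calloc() is declared in <stdlib.h>; the +1 leaves room for the terminating \0 */
char* tempstr = calloc(strlen(str)+1, sizeof(char));
if (tempstr != NULL)
    strcpy(tempstr, str);

Now use tempstr in place of str in your code, and free(tempstr) when you are done with the tokens.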

Sayce answered 14/6, 2013 at 9:11 Comment(2)
A well compiled answer +1 :)Wendelina
You could replace calloc + strcpy with a simple strdup.Crapulent
O
3

Because oldstr is just a pointer, an assignment will not make a new copy of your string.

Copy it before passing str to strtok:

char *oldstr = malloc(sizeof(str));
strcpy(oldstr, str);

Your corrected version:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>  /* malloc, free (malloc.h is non-standard) */
int main (void) {

   char str[] = "John|Doe|Melbourne|6270|AU";
   char fname[32], lname[32], city[32], zip[32], country[32];
   char *oldstr = malloc(sizeof(str));
   strcpy(oldstr,str);

    ...................
    free(oldstr);
return 0;
}

EDIT:

As @CodeClown mentioned, in your case it is better to use strncpy. And instead of fixing the sizes of fname etc. beforehand, you can use pointers in their place and allocate exactly as much memory as is required. That way you avoid writing past the end of a buffer.

Another idea would be to assign the result of strtok to pointers (char *fname, *lname, etc.) instead of arrays. Judging by the accepted answer, strtok seems designed to be used that way.

Caution: with this approach, any further change to str is reflected in fname, lname and so on, because they point into str's data rather than to new memory blocks. So use oldstr for other manipulations.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>  /* malloc, free (malloc.h is non-standard) */
int main (void) {

    char str[] = "John|Doe|Melbourne|6270|AU";
    char *fname, *lname, *city, *zip, *country;
    char *oldstr = malloc(sizeof(str));
    strcpy(oldstr,str);
    fname=strtok(str,"|");
    lname=strtok(NULL,"|");
    city=strtok(NULL, "|");
    zip=strtok(NULL, "|");
    country=strtok(NULL, "|");

    printf("Firstname: %s\n", fname);
    printf("Lastname: %s\n", lname);
    printf("City: %s\n", city);
    printf("Zip: %s\n", zip);
    printf("Country: %s\n", country);
    printf("STR: %s\n", str);
    printf("OLDSTR: %s\n", oldstr);
    free(oldstr);
return 0;
}
Ohara answered 14/6, 2013 at 9:12 Comment(5)
Good answer! Maybe use strncpy() and do not use static buffers as some people or cities have names longer than 32 characters. His next question will be about stack corruption :-)Scab
In a real program - rather than just something lashed up to ask a question about - don't forget to free the memory used by oldstr when you've done.Builtup
@Code Clown, yes, stack corruption :P solved with Grijesh Chauhan answerJarrell
Yes, I do free(str) and free(oldstr)Jarrell
@Ohara here is not, but in the working code my str is dynamically allocated (Curl data from web).Jarrell
S
1

strtok requires a writable input string, and it modifies that string. If you want to keep the input string, you have to make a copy of it first.

For example:

char str[] = "John|Doe|Melbourne|6270|AU";
char oldstr[32];

strcpy(oldstr, str);  // Use strncpy if you don't know
                      // the size of str
Solace answered 14/6, 2013 at 9:10 Comment(1)
Better get the length of the string first and create a fitting data as people have different names and live in different cities.Scab
S
0

You just copy the pointer to the string, but not the string itself. Use strncpy() to create a copy.

char *oldstr = str; // just copy of the address not the string itself!
Scab answered 14/6, 2013 at 9:9 Comment(3)
this does not prevent strtok() from changing the contents of the string.Encrinite
@EugeneBujak True, but only in the copy. The original stays unchanged.Scab
So why is the use of strcpy() not included in your answer?Cathay
