Why is strtok changing its input like this?

3

23

OK, so I understand that strtok modifies its input argument, but in this case it's collapsing the input string down to only the first token. Why is this happening, and what can I do to fix it? (Please note, I'm not talking about the variable "temp", which should be the first token, but rather the variable "input", which becomes "this" after one call to strtok.)

#include <string.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
   char input[]="this is a test of the tokenizor seven";
   char * temp;
   temp=strtok(input," ");
   printf("input: %s\n", input); //input is now just "this"
}
Mozzetta answered 23/2, 2012 at 2:52 Comment(0)
40

When strtok() finds a token, it overwrites the separator character immediately after the token with a \0, and then returns a pointer to the start of the token. The next time you call it with a NULL first argument, it resumes scanning just past that \0 -- skipping any further separator characters -- to find the next token.

Now, the original pointer to the beginning of the string still points to the beginning of the string, but the first token is now \0-terminated -- i.e., printf() thinks the end of the token is the end of the string. The rest of the data is still there, but that \0 stops printf() from showing it. If you used a for-loop to walk over the original input string up to the original number of characters, you'd find the data is all still there.

Revelatory answered 23/2, 2012 at 3:0 Comment(6)
Oh I see. My understanding of how strtok works was way off -- I assumed it chomped off the token and then slid the input pointer to the first character after the delimiter. At any rate, thank you! This was a very clear and helpful answer. - Mozzetta
But after strtok finishes and returns NULL (as there are no more tokens), is the initial string restored? Or do you need to make a copy of the source string in order to use strtok safely? Also, what will happen to my original string if I stop calling strtok before it finishes? - Supernational
@CătălinaSîrbu If you need the original contents of the character buffer to be preserved, then yes, you’d need to make a copy. But in practice that’s rarely the case. - Revelatory
I need one more clarification. I was reading this: "A very important remark has to be made here: the function modifies the string pointed to by the first argument (it places null characters at the ends of the tokens – but they’ll all be removed after the last invocation)." From what I understand, this is wrong: the source string will not be restored after the last invocation of strtok (meaning the invocation that returns NULL). Is that so? - Supernational
@CătălinaSîrbu Yes, that quote (where is it from?) is incorrect. strtok does not restore the original string under any circumstances. If it did, it would invalidate all the tokens it had created, meaning you’d have to copy them for them to be useful, which is not the case. - Revelatory
Thank you very much! Now it is clear. It is a quote from the CLP Advanced programming in C course from cpp institute (they have plenty of mistakes, but it is OK because I always double-check and pay more attention). - Supernational
5

You should print out the token that you receive from strtok and not worry about the input array, because strtok inserts null characters ('\0') into it. You need repeated calls to get all of the tokens:

#include <string.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
  char input[]="this is a test of the tokenizor seven";
  char * temp;
  /* the first call takes the buffer; later calls pass NULL to continue */
  temp=strtok(input," ");
  while( temp != NULL ) {
    printf("temp is \"%s\"\n", temp );
    temp = strtok( NULL, " ");
  }
  return 0;
}
Menell answered 23/2, 2012 at 3:1 Comment(1)
As I said above, clearly I had the wrong idea as to how strtok actually tokenized things. Thanks for your help! - Mozzetta
2

It's because strtok overwrites the separator after each token with a null character ('\0'), which is why you use repeated calls to strtok to get each token. Once you start using strtok, the input array no longer reads as the original full string. You don't "fix" it -- this is how it works.

Ats answered 23/2, 2012 at 2:59 Comment(2)
Thanks for such a quick response. Of course when I said "fix it" I meant "how do I get the result I desire," but I appreciate you taking the time to help me. - Mozzetta
If you need an unaffected copy of the input string, then you need to make a copy of it before you call strtok. - Ats
