Split string by a substring
A

6

7

I have the following string:

char str[] = "A/USING=B)";

I want to split it to get the separate values A and B, with /USING= as the delimiter.

How can I do it? I know strtok(), but it only splits on a single character as the delimiter.

Afghani answered 22/1, 2016 at 8:53 Comment(5)
You can use strchr to find the '/', '=' and ')' characters, and get the substrings using e.g. strncpy (or just plain strcpy if you can modify the source string). - Holbert
All I know is that my input string contains /USING= exactly once. A and B are just examples; they may themselves contain characters such as =. - Afghani
Then how about strstr to find the starting position of the substring? - Holbert
You can use strstr to get the position of /USING and split based on that position. - Volcano
@Ryo, I have an answer to your question. This was also asked on another website. My answer uses strtok but still splits the string successfully, and I mention where the question was asked earlier. Check it out. - Anthropography
F
1

I know strtok(), but it only splits on a single character as the delimiter

No, that's not the case.

As per the man page for strtok(), (emphasis mine)

char *strtok(char *str, const char *delim);

[...] The delim argument specifies a set of bytes that delimit the tokens in the parsed string. [...] A sequence of two or more contiguous delimiter bytes in the parsed string is considered to be a single delimiter. [...]

So, it need not be "one character" as you've mentioned. You can use a string, such as "/USING=" in your case, as the delimiter argument to get the job done for the input shown.

Falsecard answered 22/1, 2016 at 9:25 Comment(5)
This answer is not valid; it will not split "U/USING=N" correctly. - Magellan
@Magellan This was more about the usage of strtok() with the shown input. If the input varies, there are other methods, like strstr(), which can be used. - Falsecard
You give the impression that using strtok is a good idea, and the OP will assume that it will parse other strings that have /USING= in them. You should at least warn him that it will only work for his example string and not for any arbitrary string containing the substring /USING=. - Chirp
@JerryJeremiah I fail to see your point. Can you elaborate? - Falsecard
strtok will not split based on substrings. Rather, it takes a list of delimiter characters and splits the string at each one it finds. So no, strtok cannot split on a substring like you've mentioned. - Jitterbug
T
9

As others have pointed out, you can use strstr from <string.h> to find the delimiter in your string. Then either copy the substrings or modify the input string to split it.

Here's an implementation that returns the second part of a split string. If the string can't be split, it returns NULL and the original string is unchanged. If you need to split the string into more substrings, you can call the function on the tail repeatedly. The first part will be the input string, possibly shortened.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *split(char *str, const char *delim)
{
    char *p = strstr(str, delim);

    if (p == NULL) return NULL;     // delimiter not found

    *p = '\0';                      // terminate string after head
    return p + strlen(delim);       // return tail substring
}

int main(void)
{
    char str[] = "A/USING=B";
    char *tail;

    tail = split(str, "/USING=");

    if (tail) {
        printf("head: '%s'\n", str);
        printf("tail: '%s'\n", tail);
    }

    return 0;
}
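
The remark about calling the function on the tail repeatedly can be sketched as follows. This is only an illustration, not part of the original answer: split_all, parts and max_parts are hypothetical names, the input string is modified in place, and a non-empty delimiter is assumed.

```c
#include <stddef.h>
#include <string.h>

/* split() as in the answer above: cuts str at the first delim and
   returns the tail, or NULL if delim does not occur. */
char *split(char *str, const char *delim)
{
    char *p = strstr(str, delim);

    if (p == NULL) return NULL;     /* delimiter not found */

    *p = '\0';                      /* terminate string after head */
    return p + strlen(delim);       /* return tail substring */
}

/* Hypothetical helper: stores up to max_parts pieces of str in parts[]
   and returns how many were stored. If there are more pieces than slots,
   the last slot keeps the unsplit remainder. */
size_t split_all(char *str, const char *delim, char **parts, size_t max_parts)
{
    size_t n = 0;
    char *head = str;

    while (n < max_parts) {
        parts[n++] = head;
        if (n == max_parts) break;  /* no slot left for another piece */
        head = split(head, delim);  /* cut here and move on to the tail */
        if (head == NULL) break;    /* no further delimiter */
    }
    return n;
}
```

With char str[] = "A/USING=B/USING=C" and room for eight parts, split_all(str, "/USING=", parts, 8) returns 3 and parts[] points at "A", "B" and "C" inside str.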
Thelma answered 22/1, 2016 at 9:8 Comment(0)
M
1

Here is a little function to do this. It works exactly like strtok_r except that the delimiter is taken as a delimiting string, not a list of delimiting characters.

#include <string.h>

char *strtokstr_r(char *s, char *delim, char **save_ptr)
{
    char *end;
    if (s == NULL)
        s = *save_ptr;

    if (s == NULL || *s == '\0')
    {
        *save_ptr = s;
        return NULL;
    }

    // Skip leading delimiters.
    while (strstr(s,delim)==s) s+=strlen(delim);
    if (*s == '\0')
    {
        *save_ptr = s;
        return NULL;
    }

    // Find the end of the token.
    end = strstr (s, delim);
    if (end == NULL)
    {
        *save_ptr = s + strlen(s);
        return s;
    }

    // Terminate the token and make *SAVE_PTR point past it.
    memset(end, 0, strlen(delim));
    *save_ptr = end + strlen(delim);
    return s;
}
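
A possible way to call it, in the same pattern as strtok_r, is sketched below. The function is repeated so the sketch compiles on its own; collect_tokens, out and max are hypothetical names introduced only for this illustration. As with strtok_r, the input buffer must be writable, and empty tokens between consecutive delimiters are skipped.

```c
#include <stddef.h>
#include <string.h>

/* Copy of strtokstr_r() from the answer above. */
char *strtokstr_r(char *s, char *delim, char **save_ptr)
{
    char *end;

    if (s == NULL)
        s = *save_ptr;
    if (s == NULL || *s == '\0') {
        *save_ptr = s;
        return NULL;
    }

    /* Skip leading delimiters. */
    while (strstr(s, delim) == s)
        s += strlen(delim);
    if (*s == '\0') {
        *save_ptr = s;
        return NULL;
    }

    /* Find the end of the token. */
    end = strstr(s, delim);
    if (end == NULL) {
        *save_ptr = s + strlen(s);
        return s;
    }

    /* Terminate the token and make *save_ptr point past the delimiter. */
    memset(end, 0, strlen(delim));
    *save_ptr = end + strlen(delim);
    return s;
}

/* Hypothetical helper: collects up to max tokens of s into out[]
   and returns how many were stored. */
size_t collect_tokens(char *s, char *delim, char **out, size_t max)
{
    char *save = NULL;
    size_t n = 0;
    char *tok = strtokstr_r(s, delim, &save);   /* first call passes s */

    while (tok != NULL && n < max) {
        out[n++] = tok;
        tok = strtokstr_r(NULL, delim, &save);  /* later calls pass NULL */
    }
    return n;
}
```

For the buffer "A/USING=B/USING=/USING=C" and the delimiter "/USING=", this collects the three tokens "A", "B" and "C".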
Magellan answered 22/1, 2020 at 9:54 Comment(0)
L
0

This answer is only valid for this particular input; if the input were "abUcd/USING=efgh", your algorithm wouldn't work.

The only answer that works for me is this one:

#include <stdio.h>
#include <string.h>

char *split(char *str, const char *delim)
{
    char *p = strstr(str, delim);

    if (p == NULL) return NULL;     // delimiter not found

    *p = '\0';                      // terminate string after head
    return p + strlen(delim);       // return tail substring
}

int main(void)
{
    char str[] = "A/USING=B";
    char *tail;

    tail = split(str, "/USING=");

    if (tail) {
        printf("head: '%s'\n", str);
        printf("tail: '%s'\n", tail);
    }

    return 0;
}
Livesay answered 4/5, 2021 at 11:4 Comment(0)
B
0

strtok takes a set of delimiter characters, so it splits on each individual character rather than on the whole delimiter string. This means that a delimiter of "___" splits the string at "_", "__", "___", and so on, because any run of underscores counts as a single delimiter.

Example:

#include <stdio.h>
#include <string.h>

int main (void)
{
  char str[] ="- This, a sample string with-dash-es.";
  char * pch;
  printf ("Splitting string \"%s\" into tokens:\n",str);
  pch = strtok (str," ,.---");
  while (pch != NULL)
  {
    printf ("%s\n",pch);
    pch = strtok (NULL, " ,.---");
  }
  return 0;
}

Output:

Splitting string "- This, a sample string with-dash-es." into tokens:
This
a
sample
string
with
dash
es

If you want to split by the entire substring, strstr would be a good option like the others mentioned. Below are my two cents for splitting the string multiple times (instead of just once):

#include <stdio.h>
#include <string.h>

int main(void)
{
    char str[] = "This is an-example string---to---be-split.";
    char *start = str;
    char *end;

    end = strstr(start, "---");
    while (end) {
      *end = '\0';
      printf("part: '%s'\n", start);
      start = end + strlen("---");
      end = strstr(start, "---");
    }
    
    printf("last part: '%s'\n", start);

    return 0;
}

Output:

part: 'This is an-example string'
part: 'to'
last part: 'be-split.'
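
If the string must stay untouched (a string literal, for example), a variation of the same loop can print each piece with a length-limited %.*s instead of writing '\0' into the buffer. This is a sketch under that assumption; print_parts is a hypothetical name, and an empty delimiter is treated as "no split".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints each piece of str separated by delim
   without modifying str, and returns the number of pieces. */
size_t print_parts(const char *str, const char *delim)
{
    size_t count = 0;
    size_t dlen = strlen(delim);
    const char *start = str;
    const char *end;

    if (dlen == 0) {                /* an empty delimiter would never advance */
        printf("last part: '%s'\n", str);
        return 1;
    }

    while ((end = strstr(start, delim)) != NULL) {
        /* %.*s prints exactly (end - start) characters starting at start. */
        printf("part: '%.*s'\n", (int)(end - start), start);
        start = end + dlen;
        count++;
    }

    printf("last part: '%s'\n", start);
    return count + 1;
}
```

Calling print_parts("This is an-example string---to---be-split.", "---") prints the same three lines as the output above and returns 3.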
Blida answered 18/1 at 20:2 Comment(0)
A
-1

See this. I found it when I searched for your question on Google.

In your case it will be:

#include <stdio.h>
#include <string.h>

int main (int argc, char* argv [])
{
    char theString [16] = "abcd/USING=efgh";
    char theCopy [16];
    char *token;
    strcpy (theCopy, theString);
    token = strtok (theCopy, "/USING=");
    while (token)
    {
        printf ("%s\n", token);
        token = strtok (NULL, "/USING=");
    }

    return 0;
}

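This passes /USING= as the delimiter argument. Note that strtok treats it as a set of single characters rather than as one substring, so this works only when the pieces themselves contain none of those characters.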

The output of this was:

abcd
efgh

If you want to check, you can compile and run it online over here.
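
One caveat worth checking before relying on this: strtok treats its second argument as a set of single-character delimiters, not as one substring. The sketch below counts the tokens strtok produces for a given input so the difference can be observed; count_strtok_tokens is a hypothetical helper name, not a library function.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens strtok() yields for input when
   given delims. It works on a copy, because strtok() modifies its argument. */
size_t count_strtok_tokens(const char *input, const char *delims)
{
    size_t n = 0;
    char *copy = malloc(strlen(input) + 1);

    if (copy == NULL) return 0;     /* allocation failed: report no tokens */
    strcpy(copy, input);

    for (char *tok = strtok(copy, delims); tok != NULL; tok = strtok(NULL, delims))
        n++;

    free(copy);
    return n;
}
```

count_strtok_tokens("abcd/USING=efgh", "/USING=") gives 2, as in the output above, but count_strtok_tokens("U/USING=N", "/USING=") gives 0, because U and N are themselves in the delimiter set and every character of the input is skipped.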

Anthropography answered 22/1, 2016 at 9:16 Comment(7)
Sorry, but I just tried this with my real input string, char str[] = "SELECT * FROM ACN WHERE CID=:C1 AND ACCTNAME=:C2/USING=(C1=70,C2='0D100S')", and it does not work as I expected. Please help me check it. - Afghani
Give me a minute. Let me compile and check it. - Anthropography
Can you please tell me the output you are getting? - Anthropography
OK, sorry for my mistake, but could you help me understand why my actual string does not work? I would prefer to use strtok. - Afghani
@Ryo, give me some time. Even I am not sure. I'll get back to you. - Anthropography
@Ryo, see this. Its accepted answer explains everything. - Anthropography
@Ryo, if you still do not understand, ask me. - Anthropography
