Separating a string in C++

I am trying to split a string into multiple strings in order to build a customized terminal. So far I have been separating control signals using strtok, but I do not understand how to split only on specific instances of a character. For example:

string input = "false || echo \"hello world\" | grep hello";

When I run strtok on this input with | as the delimiter, the output is:

false , echo "hello world" , grep hello

Instead, I would like the output to be:

false || echo "hello world" , grep hello

How can I make strtok treat | and || differently instead of treating them as the same delimiter?

Bren answered 10/5, 2015 at 7:31 Comment(1)
"How can I make strtok treat | and || differently instead of treating them as the same delimiter?" -- This happens because strtok considers each character in the second argument to be a delimiter. It also treats a run of consecutive delimiters as a single one, so it never returns an empty string. - January
#include <iostream>
#include <string>
#include <algorithm>
#include <vector>
using namespace std;

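// Splits `sentence` at every lone `delim`. A delimiter that touches another
// delimiter (as in "||") is kept as part of the current token.
vector<string> split(const string& sentence, char delim)
{
    // Pad both ends with a delimiter so that [i-1] and [i+1] are always valid.
    string tempSentence = "";
    tempSentence += delim;
    tempSentence += sentence;
    tempSentence += delim;

    string token;
    vector<string> tokens;
    // size_t avoids a signed/unsigned comparison with length().
    for (size_t i = 1; i + 1 < tempSentence.length(); ++i)
    {
        if (tempSentence[i] == delim && tempSentence[i-1] != delim && tempSentence[i+1] != delim)
        {
            // Lone delimiter: close the current token (empty tokens are dropped).
            if (token.length()) tokens.push_back(token);
            token.clear();
        }
        else
        {
            token += tempSentence[i];
        }
    }
    if (token.length()) tokens.push_back(token);

    return tokens;
}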

int main() {
    string sentence = "false || echo \"hello world\" | grep hello";
    char delim='|';

    vector<string> tokens = split(sentence,delim);


    for_each(tokens.begin(), tokens.end(), [&](string t) {   
        cout << t << endl;
    });

}

Ugly and long, but it works!

Annabellannabella answered 10/5, 2015 at 8:7 Comment(2)
Question: how would you change the code if the user wanted to split the string on || instead? Your code would not work for that, because char delim can only hold one character. Also, thank you: it works perfectly if you are looking for just a single delimiter. - Bren
That's easily fixed: just replace the char with a string. However, supporting one of multiple possible delimiters is a feature that is not so easy to add. - Fascista
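The suggestion in the last comment could look roughly like the sketch below. It is not the answerer's code: `split_on` is a hypothetical helper name, and the sketch assumes that every occurrence of the whole delimiter string should split the input and that empty pieces should be kept.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `text` at every occurrence of the whole string
// `delim` (for example "||"). Empty pieces are kept.
std::vector<std::string> split_on(const std::string& text, const std::string& delim)
{
    // An empty delimiter would match everywhere, so return the input unsplit.
    if (delim.empty()) return {text};

    std::vector<std::string> parts;
    std::string::size_type start = 0;
    std::string::size_type pos;
    while ((pos = text.find(delim, start)) != std::string::npos)
    {
        parts.push_back(text.substr(start, pos - start));
        start = pos + delim.length();  // continue after the delimiter
    }
    parts.push_back(text.substr(start));  // remainder after the last delimiter
    return parts;
}
```

Called with the question's input and "||", this should return two pieces, the second of which still contains the single pipe. Unlike the answer's split, it does not look at neighbouring characters, so calling it with "|" also splits inside "||".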

strtok() is going to scan character by character, without regard to characters before and after what it is looking for. If you want a smarter scan, you'll need to implement the additional check yourself.

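Since strtok just returns a pointer to the start of each token and skips every run of delimiter characters in between, the token itself never begins with '|'. To tell | from ||, you'd have to compare the start of each returned token with the end of the previous one to see how many delimiter characters were skipped, and then act accordingly.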
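One way to do such a manual check is to measure how many delimiter characters strtok skipped between two consecutive tokens. The sketch below is only an illustration under that assumption: `strtok_split_single` is a hypothetical name, pipes at the very start or end of the input are silently dropped, and strtok itself keeps hidden static state, so this is not safe to use from several threads.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: splits at a lone '|' but keeps "||" inside a piece,
// by checking how many delimiters strtok skipped between two tokens.
std::vector<std::string> strtok_split_single(std::string text)
{
    std::vector<std::string> pieces;
    const std::string original = text;  // strtok overwrites delimiters in `text`
    char* const begin = &text[0];       // writable, null-terminated since C++11
    char* prev_end = nullptr;           // one past the end of the previous token

    for (char* tok = std::strtok(begin, "|"); tok != nullptr;
         tok = std::strtok(nullptr, "|"))
    {
        const std::size_t len = std::strlen(tok);
        if (prev_end == nullptr || tok - prev_end == 1)
        {
            // First token, or exactly one '|' was skipped: start a new piece.
            pieces.push_back(std::string(tok, len));
        }
        else
        {
            // Two or more '|' were skipped: glue them and the token back on.
            const std::size_t skipped = static_cast<std::size_t>(tok - prev_end);
            pieces.back() += original.substr(prev_end - begin, skipped);
            pieces.back().append(tok, len);
        }
        prev_end = tok + len;
    }
    return pieces;
}
```

For the question's input this should give two pieces, with the || left intact in the first one.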

A better solution would be to look into the use of a regular expression here. It sounds like the symbol you want to split on is not just a |, but rather a | surrounded by spaces, i.e., you are actually searching for and splitting on a three-character symbol (space, pipe, space).
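A minimal sketch of that idea with the standard <regex> library is shown below. `split_on_spaced_pipe` is a hypothetical name, and the sketch assumes that every pipe you want to split on really is written with a space on both sides.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` wherever the three-character sequence
// space, pipe, space occurs. "||" never matches, because neither of its
// pipes has a space on both sides.
std::vector<std::string> split_on_spaced_pipe(const std::string& text)
{
    static const std::regex separator(R"( \| )");
    std::vector<std::string> parts;
    // Submatch index -1 selects the text between matches, not the matches.
    std::sregex_token_iterator it(text.begin(), text.end(), separator, -1);
    const std::sregex_token_iterator end;
    for (; it != end; ++it)
    {
        parts.push_back(it->str());
    }
    return parts;
}
```

The limits of this approach follow directly from the pattern: a|b without spaces is not split, and a spaced pipe inside a quoted string is split like any other.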

Atrium answered 10/5, 2015 at 8:11 Comment(0)

I'd say that the answer to your question is, firstly, not to use strtok() at all: it has a multitude of issues, which are even documented in its man page (at least on Linux).

Secondly, make sure you have tests. Using test-driven development is a must for these tasks, because several simple things can interact badly with each other here, and fixing a bug in one place can cause issues in another.

Further, there are tools (e.g. various YACC-variants and similar generators) that allow you to specify an abstract syntax and then turn this definition into C++ code. I'd suggest these for any non-trivial task.

Lastly, if you're only doing this for fun and learning, writing a loop or a set of functions for extracting various tokens from a string is a good approach.
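As a rough illustration of such a loop, the sketch below scans the input once and emits typed tokens. Everything in it (`TokenKind`, `Token`, `tokenize`) is a hypothetical name made up for this example, and it assumes only three kinds of token: plain text, a lone |, and ||. Escaped quotes and other shell operators are deliberately ignored.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds for this sketch; a real shell has many more.
enum class TokenKind { Text, Pipe, Or };

struct Token {
    TokenKind kind;
    std::string text;
};

// Scans `input` once, emitting "||" as Or, a lone '|' as Pipe, and everything
// else as Text. Pipes inside double quotes are treated as ordinary text.
std::vector<Token> tokenize(const std::string& input)
{
    std::vector<Token> tokens;
    std::string current;
    bool in_quotes = false;

    // Pushes the pending text, if any, as a Text token.
    auto flush = [&]() {
        if (!current.empty()) {
            tokens.push_back({TokenKind::Text, current});
            current.clear();
        }
    };

    for (std::size_t i = 0; i < input.size(); ++i) {
        const char c = input[i];
        if (c == '"') {
            in_quotes = !in_quotes;  // quotes stay part of the text
            current += c;
        } else if (c == '|' && !in_quotes) {
            flush();
            if (i + 1 < input.size() && input[i + 1] == '|') {
                tokens.push_back({TokenKind::Or, "||"});
                ++i;  // consume the second '|'
            } else {
                tokens.push_back({TokenKind::Pipe, "|"});
            }
        } else {
            current += c;
        }
    }
    flush();
    return tokens;
}
```

A function of this shape is also easy to cover with the kind of tests mentioned above: each new operator or quoting rule can get its own small input and expected token list before the loop is changed.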

Fascista answered 10/5, 2015 at 8:18 Comment(0)
#include <iostream>
#include <string>
#include <algorithm>

using namespace std;

int main() {
    string input = "false || echo \"hello world\" | grep hello";

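    // Walk the string, replacing every lone '|' with ',' and leaving "||" intact.
    string::iterator itr = input.begin();

    do {
        // Jump to the next '|' (or to end() if none is left).
        itr = find(itr, input.end(), '|');

        // "||": skip both characters and keep scanning.
        if (itr < input.end() - 1 && *(itr + 1) == '|')
        {
            itr = itr + 2;
            continue;
        }

        // Lone '|': turn it into the separator.
        if (itr < input.end())
        {
            *itr = ',';
            itr++;
        }

    } while (itr < input.end());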

    cout << input << endl;

    return 0;
}
Bemused answered 10/5, 2015 at 8:22 Comment(0)

A fairly simple and straightforward solution that seems to solve your problem.

std::string::find() searches the string for the first occurrence of the sequence specified by its arguments (in this case the string 'delimiter'). When pos is specified, the search only includes characters at or after position pos.

Edited

    #include <iostream>
    #include <string>
    int main(int argc, char const *argv[])
    {
        std::string s = "false || echo \"hello world\" | grep hello";
        std::string delimiter = "|";

        // start: beginning of the current token; pos: where the next search begins.
        std::string::size_type start = 0, pos = 0;
        while ((pos = s.find(delimiter, pos)) != std::string::npos) {
            // Find the end of the run of adjacent delimiters ("|", "||", ...).
            std::string::size_type end = s.find_first_not_of(delimiter, pos);
            if (end == std::string::npos)
                end = s.length();
            if (end - pos == 1) {
                // A lone "|": print the token before it and start a new one after it.
                std::cout << s.substr(start, pos - start) << std::endl;
                start = end;
            }
            // A longer run such as "||" stays inside the current token.
            pos = end;
        }
        // Whatever follows the last lone delimiter is the final token.
        std::cout << s.substr(start) << std::endl;
        return 0;
    }

OUTPUT :

false || echo "hello world"

grep hello

Nonproductive answered 10/5, 2015 at 8:55 Comment(1)
This code does not work for the input echo "hello world" | grep hello | grep world. The expected output is: echo "hello world" , grep hello , grep world. Instead it is: echo "hello world" | grep hello , grep world. - Bren
