How to extract mixed format using istringstream
S

4

5

Why does my program not output:

10
1.546
,Apple 1

instead of

10
1
<empty space>

here's my program:

#include <iostream>
#include <string>
#include <sstream>

using namespace std;

int main () {
    string str = "10,1.546,Apple 1";
    istringstream stream (str);
    int a;
    double b;
    string c, dummy;
    stream >> a >> dummy >> b >> dummy >> c;
    cout << a << endl;
    cout << b << endl;
    cout << c << endl;
    return 0;
}

Basically, I am trying to parse comma-separated strings; any smoother way to do this would help me immensely.

Ski answered 16/2, 2014 at 16:52 Comment(4)
char dummy will fix it (the second is eating up the input) – Fredricfredrick
@DieterLücking string dummy; d'oh. I was staring at the code like an idiot and didn't see it :) – Brahui
@DieterLücking Yes, it has improved: it now outputs 10 and 1.546. But where I need Apple 1 and was getting nothing, I am now getting Apple, still not Apple 1. Any ideas? – Ski
@SunilKundal Extraction stops at the space between Apple and 1. You need to use std::getline() (after clearing the newline, of course). – Postremogeniture
P
4

In IOStreams, strings (meaning both C-strings and C++ strings) have virtually no formatting requirements. Formatted extraction into a string skips leading whitespace and then takes every character until the next whitespace character or the end of the stream is reached. In your example, you're using a string intended to eat up the commas between the important data, but the output you are seeing is the result of the behavior I just explained: the dummy string doesn't just eat the comma, but also the rest of the character sequence up to the next whitespace character.
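To see this concretely, here is a minimal sketch that reports what a std::string dummy actually receives. The helper name swallowed_by_string is made up for illustration and is not part of the question's code.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns what a std::string "dummy" receives when it is
// extracted right after the leading integer.
std::string swallowed_by_string(const std::string& line)
{
    std::istringstream stream(line);
    int a = 0;
    std::string dummy;
    // a stops at the first non-digit; dummy then runs to the next whitespace.
    stream >> a >> dummy;
    return dummy;
}
```

For the input "10,1.546,Apple 1" this returns ",1.546,Apple". Only " 1" is left for b, and nothing is left for c, which matches the output 10, 1 and an empty line.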

To avoid this you can use a char for the dummy variable, which only has space for one character. And if you're looking to put Apple 1 into a string you will need an unformatted extraction because the formatted extractor operator>>() only reads until whitespace. The appropriate function to use here is std::getline():

string c;
char dummy;

if ((stream >> a >> dummy >> b >> dummy) &&
     std::getline(stream >> std::ws, c))
//   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
{

}

Clearing any leftover whitespace (typically a newline) after the formatted extraction is also necessary, which is why I used std::ws to discard leading whitespace. I'm also wrapping the extraction in an if statement in order to tell whether it succeeded.
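Put together as a complete function, the idea might look like the sketch below. The Record struct and the parse_record name are assumptions introduced for this example; the extraction expression itself is the one from the snippet above.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type for the three fields in the question.
struct Record {
    int a = 0;
    double b = 0.0;
    std::string c;
};

// Returns true only if all three fields were extracted from `line`.
bool parse_record(const std::string& line, Record& out)
{
    std::istringstream stream(line);
    char dummy = '\0';   // receives exactly one character (the separator)
    return (stream >> out.a >> dummy >> out.b >> dummy)
        && std::getline(stream >> std::ws, out.c);
}
```

Note that the char dummy accepts any single non-whitespace character, so this sketch does not check that the separator really is a comma.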


Any smoother way to do this would help me immensely.

You can set the classification of the comma character to a whitespace character using the std::ctype<char> facet of the locale imbued in the stream. This will make the use of a dummy variable unnecessary. Here's an example:

#include <istream>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

namespace detail
{
    enum options { add, remove };

    class ctype : public std::ctype<char>
    {
    private:
        static mask* get_table(const std::string& ws, options opt)
        {
            static std::vector<mask> table(classic_table(),
                                           classic_table() + table_size);
            for (char c : ws)
            {
                if (opt == add)
                    table[c] |= space;
                else if (opt == remove)
                    table[c] &= ~space;
            }
            return &table[0];
        }
    public:
        ctype(const std::string& ws, options opt)
            : std::ctype<char>(get_table(ws, opt)) { }
    };
}

class adjustws_impl
{
public:
    adjustws_impl(const std::string& ws, detail::options opt) :
        m_ws(ws),
        m_opt(opt)
    { }

    friend std::istream& operator>>(std::istream& is,
                                    const adjustws_impl& manip)
    {
        const detail::ctype* facet(new detail::ctype(manip.m_ws, manip.m_opt));

        if (!std::has_facet<detail::ctype>(is.getloc()))
        {
            is.imbue(std::locale(is.getloc(), facet));
        } else
            delete facet;

        return is;
    }
private:
    std::string m_ws;
    detail::options m_opt;
};

adjustws_impl setws(const std::string& ws)
{
    return adjustws_impl(ws, detail::add);
}

adjustws_impl unsetws(const std::string& ws)
{
    return adjustws_impl(ws, detail::remove);
}

int main()
{
    std::istringstream iss("10,1.546,Apple 1");
    int a; double b; std::string c;

    iss >> setws(","); // set comma to a whitespace character

    if ((iss >> a >> b) && std::getline(iss >> std::ws, c))
    {
        // ...
    }

    iss >> unsetws(","); // remove the whitespace classification
} 
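If the add/remove manipulators are more than you need, the same facet idea can be reduced to a few lines. This is only a sketch under the assumption that the comma should count as whitespace for the whole lifetime of the stream; the names comma_is_space and parse_with_comma_ws are hypothetical.

```cpp
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical facet: the classic "C" classification table with ','
// additionally marked as whitespace.
class comma_is_space : public std::ctype<char>
{
    static const mask* make_table()
    {
        static std::vector<mask> table(classic_table(),
                                       classic_table() + table_size);
        table[static_cast<unsigned char>(',')] |= space;
        return table.data();
    }
public:
    comma_is_space() : std::ctype<char>(make_table()) { }
};

// Returns true only if all three fields were extracted from `line`.
bool parse_with_comma_ws(const std::string& line,
                         int& a, double& b, std::string& c)
{
    std::istringstream iss(line);
    // The locale takes ownership of the facet and deletes it when unused.
    iss.imbue(std::locale(iss.getloc(), new comma_is_space));
    // Commas are now skipped like spaces; getline still keeps "Apple 1" whole.
    return (iss >> a >> b) && std::getline(iss >> std::ws, c);
}
```

Because the facet lives only in the local stream's locale, there is nothing to undo afterwards, at the cost of not being able to toggle the classification mid-stream.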
Postremogeniture answered 16/2, 2014 at 17:29 Comment(0)
P
8

Allow me to suggest the following.

I don't consider it 'smoother', as cin / cout dialogue is not 'smooth', imho.

But I think this might be closer to what you want.

 #include <iostream>
 #include <sstream>
 #include <string>

 int main (int, char**)
 {
    // always initialize your variables
    // to a value you would not expect from the input
    int            a = -99;
    double         b = 0.0;
    std::string    c("");
    char comma1 = 'Z';
    char comma2 = 'z';

    std::string str = "10,1.546,Apple 1";
    std::istringstream ss(str);

    ss >> a >> comma1 >> b >> comma2;

    // operator>> into a string would stop at the space in "Apple 1";
    (void)getline(ss, c, '\n');  // getline reads up to the delimiter instead
                                 // ('\n' is also the default delimiter)

    std::cout << std::endl;
    std::cout << a << "   '" << comma1 <<  "'   " << std::endl;
    std::cout << b << "   '" << comma2 <<  "'   " << std::endl;
    std::cout << c << std::endl;

    return 0;
 }

Results: (and, of course, you need not do anything with the commas.)

10   ','
1.546   ','
Apple 1

Percussion answered 16/2, 2014 at 23:12 Comment(1)
I like this approach, and it is simpler. Thank you, Douglas. – Ski
B
0

You should make the following changes:

string str = "10  1.546 Apple 1";

And

 stream >> a >> b >> dummy >> c;

In your example, dummy receives the string ",1.546,Apple". Characters are fed to the variable a until a non-numeric character is encountered; after that, everything is added to dummy (a string) until the default delimiter (whitespace) is reached.

Bunche answered 16/2, 2014 at 17:11 Comment(1)
You ought to explain why. – Brahui
S
0

I managed to change my code a little. I haven't implemented 0x499602D2's method yet, but here is what worked for me.

#include <iostream>
#include <string>
#include <cstdlib>
#include <sstream>

using namespace std;

int main () {
    string str = "10,1.546,Apple 1";
    istringstream stream (str);
    int a = 0;        // initialized in case no matching token is found
    double b = 0.0;
    string c;
    string token;
    while (getline (stream, token, ',')) {
        if (token.find (".") == string::npos && token.find (" ") == string::npos) {
            a = atoi (token.c_str ());
        } else if (token.find (".") != string::npos) {
            b = atof (token.c_str ());
        } else {
            c = string (token);
        }
    }
    cout << a << endl;
    cout << b << endl;
    cout << c << endl;
    return 0;
}
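The loop above decides each token's type by looking for "." or " " inside it, so a name without a space would be treated as an integer. A possible variant, assuming the fields always arrive in the fixed order int, double, text, is to assign them by position. The function name parse_fields is hypothetical.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: splits "int,double,text" by position instead of
// guessing the type from the token's contents. Returns false if a field is
// missing or a numeric field cannot be converted.
bool parse_fields(const std::string& line, int& a, double& b, std::string& c)
{
    std::istringstream stream(line);
    std::string first, second;
    if (!std::getline(stream, first, ',') ||
        !std::getline(stream, second, ',') ||
        !std::getline(stream, c))          // the remainder, spaces included
        return false;
    try {
        a = std::stoi(first);
        b = std::stod(second);
    } catch (const std::exception&) {      // invalid_argument or out_of_range
        return false;
    }
    return true;
}
```

Unlike atoi and atof, std::stoi and std::stod report a token with no leading number by throwing, which the sketch turns into a false return value.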
Ski answered 16/2, 2014 at 17:41 Comment(0)
