The role of std::ws (whitespace) when reading data
P

3

7

The data saved in my file is (whitespace was added at both the beginning and the end on purpose for this test):

1 2 3

Loading the data using the code below makes no difference whether I use "std::ws" or not. So I am confused about the role of "std::ws", as I have seen code using it. Can someone explain a little bit? Thanks!

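#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

int main()
{
    ifstream inf;
    inf.open("test.txt");

    double x = 0, y = 0, z = 0;
    string line;

    // Read the first line of the file and parse it from a string stream
    getline(inf, line);
    istringstream iss(line);
    // Using "std::ws" here does NOT cause any difference
    if (!(iss >> std::ws >> x >> y >> z >> std::ws))
    {
        cout << "Format error in the line" << endl;
    }
    else
    {
        cout << x << y << z << endl;
    }
    // Reset the string stream's contents and state flags
    iss.str(std::string());
    iss.clear();

    cin.get();
}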
Proto answered 3/9, 2015 at 0:37 Comment(3)
Whitespace is removed in the beginning of the extraction by default. If you have std::ios_base::noskipws on or are using unformatted input then std::ws can be useful.Liebermann
@0x499602D2 The flag is called skipws; noskipws is a manipulator function.Armitage
@0x499602D2: the primary use isn't so much when std::ios_base::skipws is disabled but rather with unformatted input which doesn't skip leading whitespace.Hartsell
B
12

The primary use of std::ws is when switching between formatted and unformatted input:

  • formatted input, i.e., the usual input operators using in >> value, skips leading whitespace and stops whenever the format is filled
  • unformatted input, e.g., std::getline(in, value), does not skip leading whitespace

For example, when reading an age and a full name you might be tempted to read them like this:

 int         age(0);
 std::string fullname;
 if (std::cin >> age && std::getline(std::cin, fullname)) { // BEWARE: this is NOT a Good Idea!
     std::cout << "age=" << age << "  fullname='" << fullname << "'\n";
 }

However, if I entered this information as

47
Dietmar Kühl

it would print something like this:

age=47  fullname=''

The problem is that the newline following the 47 is still in the stream and immediately satisfies the std::getline() request, leaving fullname empty. As a result, you'd rather use this statement to read the data:

if (std::cin >> age && std::getline(std::cin >> std::ws, fullname)) {
    ...
}

The use of std::cin >> std::ws skips the whitespace, in particular the newline, and carries on reading where the actual content is entered.

Bezoar answered 3/9, 2015 at 0:56 Comment(7)
That doesn't look like valid input. Fields should be separated by a known character, either newline or tab, and you go to the next field using ignore.Armitage
@Potatoswatter: what isn't valid input? The code and input as is are working entirely as expected. If it makes you more comfortable assume the if() condition is written as if (std::cout << "Enter your age: " && std::cin >> age && std::cout << "Enter your full name: " && std::getline(std::cin >> std::ws, fullname)) { ... }Hartsell
I mean, it's a strange program that doesn't care whether an input item appears on a new line or not. Essentially you're using >> ws to be sure that newline is treated the same as tab or space. Seems suspect to me.Armitage
If the input is 47 Dietmar Kühl, then >> ws is still needed to remove the leading space. That might be a clearer example.Armitage
@Potatoswatter: I'd consider it more suspect to skip characters otherwise! For example, if you feel you'd use std::cin.ignore() I would start off entering " \n" and if you then ignore() until the end of the line I'd enter "47 Dietmar Kühl" next! Skipping only space, including admittedly, as many newlines as are entered prior to non-whitespace seems a better option when data is potentially entered manually. Of course, if a file is formatted in a specific way, e.g., using JSON or CSV, none of these approaches apply.Hartsell
@Potatoswatter: also, formatted input considers spaces, tabs, and newlines as whitespace, i.e., you can enter as many spaces and newlines as you want before, e.g., an int is read (assuming std::ios_base::skipws is set).Hartsell
Yes, I was thinking more along the lines of CSV. There's a continuum from flexible command-line inputs to strict network message formats. But, trying to accommodate "tricky" users is a slippery slope. BTW, under skipws, your example " \n47 Dietmar Kühl" is tolerated. Usually, though, a line-oriented program uses getline to extract, re-buffer, and validate that newlines weren't used as ordinary "field separators."Armitage
A
2

By default, the stream's skipws bit is set and whitespace is automatically skipped before each input. If you unset it with iss >> std::noskipws, then you'll need iss >> std::ws later.

There are also times when whitespace is not automatically skipped. For example, to detect the end of the input, you can use if ( ( iss >> std::ws ).eof() ).

Armitage answered 3/9, 2015 at 0:40 Comment(1)
Thank you! This is very helpful!Proto
N
2

skipws and noskipws are sticky but ws is NOT sticky, so if skipws is turned off and you want to skip whitespace with ws, you must use it before every operator>>.

Also note that skipws and noskipws only affect formatted input operations performed with operator>> on the stream, whereas ws can be used before both formatted input operations (using operator>>) and unformatted input operations (e.g. get, peek, getline, ...).

#include <iostream>
#include <sstream>
#include <string>
using namespace std;

int main ()
{
    char c;
    istringstream input( "     1test      2test       " );

    input >> skipws;
    c = input.peek();
    //skipws doesn't skip whitespaces in unformatted input
    cout << "After using skipws c = " << c << endl;

    input >> ws;
    c = input.peek();
    cout << "After using ws c = " <<  c << endl;
}

Output:

After using skipws c =  
After using ws c = 1
Nucleon answered 27/2, 2018 at 15:0 Comment(0)
