std::istream_iterator<> with copy_n() and friends

The snippet below reads three integers from std::cin; it writes two into numbers and discards the third:

std::vector<int> numbers(2);
copy_n(std::istream_iterator<int>(std::cin), 2, numbers.begin());

I'd expect the code to read exactly two integers from std::cin, but it turns out that reading three is correct, standard-conforming behaviour. Is this an oversight in the standard? What is the rationale for this behaviour?


From 24.5.1/1 in the C++03 standard:

After it is constructed, and every time ++ is used, the iterator reads and stores a value of T.

So in the code above, the stream iterator has already read one integer by the time copy_n is called. From that point onward, every read the algorithm triggers in the iterator is a read-ahead, and dereferencing yields the value cached by the previous read.
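To see the read-ahead in isolation, here is a minimal sketch. The helper name first_direct_read_after_construction is made up for illustration; it assumes the library extracts a value in the iterator's constructor, which the quoted wording permits and the common implementations appear to do.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: constructs an istream_iterator over `text`, never
// dereferences or increments it, then reads one int straight from the stream.
int first_direct_read_after_construction(const std::string& text)
{
    std::istringstream in(text);
    std::istream_iterator<int> it(in);  // assumption: extracts the first int here
    (void)it;                           // the iterator is deliberately not used again

    int next = 0;
    in >> next;                         // whatever the iterator left behind
    assert(!in.fail());                 // `text` must still hold an integer
    return next;
}
```

With the input "10 20 30", a library that reads at construction leaves 20 as the next value in the stream; a library that deferred the read until the first dereference would leave 10.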

The latest draft of the next standard, n3225, doesn't seem to change anything here (24.6.1/1).

On a related note, 24.5.1.1/2 of the current standard, describing the istream_iterator(istream_type& s) constructor, reads

Effects: Initializes in_stream with s. value may be initialized during construction or the first time it is referenced.

Note the wording "value may be initialized ..." as opposed to "shall be initialized". This seems to contradict 24.5.1/1, but maybe that deserves a question of its own.

Oldworld answered 22/2, 2011 at 4:49 Comment(0)
M
12

Unfortunately, the implementer of your copy_n has failed to account for the read-ahead in the copy loop. The Visual C++ implementation works as you expect on both stringstream and std::cin. I also checked the case from the original example, where the istream_iterator is constructed inline.

Here is the key piece of code from that STL implementation.

template<class _InIt,
    class _Diff,
    class _OutIt> inline
    _OutIt _Copy_n(_InIt _First, _Diff _Count,
        _OutIt _Dest, input_iterator_tag)
    {   // copy [_First, _First + _Count) to [_Dest, ...), arbitrary input
    *_Dest = *_First;   // 0 < _Count has been guaranteed
    while (0 < --_Count)
        *++_Dest = *++_First;
    return (++_Dest);
    }
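The same idea can be written portably. This is only a sketch, not any library's actual code: copy_n_sketch and next_after_copying are hypothetical names, and the empty-count guard is an addition the snippet above leaves to its caller. It dereferences n times but increments only n - 1 times, so an istream_iterator does not pull an extra value out of the stream.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for copy_n on input iterators:
// n dereferences, but only n - 1 increments.
template <class InputIt, class Size, class OutputIt>
OutputIt copy_n_sketch(InputIt first, Size n, OutputIt dest)
{
    if (n > 0) {
        *dest = *first;          // first element: no increment needed yet
        ++dest;
        while (--n > 0) {
            ++first;             // advance only when another element is wanted
            *dest = *first;
            ++dest;
        }
    }
    return dest;
}

// Copies `count` ints from `text`, then returns the next int left in the stream.
int next_after_copying(const std::string& text, int count)
{
    assert(count >= 0);
    std::istringstream in(text);
    std::vector<int> numbers(count);
    copy_n_sketch(std::istream_iterator<int>(in), count, numbers.begin());

    int next = 0;
    in >> next;
    assert(!in.fail());          // `text` must hold more than `count` integers
    return next;
}
```

Assuming the iterator reads one value at construction, copying 2 values from "1 2 3 4" leaves 3 as the next value in the stream, which is the behaviour the question expected.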

Here is the test code:

#include <algorithm>
#include <iostream>
#include <istream>
#include <sstream>
#include <vector>
#include <iterator>

int main()
{
    std::stringstream ss;
    ss << 1 << ' ' << 2 << ' ' << 3 << ' ' << 4 << std::endl;
    ss.seekg(0);
    std::vector<int> numbers(2);
    std::istream_iterator<int> ii(ss);
    std::cout << *ii << std::endl;  // shows that read ahead happened.
    std::copy_n(ii, 2, numbers.begin());
    int i = 0;
    ss >> i;
    std::cout << numbers[0] << ' ' << numbers[1] << ' ' << i << std::endl;

    std::istream_iterator<int> ii2(std::cin);
    std::cout << *ii2 << std::endl;  // shows that read ahead happened.
    std::copy_n(ii2, 2, numbers.begin());
    std::cin >> i;
    std::cout << numbers[0] << ' ' << numbers[1] << ' ' << i << std::endl;

    return 0;
}


/* Output
1
1 2 3
4 5 6
4
4 5 6
*/
Minda answered 26/2, 2011 at 23:31 Comment(2)
+1 Thanks! I was positive the issue was only in std::istream_iterator, and that's where it should be fixed. I wonder, though, whether changing algorithms just to account for a peculiar behaviour that std::istream_iterator introduces complicates things unnecessarily. I mean, now when we write _n() algorithms with input iterators we always have to account for read-ahead. It's something important to remember, if it can't be solved in std::istream_iterator. - Oldworld
Um, I think the problem is not in istream_iterator but rather in the implementation of copy_n in the STL you're using. That's why I provided the code for copy_n, so you could check yours. - Minda
T
5

Today I encountered a very similar problem. Here is the example:

#include <iostream>
#include <sstream>
#include <algorithm>
#include <iterator>
#include <string>

struct A
{
    float a[3];
    unsigned short int b[6];
};

void ParseLine( const std::string & line, A & a )
{
    std::stringstream ss( line );

    std::copy_n( std::istream_iterator<float>( ss ), 3, a.a );
    std::copy_n( std::istream_iterator<unsigned short int>( ss ), 6, a.b );
}

void PrintValues( const A & a )
{
    for ( int i =0;i<3;++i)
    {
        std::cout<<a.a[i]<<std::endl;
    }
    for ( int i =0;i<6;++i)
    {
        std::cout<<a.b[i]<<std::endl;
    }
}

int main()
{
    A a;

    const std::string line( "1.1 2.2 3.3  8 7 6 3 2 1" );

    ParseLine( line, a );

    PrintValues( a );
}

Compiling the above example with g++ 4.6.3 produces one result (shown here on a single line):

1.1 2.2 3.3 7 6 3 2 1 1

whereas compiling with g++ 4.7.2 produces another:

1.1 2.2 3.3 8 7 6 3 2 1

The C++11 standard says this about copy_n:

template<class InputIterator, class Size, class OutputIterator>
OutputIterator copy_n(InputIterator first, Size n, OutputIterator result);

Effects: For each non-negative integer i < n, performs *(result + i) = *(first + i).
Returns: result + n.
Complexity: Exactly n assignments.

As you can see, it does not specify how many times the input iterator is incremented, which means this is implementation dependent.

My opinion is that your example should not read the third value, so it is a small flaw in the standard that this behaviour is left unspecified.
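One way to sidestep the question for the ParseLine example above is to skip stream iterators and extract with operator>> directly, which consumes exactly the requested number of values on any implementation. This is a sketch only; read_n and ParseLineExact are hypothetical names, and the struct is repeated so the snippet stands alone.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

struct A
{
    float a[3];
    unsigned short int b[6];
};

// Hypothetical helper: extracts exactly n values of type T from `in`.
// Returns false as soon as one extraction fails.
template <class T>
bool read_n(std::istream& in, T* out, std::size_t n)
{
    assert(out != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        if (!(in >> out[i])) {
            return false;
        }
    }
    return true;
}

// Parses three floats followed by six unsigned shorts from `line`.
bool ParseLineExact(const std::string& line, A& a)
{
    std::istringstream ss(line);
    return read_n(ss, a.a, 3) && read_n(ss, a.b, 6);
}
```

Because no iterator caches a value between the two calls, the second read starts at 8 for the sample line, whichever copy_n the library ships.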

Tomboy answered 13/3, 2013 at 12:46 Comment(0)
P
1

I don't know the exact rationale, but as the iterator also has to support operator*(), it will have to cache the values it reads. Allowing the iterator to cache the first value at construction simplifies this. It also helps in detecting end-of-stream when the stream is initially empty.

Perhaps your use case is one the committee didn't consider?
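The end-of-stream point can be illustrated with a small sketch. The helper name is_end_on_construction is made up, and the result for an empty stream assumes the library attempts the first read in the constructor.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: true if an iterator over `text` already compares equal
// to the end-of-stream iterator right after construction.
bool is_end_on_construction(const std::string& text)
{
    std::istringstream in(text);
    std::istream_iterator<int> first(in);  // assumption: tries to read one int here
    std::istream_iterator<int> last;       // default-constructed: end-of-stream

    // Two end-of-stream iterators always compare equal.
    assert(last == std::istream_iterator<int>());
    return first == last;
}
```

With the read done at construction, a loop such as `for (; first != last; ++first)` never dereferences anything when the stream starts out empty, because the failed read has already turned `first` into an end-of-stream iterator.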

Picaresque answered 22/2, 2011 at 19:3 Comment(12)
Why does the iterator need to cache its pointee? It's an input iterator, so it only needs to commit to single-pass semantics. - Oldworld
You are allowed to call operator*() several times between each increment. The return type is also a reference to the value, so it has to be stored somewhere. - Picaresque
@BoPersson std::istream_iterator is an input iterator, so it only guarantees a single pass. Once the user dereferences an input iterator it's their responsibility to cache it. In other words, for an input iterator i, the expression *i==*i doesn't necessarily hold. - Oldworld
I'm really sorry for being annoying here, asking a question and then contradicting the answers. I might very well be thoroughly wrong in my fundamental understanding. So I figure I'll lay out anything I know that contradicts an answer, so that people can correct my misunderstandings. :) - Oldworld
I think this may be worth an LWG issue: How many times is copy_n allowed to increment first? n times? Or n-1 times (assuming n is positive)? Here are directions on how to submit an LWG issue: open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#submit_issue . - Heterologous
@Oldworld You are not annoying at all, this is rather interesting. :-) To me, single pass means that you can only increment the iterator once, not that you cannot dereference it more than once. A forward iterator is multi-pass, in that you can save a copy and start over, visiting each element multiple times. - Picaresque
@Howard: While I appreciate your advice, you will have to accept that the procedure of opening an issue is beyond the ability of many C++ users, especially if English isn't their native language. Mind you, the first hurdle is A on that list, because your link points behind Alisdair's email address. Even I, knowing Alisdair's, first translated lwgchair@... into Language Working Group Chair... And while I think I could do it, I simply don't have the time to labor through the proceedings, digest the feedback, and refine my text until it's acceptable. - Coaming
@sbi: If the directions at that link are not sufficient, I volunteer my services to help any interested party submit an issue. I believe you can reach me privately by simply clicking on my name. However, if that doesn't help, simply search for "Howard Hinnant". I'm not hard to find. The C++ standardization process is made up of volunteers. And we need more of them. Submitting an issue is an excellent way to contribute. - Heterologous
@wilhelmtell: Just fyi, this is the commit log showing your positive impact on libc++: lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20110221/… - Heterologous
@HowardHinnant Thanks! :) I was not (am not) sure whether the issue is in std::istream_iterator or in the _n() algorithms that take an input iterator. I sketched an n_iterator() that essentially transforms an algorithm with range input into an _n() algorithm. gist.github.com/845950 It's not pretty, but I was just playing with ideas on how to relieve _n() algorithms of the responsibility of accounting for std::istream_iterator. At any rate, thanks! :) - Oldworld
Iterator adaptors are a powerful tool. n_iterator looks interesting. I just surveyed all the _n algorithms in <algorithm>. Unless I missed one, copy_n is the only one that takes InputIterators. Not positive, but that may mean that copy_n is unique in this regard. - Heterologous
@Howard Hinnant The GCC people liked your fix too: gcc.gnu.org/bugzilla/show_bug.cgi?id=50119 - Hedden
E
0

Today, 9 years after you, I ran into the same problem. While following this thread and playing with the code, I noticed that we can advance the iterator one step before each read after the first one. (cin doesn't skip a trailing newline automatically either, and we help it with cin.ignore(); I guess we can help this implementation in the same way.)

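    #include <algorithm>
    #include <cstdio>
    #include <iostream>
    #include <iterator>
    using namespace std;

    int main() {
        // input.txt is expected to hold the integers to read
        freopen("input.txt", "r", stdin);

        istream_iterator<int> it(cin);
        ostream_iterator<int> cout_it(cout, " ");

        copy_n(it, 5, cout_it);

        cout << "\nAnd for the rest of the stream\n";

        for (int i = 0; i < 10; i++) {
            it++;  // read the next value from the stream into the iterator
            copy_n(it, 1, cout_it);
        }

        return 0;
    }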

and that should produce output like:

1 2 3 4 5
And for the rest of the stream
6 7 8 9 10 11 12 13 14 15
Edmon answered 13/4, 2020 at 14:18 Comment(0)
