C++ streams confusion: istreambuf_iterator vs istream_iterator?

What is the difference between istreambuf_iterator and istream_iterator? And in general what is the difference between streams and streambufs? I really can't find any clear explanation for this so decided to ask here.

Depolymerize answered 12/5, 2012 at 13:6 Comment(1)
See also what-exactly-is-streambuf-how-do-i-use-it – Tramline

IOstreams use streambufs as their source / target of input / output. Effectively, the streambuf family does all the work regarding I/O, and the IOstream family is only used for formatting and to-string / from-string transformation.

Now, istream_iterator takes a template argument that says what the unformatted character sequence from the streambuf should be parsed as. For example, istream_iterator<int> will interpret all incoming (whitespace-delimited) text as ints.
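
To make that concrete, here is a minimal sketch. The function name `parse_ints` is made up for illustration, and a string stream stands in for whatever stream you actually read from.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse every whitespace-delimited token of `text` as an int.
// Reading stops at end of input or at the first token that is not an int.
std::vector<int> parse_ints(const std::string& text) {
    std::istringstream in(text);
    // Each increment of the iterator performs a formatted `in >> value`.
    std::istream_iterator<int> first(in), last;
    return std::vector<int>(first, last);
}
```

Calling `parse_ints("1 2\n3\t4")` should give the four values 1, 2, 3 and 4, because spaces, tabs and newlines all count as delimiters for formatted extraction.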

On the other hand, istreambuf_iterator only cares about the raw characters and iterates directly over the associated streambuf of the istream that it gets passed.

Generally, if you're only interested in the raw characters, use an istreambuf_iterator. If you're interested in the formatted input, use an istream_iterator.
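
A small side-by-side sketch of that rule of thumb, using `char` for both iterators so the only difference is formatted versus raw reading. The function names are hypothetical, and a string stream is used only to keep the example self-contained.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Raw: walks the stream buffer directly, so every character is kept,
// including spaces and newlines.
std::string read_raw(const std::string& text) {
    std::istringstream in(text);
    std::istreambuf_iterator<char> first(in), last;
    return std::string(first, last);
}

// Formatted: each step is `in >> ch`, which skips leading whitespace
// by default (the skipws flag is set on a freshly constructed stream).
std::string read_formatted(const std::string& text) {
    std::istringstream in(text);
    std::istream_iterator<char> first(in), last;
    return std::string(first, last);
}
```

For the input `"a b\nc"`, `read_raw` returns the text unchanged, while `read_formatted` returns `"abc"` because the whitespace is consumed by the formatted extraction.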

All of what I said also applies to ostream_iterator and ostreambuf_iterator.
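
The output side can be sketched the same way. Again, the function names are invented for illustration: `ostream_iterator<int>` formats each value with `operator<<`, while `ostreambuf_iterator<char>` pushes characters straight into the stream buffer.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Formatted output: each int goes through `out << value`, followed by the delimiter.
std::string join_formatted(const std::vector<int>& values) {
    std::ostringstream out;
    std::copy(values.begin(), values.end(),
              std::ostream_iterator<int>(out, " "));
    return out.str();
}

// Raw output: characters are written to the stream buffer with no formatting step.
std::string copy_raw(const std::string& text) {
    std::ostringstream out;
    std::copy(text.begin(), text.end(),
              std::ostreambuf_iterator<char>(out));
    return out.str();
}
```

Note that the delimiter passed to `ostream_iterator` is written after every element, so the formatted result ends with a trailing space.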

Evangelineevangelism answered 12/5, 2012 at 13:16 Comment(6)
"Generally, if you're only interested in the raw characters, use an istream_iterator" -- should that be istreambuf_iterator? – Detrition
Just one minor detail: the iostream family doesn't really do much of the formatting itself either -- that's mostly delegated to the locale associated with the stream. The iostream is basically a "match maker", putting a locale together with a streambuf. Most of the real work is delegated to one or the other though. – Osmanli
@Jerry: I didn't want to delve into the strange thing that locales are, since I don't quite understand them myself, so I left it at that. – Evangelineevangelism
@Xeo: Much as I'd like to, I surely can't blame you for that! – Osmanli
@Jerry: You're free to add the locale thingy to this answer if you'd like, or write up an answer on your own. ;) – Evangelineevangelism
@Evangelineevangelism great answer thanks! – Acetamide

Here's a really badly kept secret: an iostream, per se, has almost nothing to do with actually reading from or writing to a file on your computer.

An iostream basically acts as a "matchmaker" between a streambuf and a locale:

[Diagram: an iostream sitting between a streambuf and a locale]

The iostream stores some state about how conversions should be done (e.g., the current width and precision for a conversion). It uses those to direct the locale how and where to do a conversion (e.g., convert this number to a string in that buffer with width 8 and precision 5).
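
As a rough sketch of that "width 8 and precision 5" example: the stream holds the width and precision, and the actual digits are produced on its behalf. The function name is hypothetical, and the classic "C" locale is imbued explicitly so the result does not depend on the environment.

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: format `value` with width 8 and precision 5.
std::string format_number(double value) {
    std::ostringstream out;
    out.imbue(std::locale::classic());  // fixed locale, for a predictable result
    // setw and setprecision only store state in the stream; the conversion
    // itself happens when the value is inserted.
    out << std::setw(8) << std::setprecision(5) << value;
    return out.str();
}
```

With the default floating-point notation, precision 5 means five significant digits, so `format_number(3.14159265)` yields `3.1416` padded on the left to eight characters.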

Although you didn't ask directly about it, the locale in its turn is really just a container--but (for rather an oddity) a typesafe heterogeneous container. The things it contains are facets. A facet object defines a single facet of an overall locale. The standard defines a number of facets for everything from reading and writing numbers (num_get, num_put) to classifying characters (the ctype facet).
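
If you want to poke at the "container of facets" idea, `std::has_facet` and `std::use_facet` are the standard ways to query a locale. The two wrapper functions below are made-up names, shown only as a sketch.

```cpp
#include <locale>

// Ask the ctype facet stored in `loc` whether `c` is a letter.
bool is_letter(char c, const std::locale& loc = std::locale::classic()) {
    const std::ctype<char>& ct = std::use_facet<std::ctype<char>>(loc);
    return ct.is(std::ctype_base::alpha, c);
}

// Check whether `loc` contains the facet that formats numbers for output.
bool has_number_formatter(const std::locale& loc) {
    return std::has_facet<std::num_put<char>>(loc);
}
```

The facet type is the "key" into the container, which is what makes the lookup typesafe: `use_facet<std::ctype<char>>` can only ever hand back a `std::ctype<char>`.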

By default, a stream uses the global locale that was in effect when the stream was created, which is the "C" locale unless the program has changed it. This is pretty basic--numbers are just converted as a stream of digits, the only things it recognizes as letters are the 26 lower case and 26 upper case English letters, and so on. You can, however, imbue a stream with a different locale of your choice. You can choose locales to use by names specified in strings. One that's particularly interesting is the one selected by an empty string. Using an empty string basically tells the runtime library to select the locale it "thinks" is the most suitable, usually based on how the user has configured the operating system. This allows code to deal with data in a localized format without being written explicitly for any particular locale.
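
Named locales (including the empty-string one) depend on what is installed on the machine, so as a portable sketch this example builds a locale by hand instead: it takes the classic locale and swaps in a custom numpunct facet. `CommaDecimal` and `format_with_comma` are hypothetical names for illustration.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: identical to the classic numpunct, except that the
// decimal point is a comma.
struct CommaDecimal : std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
};

std::string format_with_comma(double value) {
    std::ostringstream out;
    // The new locale is a copy of the classic one with a single facet replaced.
    // The locale takes ownership of the facet and deletes it when done.
    out.imbue(std::locale(std::locale::classic(), new CommaDecimal));
    out << value;
    return out.str();
}
```

With this locale imbued, `format_with_comma(3.5)` produces `3,5`. Using `out.imbue(std::locale(""))` instead would pick up the user's configured conventions, assuming the platform supports that locale.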

So, the basic difference between an istream_iterator and an istreambuf_iterator is that the data coming out of an istreambuf_iterator hasn't gone through (most of the) transformations done by the locale, but data coming out of an istream_iterator has been transformed by the locale.
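
Here is a sketch of that difference on the input side. It assumes a hand-built locale whose decimal point is a comma; the facet and function names are invented for the example. Both functions imbue the same locale, but only the formatted iterator is affected by it.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: decimal point is ',' instead of '.'.
struct CommaDecimal : std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
};

// Formatted: istream_iterator<double> parses the text using the stream's locale.
double read_localized_double(const std::string& text) {
    std::istringstream in(text);
    in.imbue(std::locale(std::locale::classic(), new CommaDecimal));
    std::istream_iterator<double> it(in), end;
    return it != end ? *it : 0.0;  // 0.0 if nothing could be parsed
}

// Raw: istreambuf_iterator hands back the characters as stored in the buffer.
std::string read_untouched_chars(const std::string& text) {
    std::istringstream in(text);
    in.imbue(std::locale(std::locale::classic(), new CommaDecimal));
    std::istreambuf_iterator<char> first(in), last;
    return std::string(first, last);
}
```

Given `"3,5"`, the first function returns the number 3.5, while the second returns the three characters `3`, `,` and `5` exactly as they were.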

For what it's worth, that "most of the" in the previous paragraph refers to the fact that when you read data from a stream buffer (via an iterator or otherwise), one little bit of locale-based transformation may still be done: along with various "formatting" kinds of things, the locale contains a codecvt facet, which is what's used to convert from some external representation to some internal representation (e.g., UTF-8 to UTF-32). In the standard library, the file stream buffer (basic_filebuf) is the one that applies it.

It may make more sense to ignore the fact that they're both stored in a locale, and think only of the individual facets involved:

[Diagram: the individual facets involved]

So that's the real difference between an istream_iterator and an istreambuf_iterator. A little bit of transformation is (at least potentially) done to the data from either one, but substantially less is done to the data coming from an istreambuf_iterator.

Osmanli answered 24/12, 2015 at 21:23 Comment(8)
Good explanation. What about using streambufs directly for raw binary data, is that possible? – Rucksack
@Pavel: That's not how C++'s iostreams work, but at least in theory, there's no reason they couldn't. I doubt you'd want to though--if you did, you'd have to apply the codecvt conversion one character at a time, as you read the data out of the raw buffer, which I believe would normally lose a fair amount of speed (compared to converting an entire buffer at a time). – Osmanli
That's the reason I don't want to use iostreams, as I don't want that codecvt involved. – Rucksack
@Pavel: You can still use iostreams--you just need to write your own stream buffer that handles underflow (if it's an input stream) and/or overflow (if it's an output stream), and reads/writes raw data, without doing a code conversion. The majority of stream buffers I've written don't do any code conversion. – Osmanli
In my case I avoid them completely; I prefer printf to all that chevron madness and extremely slow performance. We've had performance issues multiple times that were related to iostream use. – Rucksack
@Pavel: Fair enough--if you prefer printf, go for it. If you want a stream buffer that stores raw data, you can certainly do that too (just not quite sure why you'd ask about it if you don't want it anyway). – Osmanli
"What about using streambufs directly for raw binary data, is that possible?" - by that I meant that I wanted to know whether it was possible to use regular iostreams without any involvement of codecvt and without writing a custom stream or anything like that, so that regular formatted output for binary data would be at least distantly comparable to printf and not 100 times slower. Note: AFAIK iostreams on Linux/GCC do not have the issues that the iostreams shipped with Windows/VS do. – Rucksack
@Pavel: Unless you're doing something horribly wrong (e.g., using std::endl where you only needed \n) iostreams are quite competitive with C I/O for speed. https://mcmap.net/q/64579/-mixing-cout-and-printf-for-faster-output – Osmanli
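
For anyone curious what the custom stream buffer mentioned in the comments above might look like, here is a minimal, unbuffered sketch of the output case. The class name `RawSinkBuf` and the helper function are hypothetical. It overrides overflow and stores the characters as they arrive, with no code conversion.

```cpp
#include <locale>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical output buffer: no put area is set up, so every character
// written to the stream ends up in overflow(), which stores it as-is.
class RawSinkBuf : public std::streambuf {
public:
    const std::string& bytes() const { return bytes_; }

protected:
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            bytes_.push_back(traits_type::to_char_type(ch));
        }
        return traits_type::not_eof(ch);  // any non-eof value signals success
    }

private:
    std::string bytes_;
};

// The ostream still does the formatting; the buffer only stores the result.
std::string write_through_raw_buf(int value) {
    RawSinkBuf buf;
    std::ostream out(&buf);
    out.imbue(std::locale::classic());
    out << "n=" << value << '\n';
    return buf.bytes();
}
```

A real implementation would normally set up a put area with setp and flush it in blocks, since going through overflow for every single character is the slow path. The input case is symmetrical: override underflow and refill the get area with setg.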
