Simpler way to create a C++ memorystream from (char*, size_t), without copying the data?
I couldn't find anything ready-made, so I came up with:

class membuf : public std::basic_streambuf<char>
{
public:
  membuf(char* p, size_t n) {
    setg(p, p, p + n);  // get area: reads start at p and stop at p + n
    setp(p, p + n);     // put area: writes go into the same memory
  }
};

Usage:

char *mybuffer;
size_t length;
// ... allocate "mybuffer", put data into it, set "length"

membuf mb(mybuffer, length);
std::istream reader(&mb);
// use "reader"
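
For reference, here is a minimal read-only sketch of the same idea, assuming the stream is only ever read from. The names readonly_membuf and read_all are hypothetical, not from any library. It leaves out setp() so nothing can write into the caller's memory, and it shows that embedded NUL bytes come through unchanged:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>

// Read-only variant of membuf (hypothetical name): only the get area is set,
// so the caller's memory is never written to.
class readonly_membuf : public std::streambuf
{
public:
    readonly_membuf(const char* p, std::size_t n)
    {
        assert(p != nullptr || n == 0);
        // setg() takes non-const pointers. The const_cast is acceptable here
        // only because no put area is installed and the default pbackfail()
        // fails instead of storing a character.
        char* b = const_cast<char*>(p);
        setg(b, b, b + n);
    }
};

// Hypothetical helper: drain the whole buffer through an istream with read().
inline std::string read_all(const char* p, std::size_t n)
{
    readonly_membuf mb(p, n);
    std::istream reader(&mb);
    std::string out(n, '\0');
    reader.read(&out[0], static_cast<std::streamsize>(n));
    out.resize(static_cast<std::size_t>(reader.gcount()));
    return out;
}
```

read_all copies only to make the result easy to inspect; the stream itself reads straight from the original memory.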

I know of stringstream, but it doesn't seem to be able to work with binary data of given length.

Am I inventing my own wheel here?

EDIT

  • It must not copy the input data, just create something that will iterate over the data.
  • It must be portable - at least it should work both under gcc and MSVC.
Boneblack answered 17/1, 2010 at 4:7 Comment(2)
What version of MSVC? >6, I hope. ;)Strychnine
I think your solution is good. :) #1448967Leshalesher
S
33

I'm assuming that your input data is binary (not text) and that you want to extract chunks of binary data from it, all without making a copy of your input data.

You can combine boost::iostreams::basic_array_source and boost::iostreams::stream_buffer (from Boost.Iostreams) with boost::archive::binary_iarchive (from Boost.Serialization) to be able to use convenient extraction >> operators to read chunks of binary data.

#include <stdint.h>
#include <iostream>
#include <boost/iostreams/device/array.hpp>
#include <boost/iostreams/stream.hpp>
#include <boost/archive/binary_iarchive.hpp>

int main()
{
    uint16_t data[] = {1234, 5678};
    char* dataPtr = (char*)&data;

    typedef boost::iostreams::basic_array_source<char> Device;
    boost::iostreams::stream_buffer<Device> buffer(dataPtr, sizeof(data));
    boost::archive::binary_iarchive archive(buffer, boost::archive::no_header);

    uint16_t word1, word2;
    archive >> word1 >> word2;
    std::cout << word1 << "," << word2 << std::endl;
    return 0;
}

With GCC 4.4.1 on AMD64, it outputs:

1234,5678

Boost.Serialization is very powerful and knows how to serialize all basic types, strings, and even STL containers. You can easily make your types serializable. See the documentation. Hidden somewhere in the Boost.Serialization sources is an example of a portable binary archive that knows how to perform the proper swapping for your machine's endianness. This might be useful to you as well.

If you don't need the fanciness of Boost.Serialization and are happy to read the binary data in an fread()-type fashion, you can use basic_array_source in a simpler way:

#include <stdint.h>
#include <iostream>
#include <boost/iostreams/device/array.hpp>
#include <boost/iostreams/stream.hpp>

int main()
{
    uint16_t data[] = {1234, 5678};
    char* dataPtr = (char*)&data;

    typedef boost::iostreams::basic_array_source<char> Device;
    boost::iostreams::stream<Device> stream(dataPtr, sizeof(data));

    uint16_t word1, word2;
    stream.read((char*)&word1, sizeof(word1));
    stream.read((char*)&word2, sizeof(word2));
    std::cout << word1 << "," << word2 << std::endl;

    return 0;
}

I get the same output with this program.
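
The fread()-style read is not specific to Boost; it works on any std::istream. As a rough sketch, the repeated read/cast pairs could be wrapped in a small helper. read_value and round_trip are hypothetical names, and std::istringstream stands in for the Boost stream here only to keep the example dependency-free (it does copy its input):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical helper: read sizeof(T) raw bytes from the stream into out.
// Returns false (leaving out untouched) if the stream runs short.
template <typename T>
bool read_value(std::istream& in, T& out)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "read_value only makes sense for trivially copyable types");
    char bytes[sizeof(T)];
    if (!in.read(bytes, static_cast<std::streamsize>(sizeof bytes)))
        return false;
    assert(in.gcount() == static_cast<std::streamsize>(sizeof bytes));
    std::memcpy(&out, bytes, sizeof bytes);  // host byte order, no swapping
    return true;
}

// Usage sketch: store two words in host byte order, then read them back.
inline bool round_trip()
{
    const std::uint16_t data[] = {1234, 5678};
    std::string raw(reinterpret_cast<const char*>(data), sizeof data);
    std::istringstream in(raw);
    std::uint16_t w1 = 0, w2 = 0, extra = 0;
    const bool ok = read_value(in, w1) && read_value(in, w2);
    // A third read must fail because the input is exhausted.
    return ok && w1 == 1234 && w2 == 5678 && !read_value(in, extra);
}
```

Like the examples above, this assumes the data was written on a machine with the same endianness.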

Strychnine answered 17/1, 2010 at 5:20 Comment(3)
Great stuff, Emile. Maybe it's not "simpler", but it's more generic for sure. Thank you!Boneblack
The reason only fstream objects have the concept of binary is that text and binary modes are exactly the same apart from how they handle end of line. In terms of actions for stream operations they are no different. The way you are using the stream operators to try and read integers from a char stream is not how the stream operators are supposed to work.Placement
I just want to register disagreement with Loki. The << and >> operators are not "supposed to work" the same when used on other data types. If you go in that direction, you'd have no choice but to conclude that iostreams using them as "stream operators" is broken, because that's not how the bitshift operators (yes, that's their true name) are supposed to work.Ravishment
T
6

I'm not sure what you need, but does this do what you want?

char *mybuffer;
size_t length;
// allocate, fill, set length, as before

std::string data(mybuffer, length);
std::istringstream mb(data);
//use mb
Topic answered 17/1, 2010 at 4:22 Comment(5)
No, this will truncate the string to the first occurrence of 0x00 byte in the buffer. I specifically need to create a fixed-length stream that I can read binary data from, when I have this data in a known place in memory.Boneblack
I don't see why that would handle NUL bytes specially. Notice length being passed to std::string constructor separately.Photocell
Are you sure? The string constructor that takes (char*, size_t) does not treat the char* param as a c-style (null-terminated) string. It uses the length you give it.Topic
Yes, sorry, you're right. I was misled by VS debugger visualizer. It truncated the string on NUL character. The remainder of data is put into string.Boneblack
While your approach gives me what I need, it does have a side effect of the input data being copied into the data string. I would like to avoid that. Sorry for not making this clear earlier.Boneblack
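
To make the two points from this thread concrete, here is a small sketch (stream_round_trip is a hypothetical name): the (char*, size_t) constructor keeps embedded NUL bytes, but the data is copied on the way into the stream.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Copy (p, n) into a std::string, stream it, and return what the stream yields.
inline std::string stream_round_trip(const char* p, std::size_t n)
{
    std::string data(p, n);        // copies exactly n bytes, NULs included
    assert(data.size() == n);
    std::istringstream mb(data);   // copies again: the stream owns its own string
    return std::string(std::istreambuf_iterator<char>(mb),
                       std::istreambuf_iterator<char>());
}
```

So the result is complete even with 0x00 bytes in the middle, at the cost of the copies the question wants to avoid.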
P
4

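The standard stream buffer has this functionality.
Create a stream, get its buffer, then override the storage it uses. (As the comments below point out, the effect of pubsetbuf is implementation-defined: this works with GCC's library but not with MSVC's.)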

#include <sstream>
#include <iostream>
#include <algorithm>
#include <iterator>

int main()
{
    // Your imaginary buffer
    char    buffer[]    = "A large buffer we don't want to copy but use in a stream";

    // An ordinary stream.
    std::stringstream   str;

    // Get the stream's buffer object. Reset the actual buffer being used.
    str.rdbuf()->pubsetbuf(buffer,sizeof(buffer));

    std::copy(std::istreambuf_iterator<char>(str),
              std::istreambuf_iterator<char>(),
              std::ostream_iterator<char>(std::cout)
             );
}
Placement answered 17/1, 2010 at 5:17 Comment(7)
Hmmm... I put this to test, and it doesn't seem to work under MSVC and its libs. It works on gcc, though.Boneblack
Looks like MSVC's basic_streambuf::pubsetbuf doesn't do anything, at all. #1494682Boneblack
And it can't be even called a bug, as C++ standards don't define what is the expected behavior of calling setbuf: open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1905.pdf - Appendix D.7.26Boneblack
@ybungalobill: Yes, so. Implementation-defined does not mean undefined; there is a big difference. Implementation-defined means the implementers have some leeway, but it still works (as expected).Placement
@Martin: You are just fortunate it works on your implementation, which defined it this way. My implementation, and @Marcin's too, doesn't implement it this way. Our implementation is still a standard-conforming implementation.Curly
@ybungalobill: You do realize that the section mentioned, D.7.26, is for the deprecated strstreambuf, not the stringbuf stream buffer that we are using above? Read: 27.5.2.4.2Placement
@Martin: @ybungalobill is correct. It's section 27.8.1.4 in the FCD, and it's still implementation-defined for all combinations of parameters except setbuf(0, 0).Ravishment
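
Since the behaviour differs between libraries, one way to find out what a given implementation does is to probe it at run time. This is only a sketch, and stringbuf_adopts_buffer is a hypothetical name. It hands a known array to pubsetbuf and checks whether reads actually come from that array:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <streambuf>
#include <string>

// Returns true if this standard library's stringbuf reads from the caller's
// array after pubsetbuf(), false if the call had no visible effect.
inline bool stringbuf_adopts_buffer()
{
    char buffer[] = "probe";
    const std::streamsize n = static_cast<std::streamsize>(sizeof(buffer) - 1);

    std::stringstream str;
    std::streambuf* sb = str.rdbuf();
    assert(sb != nullptr);
    sb->pubsetbuf(buffer, n);      // effect is implementation-defined

    char got[sizeof(buffer)] = {}; // zero-filled, so always NUL-terminated
    str.read(got, n);
    return str.gcount() == n && std::string(got) == "probe";
}
```

If the probe returns false, the pubsetbuf approach silently yields an empty stream on that library, and a custom streambuf like the one in the question is the safer route.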
T
2

The questioner wanted something that doesn't copy the data, and his solution works fine. My contribution is to clean it up a little, so you can just create a single object that's an input stream for data in memory. I have tested this and it works.

class MemoryInputStream: public std::istream
    {
    public:
    MemoryInputStream(const uint8_t* aData,size_t aLength):
        std::istream(&m_buffer),
        m_buffer(aData,aLength)
        {
        rdbuf(&m_buffer); // reset the buffer after it has been properly constructed
        }

    private:
    class MemoryBuffer: public std::basic_streambuf<char>
        {
        public:
        MemoryBuffer(const uint8_t* aData,size_t aLength)
            {
            setg((char*)aData,(char*)aData,(char*)aData + aLength);
            }
        };

    MemoryBuffer m_buffer;
    };
Tewfik answered 16/5, 2012 at 9:18 Comment(1)
Damn - it doesn't work because seekg, etc., won't work by default - you have to override various streambuf functions.Tewfik
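
For what it's worth, here is a sketch of the overrides that seem to be needed, assuming a read-only buffer. seekable_membuf is a hypothetical name. seekg() and tellg() go through seekoff() and seekpos(), whose default implementations in std::streambuf just report failure:

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <streambuf>

class seekable_membuf : public std::streambuf
{
public:
    seekable_membuf(const char* p, std::size_t n)
    {
        assert(p != nullptr || n == 0);
        char* b = const_cast<char*>(p);  // never written: no put area is set
        setg(b, b, b + n);
    }

protected:
    // Reached via seekg(off, dir) and tellg().
    pos_type seekoff(off_type off, std::ios_base::seekdir dir,
                     std::ios_base::openmode which = std::ios_base::in) override
    {
        if (!(which & std::ios_base::in))
            return pos_type(off_type(-1));   // input only
        off_type base = 0;                   // std::ios_base::beg
        if (dir == std::ios_base::cur)
            base = gptr() - eback();
        else if (dir == std::ios_base::end)
            base = egptr() - eback();
        const off_type target = base + off;
        if (target < 0 || target > egptr() - eback())
            return pos_type(off_type(-1));   // out of range
        setg(eback(), eback() + target, egptr());
        return pos_type(target);
    }

    // Reached via seekg(pos).
    pos_type seekpos(pos_type pos,
                     std::ios_base::openmode which = std::ios_base::in) override
    {
        return seekoff(off_type(pos), std::ios_base::beg, which);
    }
};
```

Dropping these two overrides into MemoryBuffer above should be enough for seekg and tellg on an input-only stream; putback beyond what the get area already allows is not handled.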
