How to dump std::vector<bool> in a binary file?
D

3

6

I am writing tools to dump and load common objects to and from a binary file. As a first quick implementation, I wrote the following code for std::vector<bool>. It works, but it is clearly not memory-efficient.

template <>
void binary_write(std::ofstream& fout, const std::vector<bool>& x)
{
    std::size_t n = x.size();
    fout.write((const char*)&n, sizeof(std::size_t));
    for(std::size_t i = 0; i < n; ++i)
    {
        bool xati = x.at(i);
        binary_write(fout, xati);
    }
}

template <>
void binary_read(std::ifstream& fin, std::vector<bool>& x)
{
    std::size_t n;
    fin.read((char*)&n, sizeof(std::size_t));
    x.resize(n);
    for(std::size_t i = 0; i < n; ++i)
    {
        bool xati;
        binary_read(fin, xati);
        x.at(i) = xati;
    }
}

How can I copy the internal memory of a std::vector<bool> into my stream?

Note: I don't want to replace std::vector<bool> with something else.

Deficiency answered 14/4, 2015 at 9:19 Comment(6)
Even if you are already using std::vector<bool> elsewhere in the code, I strongly suggest you move to something like std::bitset or boost::dynamic_bitset and use their to_string functionality, or their ostream overloads of operator<<.Antipodes
to_string for binary storage? Really? ^^Deficiency
Right, not my smartest comment ;). Still, after looking up the functionality of std::bitset, that seems like the only way to go (bitset->string->integer of some kind). That, or fetching the bits one by one. I'm curious which would be faster... Hmm on second thought, just stick with std::vector<bool> (see e.g. this question)Antipodes
Making data persistent is the job of a serializer. No need to handcraft that.Caesarea
@Klaus: writing a serializer with specific needs is my job. I don't need judgment on the relevance of the question. I need solutions. ;-)Deficiency
As pointed out by a user in the first comment of this answer, vector<bool> doesn't store its bools contiguously in memory.Gaby
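The point made in the last comment can be checked at compile time. The following is only a small sketch (the function name vector_bool_uses_proxy is made up for illustration): for an ordinary vector, reference is a plain T&, while for std::vector<bool> it is a proxy class, which is why there is no bool array to hand to write().

```cpp
#include <type_traits>
#include <vector>

// For an ordinary element type, operator[] yields a real reference
// into contiguous storage.
static_assert(std::is_same<std::vector<int>::reference, int&>::value,
              "vector<int>::reference is int&");

// Hypothetical helper: true when vector<bool>::reference is NOT bool&,
// i.e. when element access goes through a proxy object.
constexpr bool vector_bool_uses_proxy()
{
    return !std::is_same<std::vector<bool>::reference, bool&>::value;
}
```

Because of that proxy, the elements have to be read one by one (or through implementation-specific internals) before they can be written as bytes.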
D
3

Answering my own question. This is currently accepted as the best answer, but that can change if someone provides something better.

One way to do it is the following. It requires accessing each value individually, but it works.

template <>
void binary_write(std::ofstream& fout, const std::vector<bool>& x)
{
    // Store the number of bits first so the reader knows how many to unpack.
    std::vector<bool>::size_type n = x.size();
    fout.write((const char*)&n, sizeof(std::vector<bool>::size_type));
    for(std::vector<bool>::size_type i = 0; i < n;)
    {
        // Pack up to 8 consecutive values into one byte, first value in bit 0.
        unsigned char aggr = 0;
        // mask wraps to 0 after the 8th shift, which ends the inner loop.
        for(unsigned char mask = 1; mask > 0 && i < n; ++i, mask <<= 1)
            if(x.at(i))
                aggr |= mask;
        fout.write((const char*)&aggr, sizeof(unsigned char));
    }
}

template <>
void binary_read(std::ifstream& fin, std::vector<bool>& x)
{
    // Read the number of bits, then unpack them byte by byte.
    std::vector<bool>::size_type n;
    fin.read((char*)&n, sizeof(std::vector<bool>::size_type));
    x.resize(n);
    for(std::vector<bool>::size_type i = 0; i < n;)
    {
        // Initialized so that a failed read does not leave it indeterminate.
        unsigned char aggr = 0;
        fin.read((char*)&aggr, sizeof(unsigned char));
        for(unsigned char mask = 1; mask > 0 && i < n; ++i, mask <<= 1)
            x.at(i) = (aggr & mask) != 0;
    }
}
Deficiency answered 14/4, 2015 at 13:12 Comment(3)
Writing the size like that isn't endian safe. Also, the size of a vector<bool> is a std::vector<bool>::size_type, which isn't necessarily the same as an unsigned int.Gardel
You're right, but it seems practically impossible for the size of an existing vector to overflow an unsigned long int, because unsigned long int >= void* >= size of the RAM >= size of the vector. In this specific case bits are counted, so the number of elements could be greater than the RAM size, but the template constraint lets me think that the same integral type is used for all vectors. In my case size_type is size_t, which is a 64-bit unsigned integer, the same as unsigned long int. The bug can occur only with hypothetical compilers and a really huge vector of bools.Deficiency
I modified the answer according to your comment. I hate using size types when they affect code outside the scope (for example when tied to data from the user), because it becomes really unclear for the developer to manage. But that's not the case here.Deficiency
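The endianness concern raised in these comments could be handled by writing the size as a fixed-width integer, byte by byte. This is only a sketch under assumptions: the names write_u64_le, read_u64_le and demo_size_round_trip are made up for illustration, and the choice of 8 bytes in little-endian order is arbitrary, not something the answer prescribes.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Write v as exactly 8 bytes, least significant byte first, independent of
// the host's byte order and of sizeof(std::vector<bool>::size_type).
void write_u64_le(std::ostream& out, std::uint64_t v)
{
    unsigned char buf[8];
    for (int i = 0; i < 8; ++i)
        buf[i] = static_cast<unsigned char>((v >> (8 * i)) & 0xFF);
    out.write(reinterpret_cast<const char*>(buf), sizeof buf);
}

// Read back a value written by write_u64_le; returns false on a short read.
bool read_u64_le(std::istream& in, std::uint64_t& v)
{
    unsigned char buf[8];
    if (!in.read(reinterpret_cast<char*>(buf), sizeof buf))
        return false;
    v = 0;
    for (int i = 0; i < 8; ++i)
        v |= static_cast<std::uint64_t>(buf[i]) << (8 * i);
    return true;
}

// Usage sketch: round-trip a size through an in-memory stream.
void demo_size_round_trip()
{
    std::stringstream ss;
    write_u64_le(ss, 1000);
    std::uint64_t n = 0;
    bool ok = read_u64_le(ss, n);
    assert(ok && n == 1000);
    (void)ok; // silence unused-variable warnings when NDEBUG is defined
}
```

In binary_write, the call would take static_cast<std::uint64_t>(x.size()) in place of the raw size_type, so a file written on one platform reads back identically on another.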
C
1

Sorry but the answer is you can't do this portably.

To do this non-portably, you can write a function specific to your standard library implementation's iterators for vector<bool>.

If you're lucky, the relevant fields will be public inside the iterators, so you don't have to change private to public.

Contagious answered 14/4, 2015 at 9:27 Comment(2)
Actually, I can do this portably by aggregating 8 values into 1 byte and storing this byte in my file. But I would prefer a nicer solution. :-)Deficiency
@Caduchon: You have to access the vector's bits individually, though. My point was that you can't avoid that.Contagious
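The portable approach discussed in these comments, aggregating 8 values per byte, can also be separated from the stream I/O. Below is a minimal sketch, assuming the same bit order as the mask loop in the accepted answer (element i in bit i % 8 of byte i / 8); the names pack_bits and unpack_bits are hypothetical. Each bit is still accessed individually, as the comment above says is unavoidable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pack bits LSB-first: element i goes to bit (i % 8) of byte (i / 8).
// Unused high bits of the last byte are left at 0.
std::vector<unsigned char> pack_bits(const std::vector<bool>& x)
{
    std::vector<unsigned char> bytes((x.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < x.size(); ++i)
        if (x[i])
            bytes[i / 8] |= static_cast<unsigned char>(1u << (i % 8));
    return bytes;
}

// Inverse of pack_bits; n is the number of bits that were originally stored.
std::vector<bool> unpack_bits(const std::vector<unsigned char>& bytes,
                              std::size_t n)
{
    assert(bytes.size() >= (n + 7) / 8); // caller must supply enough bytes
    std::vector<bool> x(n);
    for (std::size_t i = 0; i < n; ++i)
        x[i] = ((bytes[i / 8] >> (i % 8)) & 1u) != 0;
    return x;
}
```

The packed buffer is contiguous, so it can be written with a single fout.write(reinterpret_cast<const char*>(bytes.data()), bytes.size()) after the size, instead of one write call per byte.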
B
-1

Just use this bitstream:
https://github.com/redmms/FineStream
It will write vector<bool> as bits:

#include "finestream/finestream.h" // or "import finestream;"
fsm::ofinestream stream("output.txt");
vector <bool> v{true, false, true, false};
stream << v;

Then, if you need to read it back:

#include "finestream/finestream.h" // or "import finestream;"
fsm::ifinestream stream("output.txt");
vector <bool> v(4);
stream >> v;
Bewhiskered answered 24/1 at 14:12 Comment(0)
