Are std::fill, std::copy specialized for std::vector<bool>?

Asked 14/9, 2012 at 23:18 Answered 26/9, 2020 at 7:59

Solved c++ vector stl-algorithm bitvector

While thinking about this question, I started to wonder whether std::copy() and/or std::fill() are specialized (I really mean optimized) for std::vector<bool>.

Is this required by the C++ standard, or is it perhaps a common approach among C++ standard library vendors?

Simply put, I would like to know whether the following code:

std::vector<bool> v(10, false);
std::fill(v.begin(), v.end(), true);

is in any way better than, or different from, this:

std::vector<bool> v(10, false);
for (auto it = v.begin(); it != v.end(); ++it) *it = true;

To be very strict: can, say, std::fill<std::vector<bool>::iterator>() reach into the internal representation of std::vector<bool> and set entire bytes instead of single bits? I assume that making std::fill a friend of std::vector<bool> is not a big problem for a library vendor.

[UPDATE]

A related follow-up question: can I (or anybody else :) specialize such algorithms for, say, std::vector<bool>, if they are not already specialized? Is this allowed by the C++ standard? I know this would be non-portable, but what about doing it for just one selected C++ standard library, assuming I (or anybody else) find a way to get at the private parts of std::vector<bool>?

Morna answered 14/9, 2012 at 23:18 Comment(5)

It's not required. It is however, allowed. I have no idea if any vendor does it, though. – Theater 14/9, 2012 at 23:35

for the most part it's not safe to specialize most things in the standard library. You could, however, make a fill function in any other namespace and optimize that for vector<bool> – Chopfallen 14/9, 2012 at 23:50

Only tangentially related, but you might be interested in this and this question/answer I posted sometime ago. – Overwrought 15/9, 2012 at 5:14

Coming late to this question: See the article: isocpp.org/blog/2012/11/on-vectorbool The article was written using libc++ (libcxx.llvm.org). The article goes over several generic std::algorithms demonstrating what can be accomplished if they are optimized for vector<bool>. – Padraig 26/2, 2013 at 0:35

@HowardHinnant thanks, very interesting. I did not know the exact numbers, but expected similar results. That is why I asked my question. – Morna 26/2, 2013 at 10:35

The standard library is mostly headers, and it ships with your compiler, so you can look into those headers yourself. GCC's vector<bool> implementation is in stl_bvector.h. Other compilers keep it in their own headers, which you can inspect in the same way. And yes, GCC has a specialized fill (look near __fill_bvector).
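To make the idea concrete, here is a minimal sketch of what a word-wise fill over packed bits can look like. It is not GCC's code: fill_bits is a hypothetical helper, and the buffer is assumed to be a plain std::vector<std::uint64_t> with bit 0 of each word being the lowest-numbered bit. The first and last words are updated through masks, and everything in between is written a whole word at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from any standard library): sets bits
// [first, last) of a packed bit buffer to `value`.
void fill_bits(std::vector<std::uint64_t>& words,
               std::size_t first, std::size_t last, bool value)
{
    const std::size_t bits = 64;              // bits per word in this sketch
    assert(last <= words.size() * bits);      // range must fit in the buffer
    if (first >= last) return;                // empty range: nothing to do

    const std::size_t first_word = first / bits;
    const std::size_t last_word  = (last - 1) / bits;  // word holding the final bit

    // Bits at or above `first` within the first word.
    const std::uint64_t head = ~std::uint64_t(0) << (first % bits);
    // Bits at or below `last - 1` within the last word.
    const std::uint64_t tail =
        ~std::uint64_t(0) >> (bits - 1 - (last - 1) % bits);

    // Read-modify-write so that bits outside the mask are preserved.
    auto apply = [&](std::size_t w, std::uint64_t mask) {
        if (value) words[w] |= mask;
        else       words[w] &= ~mask;
    };

    if (first_word == last_word) {            // range lies inside one word
        apply(first_word, head & tail);
        return;
    }
    apply(first_word, head);                  // partial first word
    for (std::size_t w = first_word + 1; w < last_word; ++w)
        words[w] = value ? ~std::uint64_t(0) : 0;   // whole words in between
    apply(last_word, tail);                   // partial last word
}
```

Only the two boundary words need the masking; the loop in the middle is the part that makes this cheaper than touching one bit at a time.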

Shaffert answered 15/9, 2012 at 5:7 Comment(0)

Optimizations are nowhere mandated in the standard. Whether an optimization is applied is considered a "quality of implementation" issue. The asymptotic complexity of most algorithms is, however, restricted.

Optimizations are allowed as long as a correct program behaves according to what the standard mandates. The examples you ask about, i.e., optimizations involving standard algorithms using iterators on std::vector<bool>, can achieve their objective pretty much in any way the implementation sees fit because there is no way to monitor how they are implemented. This said, I doubt very much that there is any standard library implementation optimizing operations on std::vector<bool>. Most people seem to think that this specialization is an abomination in the first place and that it should go away.

A user is only allowed to create specializations of library types if the specialization involves at least one user-defined type. I don't think a user is allowed to provide any function in namespace std at all: there is no need, because all such functions would involve a user-defined type and would thus be found in the user's namespace. Put differently: I think you are out of luck with respect to getting algorithms optimized for std::vector<bool> for the time being. You might, however, consider contributing optimized versions to the open-source implementations (e.g., libstdc++ and libc++).
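If you want a vector<bool>-aware fill today, the portable route is to put it in your own namespace rather than in std. A minimal sketch, where the namespace my and both functions are hypothetical names: a generic overload forwards to std::fill, and a whole-container overload for std::vector<bool> uses assign(). Whether assign() is actually faster than std::fill depends on the library, so treat it as a hook to measure and tune, not as a guaranteed win.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

namespace my {  // hypothetical user namespace; nothing is added to std

// Generic fallback: simply forwards to the standard algorithm.
template <class ForwardIt, class T>
void fill(ForwardIt first, ForwardIt last, const T& value)
{
    std::fill(first, last, value);
}

// Whole-container overload for std::vector<bool>. It only uses the public
// interface, so it stays portable; any speed-up is up to the implementation.
inline void fill(std::vector<bool>& v, bool value)
{
    const std::size_t n = v.size();
    v.assign(n, value);        // replaces every element with `value`
    assert(v.size() == n);     // the size is unchanged
}

} // namespace my
```

Callers then write my::fill(v, true) for a std::vector<bool> v, and my::fill(first, last, x) for everything else.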

Stylolite answered 15/9, 2012 at 1:5 Comment(0)

There is no specialization for it, but you can still use std::fill on it (even though it's slow).

Here is a trick I found that makes filling a std::vector<bool> fast by going through the implementation-internal type std::_Vbase.

(WARNING: I've tested it only for MSVC2013, so it may not work on other compilers.)

int num_bits = 100000;
std::vector<bool> bit_set(num_bits , true);

int bitsize_elem = sizeof(std::_Vbase) * 8; // 1byte = 8bits
    
int num_elems = static_cast<int>(std::ceil(num_bits / static_cast<double>(bitsize_elem)));

An element is needed in full even if only one of its bits is used, so the number of elements must be rounded up.

Using this information, we build a vector of pointers to the underlying elements that store the bits.

std::vector<std::_Vbase*> elem_ptrs(num_elems, nullptr);

std::vector<bool>::iterator bitset_iter = bit_set.begin();
for (int i = 0; i < num_elems; ++i)
{
    std::_Vbase* elem_ptr = const_cast<std::_Vbase*>((*bitset_iter)._Myptr);
    elem_ptrs[i] = elem_ptr;
    std::advance(bitset_iter, bitsize_elem);
}

(*bitset_iter)._Myptr : Dereferencing a std::vector<bool> iterator yields the proxy reference object, which gives access to its member _Myptr.

Since _Myptr has type const std::_Vbase*, the constness is removed with const_cast.

This gives us std::_Vbase* elem_ptr, a pointer to the underlying element that holds those bits.

elem_ptrs[i] = elem_ptr : Record this pointer,...

std::advance(bitset_iter, bitsize_elem) : ...and move on to the next element by skipping the bits held by the previous one.

std::fill(elem_ptrs[0], elem_ptrs[0] + num_elems, 0); // set every bit to false
std::fill(elem_ptrs[0], elem_ptrs[0] + num_elems, -1); // set every bit to true

Now we can run std::fill over the underlying elements through these pointers, rather than over individual bits. Note that this relies on the elements being stored contiguously, starting at elem_ptrs[0].

Some may feel uncomfortable using implementation internals from outside and casting away constness.

But if you don't care about that and want something fast, this was the fastest approach in my tests.

I ran the comparisons below (new project, default configuration, Release, x64).

int it_max = 10; // do it 10 times ...
int num_bits = std::numeric_limits<int>::max(); // 2147483647

std::vector<bool> bit_set(num_bits, true);
for (int it_count = 0; it_count < it_max; ++it_count)
{
    std::fill(elem_ptrs[0], elem_ptrs[0] + num_elems, 0);
} // Elapsed time: 0.397 sec

for (int it_count = 0; it_count < it_max; ++it_count)
{
    std::fill(bit_set.begin(), bit_set.end(), false);
} // Elapsed time: 18.734 sec

for (int it_count = 0; it_count < it_max; ++it_count)
{
    for (int i = 0; i < num_bits; ++i)
    {
        bit_set[i] = false;
    }
} // Elapsed time: 21.498 sec

for (int it_count = 0; it_count < it_max; ++it_count)
{
    bit_set.assign(num_bits, false);
} // Elapsed time: 21.779 sec

for (int it_count = 0; it_count < it_max; ++it_count)
{
    // Call swap() on the temporary: passing a temporary to swap() only compiles as an MSVC extension.
    std::vector<bool>(num_bits, false).swap(bit_set); // elem_ptrs can no longer be used after this
} // Elapsed time: 1.3 sec

There is one caveat: once you swap() the original vector with another one, the vector of pointers becomes useless!
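If you want to repeat the portable part of this comparison on your own compiler, a small timing harness along these lines may help. This is only a sketch: time_seconds, FillTimings and compare_fills are hypothetical names, std::chrono::steady_clock is assumed to be good enough for coarse timings, and an optimizer is free to simplify such loops, so treat the numbers as rough indications. The _Myptr trick is deliberately left out because it only compiles against MSVC internals.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs `op` `repeats` times, returns elapsed seconds.
template <class Op>
double time_seconds(int repeats, Op op)
{
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) op();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Hypothetical result type: one timing per strategy, in seconds.
struct FillTimings {
    double with_fill;
    double with_loop;
    double with_assign;
};

// Times three ways of clearing every bit of `bits`.
FillTimings compare_fills(std::vector<bool>& bits, int repeats)
{
    const std::size_t n = bits.size();
    FillTimings t;
    t.with_fill = time_seconds(repeats, [&] {
        std::fill(bits.begin(), bits.end(), false);
    });
    t.with_loop = time_seconds(repeats, [&] {
        for (std::size_t i = 0; i < n; ++i) bits[i] = false;
    });
    t.with_assign = time_seconds(repeats, [&] {
        bits.assign(n, false);
    });
    return t;
}
```

Calling compare_fills(bit_set, 10) on a large vector gives three numbers to compare; which strategy wins depends on the library and compiler version.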

Centering answered 26/9, 2020 at 7:59 Comment(0)

23.2.5 Class vector from the C++ International Standard goes as far as to tell us

To optimize space allocation, a specialization of vector for bool elements is provided:

after which the specialization itself is declared. That's as far as the standard goes regarding vector<bool>; vendors are expected to implement it as a packed bit array to optimize for space. Optimizing for space comes at a cost here: it is not optimized for speed.

It is easier to fetch a book from a library shelf than to find it among books packed tightly together in containers.

Take your example: you're trying to do a std::fill or std::copy from begin to end. But that's not always the case; sometimes the range doesn't map onto whole bytes. That's a bit of a problem in terms of speed optimization. It's easy when every bit has to become one, since that's just setting each byte to 0xFF, but that's not the general case; it becomes much harder if you only change certain bits of a byte. Then you need to actually compute what the byte will be, and that's not a trivial thing to do*, or at least not as an atomic operation on current hardware.

It's the premature optimization story: nice in terms of space but horrible in terms of performance.

Is an "is a multiple of 8 bits" check worth the overhead? I doubt it.

* We're talking about multiple bits here; if it's just one bit, you can of course do a single bit operation.
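For reference, "computing what the byte will be" for a partial range comes down to building a mask and doing one read-modify-write. A minimal sketch, assuming bit 0 is the least significant bit; set_bit_range is a hypothetical helper, not a library function:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns `byte` with bits [lo, hi) set to `value`.
// Bit 0 is the least significant bit; requires lo <= hi <= 8.
std::uint8_t set_bit_range(std::uint8_t byte, unsigned lo, unsigned hi, bool value)
{
    assert(lo <= hi && hi <= 8);
    // (1 << hi) - 1 has the low `hi` bits set; removing the low `lo` bits
    // from it leaves exactly bits lo .. hi-1.
    const unsigned mask = ((1u << hi) - 1u) & ~((1u << lo) - 1u);
    const unsigned result = value ? (byte | mask) : (byte & ~mask);
    return static_cast<std::uint8_t>(result);
}
```

The read-modify-write is cheap, but it is not atomic by itself, which is the point made above.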

Cupel answered 15/9, 2012 at 1:0 Comment(1)

Frankly, I thought about more complicated cases where start,stop are not equal to begin(), end(). – Morna 15/9, 2012 at 6:19
