Which is a better C++ container for holding and accessing binary data?
std::vector<unsigned char>
or
std::string
Is one more efficient than the other?
Is one a more 'correct' usage?
You should prefer std::vector over std::string. In common cases both solutions can be almost equivalent, but std::strings are designed specifically for strings and string manipulation, and that is not your intended use.
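As a rough illustration of the buffer use case, here is a minimal sketch of a std::vector<unsigned char> handed to a C-style function. The names sum_bytes, make_sample_buffer and sum_sample are made up for this example and stand in for whatever API actually consumes the bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style consumer: sums `len` raw bytes.
// It stands in for any function that takes a pointer and a length.
unsigned long sum_bytes(const unsigned char* data, std::size_t len) {
    unsigned long total = 0;
    for (std::size_t i = 0; i < len; ++i) {
        total += data[i];
    }
    return total;
}

// Builds a small buffer containing a zero byte and a value above 0x7F,
// neither of which needs special handling in a byte vector.
std::vector<unsigned char> make_sample_buffer() {
    std::vector<unsigned char> buf;
    buf.push_back(0x00);
    buf.push_back(0x7F);
    buf.push_back(0xFF);
    return buf;
}

// Passes the vector's contiguous storage to the C-style function.
unsigned long sum_sample() {
    const std::vector<unsigned char> buf = make_sample_buffer();
    assert(!buf.empty());
    return sum_bytes(buf.data(), buf.size());
}
```

The element type is unsigned char, so each byte reads back as a value in 0..255 without any casts.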
char_traits<char> and with the standard specialization, assignment, comparisons and ordering are defined as the equivalent for the built-in char type. – Disrelish
std::vector better suits what you want from a buffer, so if only because the intention is clearer (as fnieto points out in his answer) I would prefer std::vector. – Disrelish
Both are correct and equally efficient. Using one of them instead of a plain array only eases memory management and passing them as arguments.
I use vector because the intention is more clear than with string.
Edit: The C++03 standard does not guarantee std::basic_string memory contiguity. However, from a practical viewpoint, there are no commercial non-contiguous implementations. C++0x is set to standardize that fact.
Is one more efficient than the other?
This is the wrong question.
Is one a more 'correct' usage?
This is the correct question.
It depends. How is the data being used? If you are going to use the data in a string-like fashion then you should opt for std::string, as using a std::vector may confuse subsequent maintainers. If, on the other hand, most of the data manipulation looks like plain maths or is vector-like, then a std::vector is more appropriate.
For the longest time I agreed with most answers here. However, just today it hit me why it might be wiser to actually use std::string over std::vector<unsigned char>.
As most agree, using either one will work just fine. But oftentimes, file data can actually be in text format (more common now with XML having become mainstream). Holding it in a string makes it easy to view in the debugger when it becomes pertinent (and these debuggers will often let you navigate the bytes of the string anyway). But more importantly, many existing functions that can be used on a string could easily be used on file/binary data. I've found myself writing multiple functions to handle both strings and byte arrays, and realized how pointless it all was.
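To make the "reuse existing string functions" point concrete, here is a small sketch, assuming the bytes are loaded into a std::string with the pointer-plus-length constructor so that embedded zero bytes survive. The helpers make_blob and find_marker and the sample bytes are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Wraps raw bytes in a std::string. The (pointer, length) constructor
// copies exactly `sizeof raw` bytes, so the embedded '\0' is kept.
std::string make_blob() {
    const char raw[] = {'P', 'K', '\0', '\x03', 'd', 'a', 't', 'a'};
    return std::string(raw, sizeof raw);
}

// Returns the offset of `marker` inside `blob`, or std::string::npos.
// This is plain std::string::find applied to binary content.
std::size_t find_marker(const std::string& blob, const std::string& marker) {
    assert(!marker.empty());
    return blob.find(marker);
}
```

Note that constructing the string from a bare const char* (without the length) would stop at the first zero byte, so the length must always be passed explicitly for binary data.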
This is a comment on dribeas's answer. I am writing it as an answer so that I can format the code.
This is the char_traits compare function, and the behaviour is quite healthy:
    static bool
    lt(const char_type& __c1, const char_type& __c2)
    { return __c1 < __c2; }

    template<typename _CharT>
      int
      char_traits<_CharT>::
      compare(const char_type* __s1, const char_type* __s2, std::size_t __n)
      {
        for (std::size_t __i = 0; __i < __n; ++__i)
          if (lt(__s1[__i], __s2[__i]))
            return -1;
          else if (lt(__s2[__i], __s1[__i]))
            return 1;
        return 0;
      }
assign, eq and lt must be defined as builtin operators =, == and < for type char. – Disrelish
As far as readability is concerned, I prefer std::vector. std::vector should be the default container in this case: the intent is clearer and, as was already said in other answers, on most implementations it is also more efficient.
On one occasion I did prefer std::string over std::vector though. Let's look at the signatures of their move constructors in C++11:
vector (vector&& x);
string (string&& str) noexcept;
On that occasion I really needed a noexcept move constructor. std::string provides it and std::vector does not.
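If you want to check this on your own toolchain, the type traits can tell you. A minimal sketch follows; the helper name vector_move_is_noexcept is made up for the example. Keep in mind that C++17 later added noexcept to std::vector's move constructor, and several standard library implementations already marked it noexcept before that, so the result for vector depends on the compiler and the language mode.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// std::string's move constructor has been declared noexcept since C++11,
// so this holds on any conforming implementation.
static_assert(std::is_nothrow_move_constructible<std::string>::value,
              "std::string move construction must not throw");

// Reports whether this toolchain treats moving a byte vector as
// non-throwing. C++17 requires it; under C++11/14 the standard left it
// to the implementation.
constexpr bool vector_move_is_noexcept() {
    return std::is_nothrow_move_constructible<
        std::vector<unsigned char> >::value;
}
```

Calling vector_move_is_noexcept() in a static_assert is one way to make a build fail early if the guarantee you rely on is missing.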
If you just want to store your binary data, you can use bitset, which optimizes for space allocation. Otherwise go for vector, as it's more appropriate for your usage.
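One practical difference worth seeing in code: std::bitset takes its size as a template argument, while a vector is sized at run time. The sketch below assumes nothing beyond the standard library; byte_as_bits and make_buffer are names invented for the illustration.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// std::bitset needs its size at compile time (here 8 bits for one byte).
// to_string() yields one '0'/'1' character per bit, most significant first.
std::string byte_as_bits(unsigned char byte) {
    const std::string bits = std::bitset<8>(byte).to_string();
    assert(bits.size() == 8);
    return bits;
}

// A vector picks its size at run time, which is what a buffer for file
// or network data of unknown length usually needs.
std::vector<unsigned char> make_buffer(std::size_t runtime_length) {
    return std::vector<unsigned char>(runtime_length, 0);
}
```

So a bitset fits a fixed-width set of flags, whereas a buffer whose length is only known when the data arrives needs something like a vector.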
vector is better. – Bottomry
bitset is efficient at storing binary data - I never said bitset was an STL container. And creating that "pretty big string" (which would use unsigned char, btw) is trivial. Also, everything I've seen till now (sample code on my compiler, Googling and Effective STL (pg.70)) indicates that bitset does store binary data effectively. And yes, there is a better way to store binary data, and it's bitset - have you tried it out on your compiler? It's only two lines of code. – Bottomry
not since it's the same as storing the data, you can use bitset::to_string. And yes, you need a 10 MB string - that's the whole point of using bitset. Suppose you have an array of bits which you've obtained as unsigned chars after some logical operation perhaps, and it's 10MB and you want to store it in memory - what do you do? bitset! – Bottomry
std::vector<bool> is a specialization that is optimized for space, not that the standard committee is happy about it), so at that point it won't even take more memory than a bitset. – Disrelish
Compare the two and choose for yourself which is more specific to your needs. Both are very robust and work with STL algorithms. Choose for yourself which is more effective for your task.
Personally I prefer std::string because string::data() is much more intuitive for me when I want my binary buffer back in C-compatible form. I know that vector elements are guaranteed to be stored contiguously, but exercising this in code feels a little bit unsettling.
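For comparison, here is a minimal sketch of getting the buffer back in C-compatible form from both containers. It assumes C++11, where std::vector also has a data() member; copy_out and round_trip_matches are hypothetical names, with copy_out standing in for any C API that takes a pointer and a length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical C-style sink: copies `len` bytes into a caller-provided buffer.
void copy_out(const void* src, std::size_t len, unsigned char* dst) {
    std::memcpy(dst, src, len);
}

// Hands the same three bytes (including a zero byte) to the C-style
// function from a std::string and from a std::vector, then compares.
bool round_trip_matches() {
    const std::string s("\x01\x00\x02", 3);
    const std::vector<unsigned char> v = {0x01, 0x00, 0x02};
    assert(s.size() == v.size());

    unsigned char from_string[3];
    unsigned char from_vector[3];
    copy_out(s.data(), s.size(), from_string);
    copy_out(v.data(), v.size(), from_vector);
    return std::memcmp(from_string, from_vector, sizeof from_string) == 0;
}
```

Both calls look the same at the call site, so the choice here really does come down to which one reads more naturally to you.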
This is a style decision that an individual developer or a team should make for themselves.