If memcmp is equivalent to by-member equality comparisons, should you prefer implementing operator== using memcmp?
Asked Answered
A

7

24

Precondition: Consider such a class or struct T, that for two objects a and b of type T

memcmp(&a, &b, sizeof(T)) == 0

yields the same result as

a.member1 == b.member1 && a.member2 == b.member2 && ...

(memberN is a non-static member variable of T).

Question: When should memcmp be used to compare a and b for equality, and when should the chained ==s be used?


Here's a simple example:

struct vector
{
    int x, y;
};

To overload operator == for vector, there are two possibilities (if they're guaranteed to give the same result):

bool operator==(vector lhs, vector rhs)
{ return lhs.x == rhs.x && lhs.y == rhs.y; }

or

bool operator==(vector lhs, vector rhs)
{ return memcmp(&lhs, &rhs, sizeof(vector)) == 0; }

Now if a new member were to be added to vector, for example a z component:

  • If ==s were used to implement operator==, it would have to be modified.
  • If memcmp was used instead, operator== wouldn't have to be modified at all.

But I think using chained ==s conveys a clearer meaning. Although for a large T with many members memcmp is more tempting. Additionally, is there a performance improvement from using memcmp over ==s? Anything else to consider?

Ascender answered 4/3, 2015 at 15:31 Comment(17)
It would be nice if compilers could automatically generate operator== which calls == on all its members automatically. But they don't. :(Friel
"Premature optimization is the root of all evil. "- KnuthTwostep
Is this a question about usabilty or what are you asking about? As if I get you right, both could be used, but had to be clarified in which way == will be working.Pelage
The point is that when you add z to the struct, you're supposed to actively decide whether or not z participates in the value for equality purposes, and modify operator== or not accordingly, along with comparisons and hashcodes. So for example in vector the capacity doesn't participate in equality comparison (and of course there are plenty of other reasons why memcmp isn't even close to working for vector!). Yes, it would be nice to be able to = default it, but even if your T has this property, it won't necessarily still have the property after adding z.Monopode
Your original assumption (guarantee of same result) is impossible for the generic usecase. Even if you can guarantee it for a specific platform (HW+OS+compiler+...) this would become one huge issue if you would have to change platforms some time in the future.Megaton
"memberN is a non-static member variable of T. I.e. there's no padding, etc." The second part does not follow from the first. What makes you think, outside of non-standard settings on certain compilers, you can assume non-static members won't be padded?Ancilla
@PaulRoub The last sentence refers to the whole sentence before it, not just the member variable part. Is there a way I can make it clearer?Ascender
@Cheersandhth.-Alf: That quote is overused and grossly misunderstood. Splashing it on every performance-related question you see (is this even one of those?) does not magically mean that we should go around writing inefficient code for the heck of it. It relates to more extreme cases than that.Degeneracy
@zenith ... and it's still a faulty assumption. Change compilers, go from debug to optimized code and lose zero-filling, compile for 64-bit instead of 32-bit... and the assumption goes out the window. Your premise is quite clear, and dangerous -- but it's apparently very important to you, so I give up. You asked if there's "anything else to consider". Yes, there is. Whether you actually consider it is up to you.Ancilla
@Megaton That's a good point, although not relevant in my case. Would make a nice answer imo.Ascender
Note: at least in C, padding bytes are guaranteed to be zero in the case where the struct has static storage duration, see C11 6.7.9/10. If it has automatic storage, then padding bytes can contain anything. So there are (rare) cases where memcmp() will be perfectly safe, given that you know the storage duration of the struct.Burgle
@PaulRoub It's not an assumption. It's a supposition/conjecture. I'm only talking about cases where that is true. I'm aware that in reality it's not true everywhere.Ascender
A decent compiler will optimize a chain of == to a memcmp if he knows there is no padding. So performance-wise they are going to be the same, but operator== is portable.Phipps
@Phipps My "decent compiler" unrolls the comparisons when using memcmp instead of producing a loop and comparing each byte, if possible.Churchlike
@Churchlike it has a good reason to do so. Unrolling the loop means faster execution with bigger executable size, that is usually the default. The limit where the compiler decides to unroll a loop is decided by a lot of factor, but probably compiling with size optimizations will produce a memcmp/loop.Phipps
memcmp is a lexicographical compare, no? Therefore, it is ill-suited to compare structs containing types like int and long, which can be compared word by word instead of byte by byte.Gullett
@RedAlert: While some compilers will interpret every memcmp as a call to a compare-memory method, better compilers regard it as an intrinsic, and will only call a general-purpose library method in cases where nothing better is available. Given int32_t i,j; float f,g;, a good compiler may turn if (memcmp(&i,&j,4)) into if (i!=j), and if (memcmp(&f,&g,4)) into if (*(int32_t*)&f != *(int32_t*)&g).Gatefold
T
17

Regarding the precondition of memcmp yielding the same result as member-wise comparisons with ==, while this precondition is often fulfilled in practice, it's somewhat brittle.

Changing compilers or compiler options can in theory break that precondition. Of more concern, code maintenance (and 80% of all programming work is maintenance, IIRC) can break it by adding or removing members, making the class polymorphic, adding custom == overloads, etc. And as mentioned in one of the comments, the precondition can hold for static variables while it doesn't hold for automatic variables, and then maintenance work that creates non-static objects can do Bad Things™.

And regarding the question of whether to use memcmp or member-wise == to implement an == operator for the class, first, this is a false dichotomy, for those are not the only options.

For example, it can be less work and more maintainable to use automatic generation of relational operator overloads, in terms of a compare function. The std::string::compare function is an example of such a function.

Secondly, the answer to what implementation to choose depends strongly on what you consider important, e.g.:

  • should one seek to maximize runtime efficiency, or

  • should one seek to create clearest code, or

  • should one seek the most terse, fastest to write code, or

  • should one seek to make the class most safe to use, or

  • something else, perhaps?

Generating relational operators.

You may have heard of CRTP, the Curiously Recurring Template Pattern. As I recall it was invented to deal with the requirement of generating relational operator overloads. I may possibly be conflating that with something else, though, but anyway:

template< class Derived >
struct Relops_from_compare
{
    friend
    auto operator!=( const Derived& a, const Derived& b )
        -> bool
    { return compare( a, b ) != 0; }

    friend
    auto operator<( const Derived& a, const Derived& b )
        -> bool
    { return compare( a, b ) < 0; }

    friend
    auto operator<=( const Derived& a, const Derived& b )
        -> bool
    { return compare( a, b ) <= 0; }

    friend
    auto operator==( const Derived& a, const Derived& b )
        -> bool
    { return compare( a, b ) == 0; }

    friend
    auto operator>=( const Derived& a, const Derived& b )
        -> bool
    { return compare( a, b ) >= 0; }

    friend
    auto operator>( const Derived& a, const Derived& b )
        -> bool
    { return compare( a, b ) > 0; }
};

Given the above support, we can investigate the options available for your question.

Implementation A: comparison by subtraction.

This is a class providing a full set of relational operators without using either memcmp or ==:

struct Vector
    : Relops_from_compare< Vector >
{
    int x, y, z;

    // This implementation assumes no overflow occurs.
    friend
    auto compare( const Vector& a, const Vector& b )
        -> int
    {
        if( const auto r = a.x - b.x ) { return r; }
        if( const auto r = a.y - b.y ) { return r; }
        return a.z - b.z;
    }

    Vector( const int _x, const int _y, const int _z )
        : x( _x ), y( _y ), z( _z )
    {}
};

Implementation B: comparison via memcmp.

This is the same class implemented using memcmp; I think you'll agree that this code scales better and is simpler:

struct Vector
    : Relops_from_compare< Vector >
{
    int x, y, z;

    // This implementation requires that there is no padding.
    // Also, it doesn't deal with negative numbers for < or >.
    friend
    auto compare( const Vector& a, const Vector& b )
        -> int
    {
        static_assert( sizeof( Vector ) == 3*sizeof( x ), "!" );
        return memcmp( &a, &b, sizeof( Vector ) );
    }

    Vector( const int _x, const int _y, const int _z )
        : x( _x ), y( _y ), z( _z )
    {}
};

Implementation C: comparison member by member.

This is an implementation using member-wise comparisons. It doesn't impose any special requirements or assumptions. But it's more source code.

struct Vector
    : Relops_from_compare< Vector >
{
    int x, y, z;

    friend
    auto compare( const Vector& a, const Vector& b )
        -> int
    {
        if( a.x < b.x ) { return -1; }
        if( a.x > b.x ) { return +1; }
        if( a.y < b.y ) { return -1; }
        if( a.y > b.y ) { return +1; }
        if( a.z < b.z ) { return -1; }
        if( a.z > b.z ) { return +1; }
        return 0;
    }

    Vector( const int _x, const int _y, const int _z )
        : x( _x ), y( _y ), z( _z )
    {}
};

Implementation D: compare in terms of relational operators.

This is an implementation sort of reversing the natural order of things, by implementing compare in terms of < and ==, which are provided directly and implemented in terms of std::tuple comparisons (using std::tie).

struct Vector
{
    int x, y, z;

    friend
    auto operator<( const Vector& a, const Vector& b )
        -> bool
    {
        using std::tie;
        return tie( a.x, a.y, a.z ) < tie( b.x, b.y, b.z );
    }

    friend
    auto operator==( const Vector& a, const Vector& b )
        -> bool
    {
        using std::tie;
        return tie( a.x, a.y, a.z ) == tie( b.x, b.y, b.z );
    }

    friend
    auto compare( const Vector& a, const Vector& b )
        -> int
    {
        return (a < b? -1 : a == b? 0 : +1);
    }

    Vector( const int _x, const int _y, const int _z )
        : x( _x ), y( _y ), z( _z )
    {}
};

As given, client code using e.g. > needs a using namespace std::rel_ops;.

Alternatives include adding all other operators to the above (much more code), or using a CRTP operator generation scheme that implements the other operators in terms of < and = (possibly inefficiently).

Implementation E: comparision by manual use of < and ==.

This implementation is the result not applying any abstraction, just banging away at the keyboard and writing directly what the machine should do:

struct Vector
{
    int x, y, z;

    friend
    auto operator<( const Vector& a, const Vector& b )
        -> bool
    {
        return (
            a.x < b.x ||
            a.x == b.x && (
                a.y < b.y ||
                a.y == b.y && (
                    a.z < b.z
                    )
                )
            );
    }

    friend
    auto operator==( const Vector& a, const Vector& b )
        -> bool
    {
        return
            a.x == b.x &&
            a.y == b.y &&
            a.z == b.z;
    }

    friend
    auto compare( const Vector& a, const Vector& b )
        -> int
    {
        return (a < b? -1 : a == b? 0 : +1);
    }

    Vector( const int _x, const int _y, const int _z )
        : x( _x ), y( _y ), z( _z )
    {}
};

What to choose.

Considering the list of possible aspects to value most, like safety, clarity, efficiency, shortness, evaluate each approach above.

Then choose the one that to you is clearly best, or one of the approaches that seem about equally best.

Guidance: For safety you would not want to choose approach A, subtraction, since it relies on an assumption about the values. Note that also option B, memcmp, is unsafe as an implementation for the general case, but can do well for just == and !=. For efficiency you should better MEASURE, with relevant compiler options and environment, and remember Donald Knuth's adage: “premature optimization is the root of all evil” (i.e. spending time on that may be counter-productive).

Twostep answered 4/3, 2015 at 17:48 Comment(3)
"80% of all programming work is maintenance, IIRC" If you remember correctly? Have you given up programming, or are you amnesic?!Rufina
@bcrist: Wikipedia (checking that now) cites a 1997 study for the 80% figure. So as it happens I did remember that correctly, from ~18 years ago. It probably still holds. Re the other more personal questions, no I haven't given up programming, and hopefully I'm not … what was that, again?Twostep
The relational operator generation thing is the Barton-Nackman trick. Boost.Operators provides an implementation.Draughtboard
D
12

If, as you say, you've chosen types such that the two solutions yield the same results (presumably, then, you have no indirect data and the alignment/padding is all the same), then clearly you can use whichever solution you like.

Things to consider:

  1. Performance: I doubt you'll see much if any difference, but measure it to be sure, if you care;
  2. Safety: Well you say the two solutions are the same for your T, but are they? Are they really? On all systems? Is your memcmp approach portable? Probably not;
  3. Clarity: If your preconditions ever change and you did not adequately comment-describe your memcmp usage, then your program is liable to break — you've therefore made it fragile;
  4. Consistency: Presumably you use == elsewhere; certainly you'll have to do it for every T that doesn't meet your preconditions; unless this is a deliberate optimising specialisation for T, you may consider sticking to a single approach throughout your program;
  5. Ease of use: Of course, it's pretty easy to miss out a member from chained ==, especially if your list of members ever grows.
Degeneracy answered 4/3, 2015 at 15:38 Comment(4)
I've been wishing C++ had a way to write syntax like operator ==() = pass_to_members or something like that.Erode
@sashoalm: Heh, yeah, that'd be nice. Parameter pack expansion sorts out the call to operator==, but you can't use it to grab a list of members. The feature just isn't there.Degeneracy
Even if the padding is the same, are you guaranteed that the content of the padding would be the same? In which case your memcmp would fail.Paramour
ease-of-use is not the same as being forgiving for mistakes which your explanation describes. Less places to change is ease-of-use because it makes implementing changes less cumbersome, however ensuring the correctness of implementation is a basic requirement, which ideally should be enforced by methodology.Heiskell
N
6

If two solutions are both correct, prefer the more readable one. I'd say that for a C++ programmer, == is more readable than memcmp. I would go so far as to use std::tie instead of chaining:

bool operator==(const vector &lhs, const vector &rhs)
{ return std::tie(lhs.x, lhs.y) == std::tie(rhs.x, rhs.y); }
Nathanialnathaniel answered 4/3, 2015 at 15:40 Comment(4)
I prefer the original approach because you can line it up / align it better in the code if there are lots of members. a.x == b.x \n && a.y == b.y \n && a.z == b.z reads better than std::tie(a.x, a.y, a.z) == std::tie(b.x, b.y, b.z) no matter where you put newlines in the latter example, I think.Degeneracy
std::tie(a.x, a.y, a.z) \n == std::tie(b.x, b.y, b.z)?Aleman
@R.MartinhoFernandes: 'Too wide' for more than a few members (whereas 'too high' for the other approach is not as much of a problem: we are used to vertical scrolling)Degeneracy
Better still, wrap the tie in a little function so you don't have to duplicate the member names or line anything up: return members(lhs) == members(rhs);Cosmotron
C
4

If any only if the structure is POD and if it is safely memcmp comparable (not even all numeric types are...) the result is the same and the question is about readability and performance.

Readability? This is a rather opinion based question I think but I'd prefer operator==.

Performance? operator== is a short-circuit operator. You have more control over your program here because you can reorder the comparison sequence.

Although a == b && c == d and c == d && a == b are equivalent in terms of algorithmic logic (result is the same) they aren't equivalent in terms of produced assembly, "background logic" and possibly performance.

You can influence your program if you can forsee some points.

In example:

  • If both statements are roughly equally likely to yield false, you'll want to have the cheaper statement first to skip the more complex comparison if possible.
  • If both statements are roughly equally complex and you know in advance that c == d is more likely to be false than a == b, you should compare c and d first.

It is possible to adjust the comparison sequence in a problem-dependant fashion using operator== whereas memcmp does not give you this kind of freedom.

PS: You would want to measure it but for a small struct with 3 members, MS VS 2013 produces slightly more complex assembly for the memcmp case. I'd expect the operator== solution to have a higher performace (if the impact would be measurable) in this case.

-/edith-

Note: Even POD struct members can have overloaded operator==.

Consider:

#include <iostream>
#include <iomanip>

struct A { int * p; };

bool operator== (A const &a, A const &b) { return *(a.p) == *(b.p); }

struct B { A m; };

bool operator== (B const &a, B const &b) { return a.m == b.m; }

int main()
{
  int a(1), b(1);
  B x, y;
  x.m.p = &a;
  y.m.p = &b;
  std::cout << std::boolalpha;
  std::cout << (memcmp(&x, &y, sizeof(B)) == 0) << "\n";
  std::cout << (x == y) << "\n";
  return 0;
}

Prints

false
true

Even if -in turn- all the members are fundamental types I would prefer operator== and leave it to the compiler to consider optimizing the comparison into whatever assembly it considers to be preferable.

Churchlike answered 4/3, 2015 at 15:45 Comment(11)
You broke the precondition, despite leading with "even if the precondition is true".Degeneracy
Re "Even if you have POD structs where your precondition is true, you should probably use operator== since the operator could be overloaded for the member type", this is not possible.Twostep
Ok, Correct. I misinterpreted the condition to claim that it is possible to use both comparisons (thus not having any undefined behaviour) instead of having the same results for both.Churchlike
@Cheersandhth.-Alf struct B in the example is trivial and standard layout and therefore POD since struct A is trivial and standard layout. The overloaded operator== could still compare the pointers instead of the values and yield the same result as memcmp. -> It is possible but the question is probably more "premature optimization"-like than I thought ;)Churchlike
@Cheersandhth.-Alf: Why not?Degeneracy
@LightnessRacesinOrbit: I meant, it's not possible for the case where the operator overloads yields different results than memcmp, because the precondition is precisely that comparing with = yields same result as memcmp.Twostep
@Cheersandhth.-Alf: Okay, same thing I said then. I thought you were saying something about POD structs. Never mind! :)Degeneracy
Use of memory-comparison operators on floating-point numbers will define an equivalence relation; manual use of the == operator won't. Except in those rare cases where one doesn't want an equivalence relation, I would view a decision whether to use memory-based equivalence rather than == to be one of semantics rather than performance. It's entirely possible that == might be faster than using memcmp or equivalent, but performing wrong computations quickly generally isn't helpful.Gatefold
@Gatefold First of all, the precondition doesn't hold for floating points because operator== is not an equivalence relation as you say. Furthermore, I don't think that using operator== being a "wrong computation" in contrast to memcmp is very relevant. (I'd expect operator== behaviour anyway.) The fact that operator== is not an equvalence relation because of NaN is only one thing amongst many other important points one has to keep in mind when dealing with floating points.Churchlike
@Pixelchemist: Equivalence relations are absolutely fundamental to keyed collections. Having positive and negative zero compare as equal may sometimes be problematical, but wouldn't pose any major difficulty for keyed collections, since each of those values would compare equal to itself, and neither would compare equal to anything to which they other wouldn't also compare equal. Having objects compare unequal to themselves, however, is fundamentally broken. Code which repeatedly adds an object to a collection, does some stuff, removes it, adds another object, does some stuff, etc...Gatefold
...may run in O(1) space if objects compare equal to themselves, but melt down completely (failing to run in any quantity of bounded space) if objects don't.Gatefold
T
1

You imposed a very strong condition that there is no padding (I assume neither between the members of the class, nor inside these members). I presume that you also intended to exclude any "hidden" housekeeping data from the class. In addition the question itself implies that we always compare objects of exactly the same type. Under such strong conditions there's probably no way to come up with a counterexample that would make the memcmp-based comparison to differ from == comparison.

Whether it makes it worth it to use memcmp for performance reasons... well, if you really have a good reason to aggressively optimize some critical piece of code and profiling shows that there's improvement after switching from == to memcmp, then definitely go ahead. But I wouldn't use it a a routine technique for writing comparison operators, even if your class satisfies the requirements.

Tiffanitiffanie answered 4/3, 2015 at 15:50 Comment(0)
P
1

== is better, because memcmp compares pure memory data(comparing that way can be wrong in many situations, such as std::string, array-imitiating classes or types that can be equal even if they aren't perfectly identical). Since inside your classes there may be such types, you should always use their own operators instead of comparing raw memory data.

== is also better because it's more readable than some weird-looking function.

Phil answered 4/3, 2015 at 15:56 Comment(2)
A precondition given in the question is that the main reason you give does not apply.Degeneracy
Conversely, there are times when == is wrong and memcmp would be correct, such as when deciding whether a collection already contains an instance of struct Point {double x,y;} which is equivalent to another.Gatefold
E
1

C++20

Since C++20, this debate has become somewhat pointless, since you can explicitly default comparison operators. See also cppreference: Default comparisons (C++20).

If you are able to implement operator== in terms of std::memcmp, that also means you can implement it using = default;:

bool operator==(vector, vector) = default;

If a call to memcmp is the fastest possible implementation, the compiler will choose it for you, otherwise it won't.

You don't have to think about it anymore. Other answers cover the pre-C++20 sentiment well.

Elman answered 20/12, 2023 at 15:31 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.