operator< comparing multiple fields
Asked Answered
A

6

24

I have the following operator< that is supposed to sort first by a value, then by another value:

    inline bool operator < (const obj& a, const obj& b) 
    {
        if(a.field1< b.field1)
            return true;
        else
            return a.field2 < b.field2;
    }

I have the feeling this is incorrect and that you can't do that without another third comparaison test on the members variables, but I can't find any example where this doesn't work. So whould this really sort as expected? thanks

edit : I would have coded it as :

    inline bool operator < (const obj& a, const obj& b) 
    {
        if(a.field1< b.field1)
            return true;
                    else if(a.field1> b.field1)
            return false;
        else
            return a.field2 < b.field2;
    }

are there any differences? I'm asking because I know mine is correct from experience but also longer than the first one

Anking answered 3/7, 2012 at 13:52 Comment(13)
Why should it not work? Maybe you should explain your feeling better and also what it is "to sort as expected".Tarn
Your original fails with if( obj(2,1) < obj(1,2) ).Devote
So is that even a strict weak ordering? I would be fine with it if I'm at least sure I get no runtime errorAnking
That's a fairly common idiom for me - check every member but the last for < then for >, only continuing to later checks in the == case. Then for the final field you don't care about == vs. > so just return the < result. Personally, I'd write each if statement on a single line, and not bother with the else as return exits the function anyway.Tarrance
@Rob - What am I missing? That function should be OK as a non-member as far as I'm aware. Whether the field comparisons work should only depend on whether the field types have operator< and operator> defined. Actually - correction to my previous comment - I'll normally check for (a.f1 < b.f1) then (b.f1 < a.f1). IIRC you get an automatic operator> when you define operator< from the standard library, but this is OK even if that fails for some reason.Tarrance
@Steve314 Your solution is actually most clearly written using the ternary operator: return a.field1 != b.field1 ? a.field1 < b.field1 : a.field2 < b.field2; (but with line breaks, which I can't put into a comment).Lintwhite
@James - I disagree even for two fields, and what about when there's three, or four or more? My idiom extends to the larger cases and stays just as readable. Nesting ternary operators gets cluttered and unreadable very quickly. Ternary operators are concise, but that doesn't always mean more readable. Testing != then < may seem more symmetric and may give more even performance (by slowing down the < case to match the >), but neither of those is a compelling issue for me.Tarrance
@Steve314 The ternary operator does make it clearer: ternary operators chain just like ifs, so there's no difference in readability there. The difference is that by using the ternary operator, the returns aren't in ifs, and since there's only one branch, it's immediately clear that there isn't a branch where the return has been forgotten.Lintwhite
@James - that missing return is visually immediately obvious in a sequence of one-line if statements anyway, especially with vertical alignment. I didn't realise the ternary operator could chain without parenthesis clutter or risking wierd precedence-confusion bugs, though, so you've got me there.Tarrance
@Steve314 The difference is, admittedly, very slight in this case, where the only contents of the if/else branches is the return. (But you still have to verify that every if has an else.) As for chaining ternary operators, this should be a standard C++ idiom, immediately recognizable by any competent practitioner. But it's not; in fact, I don't know of any textbook which presents it. (John Potter first showed it to me, and I'd been using C++ for over 10 years at the time.) Chained ternary operators, correctly formatted (which I can't do in a comment) are very readable.Lintwhite
@James - I don't have to verify that every if has an else for a very simple reason - my idiom doesn't use else at all. Remember - I prefer not to have the dependency on !=. I don't have any nesting of if statements for this, just a chain of them.Tarrance
@Steve314 Now that does lead to confusion. If anything follows an if, I have a right as a reader to assume that it will be executed.Lintwhite
@James - not in a C family language, you don't. It's not just return - there's also break, continue, throw - even a rare special case where goto is justified. Sure, if the if statement was non-trivial, that return becomes a hidden exit, and I'd agree with you. But when the entire body within a one-line if statement is either return true; or return false;, there's no excuse for failing to see that.Tarrance
H
47

I'd like to do it all by myself..

You should only compare the values of Obj::field2 if the values of Obj::field1 are equal.

The easy-to-understand way:

/* This will meet the requirements of Strict-Weak-Ordering */

if (a.field1 != b.field1) return a.field1 < b.field1;
else                      return a.field2 < b.field2;

The correct (recommended) way:

The "correct" way of implementing it uses only operator< to compare the fields, the below looks more complicated than it really is.

It will however yield the same result as the easy-to-understand example previously written.

return a.field1 < b.field1 || (
  !(b.field1 < a.field1) && a.field2 < b.field2
);

There must be a way of implementing operator< without causing a lot of headache?

C++11 (and later)

You can use std::tuple from the STL which already have operator< for multiple fields defined, such as in the below example.

#include <utility>
inline bool
operator< (Obj const& lhs, Obj const& rhs)
{
  return std::tie (lhs.field1, lhs.field2) < std::tie (rhs.field1, rhs.field2);
}

C++03

If your compiler doesn't have support for C++11 yet and you only need to compare two fields from each object you could use std::pair instead.

The reason for std::make_pair is the same as in the previous example using std::tie.

#include <utility>
inline bool
operator< (Obj const& lhs, Obj const& rhs)
{
  return std::make_pair (lhs.field1, lhs.field2)
       < std::make_pair (rhs.field1, rhs.field2);
}

using std::pair will require copies of the members to be created, which in some circumstances is undesirable.

Is this really recommended practise?

See the below question/answers for more information, but to sum it up; the c++11 approach doesn't cause that much overhead and it's very simple to implement.

Hakluyt answered 3/7, 2012 at 13:55 Comment(8)
Or return (a.field1 == b.field1) ? a.field2 < b.field2 : a.field1 < b.field1;.Chela
Thanks. Does this mean that the first version is an improper order for keys in a std::map? (basically not a strict weak ordering, would cause runtime errors) or is it just a different ordering than the one I intended?Anking
@lezebulon: It's not a SWO, see the example in my answer.Ahn
@Anking Your original version doesn't meet the requirements for an ordering function, so your code has undefined behavior. Anything could happen, and a lot of strange things actually will.Lintwhite
The "canonical" way of writing this condition is return a.field1 < b.field1 || (!(b.field1 < a.field1) && a.field2 < b.field2). Canonical mainly because it only uses <; in practice, I don't find it that clear, and would only use it in the context of a library template (where I don't want to impose any additional constraints on the types other than those required by the standard library).Lintwhite
That first sentence doesn't sound right. I think you meant '... should only compare field2 when the two field1 values are equal.' You appear to have the opposite of that.Bawdry
The one thing you have to be concerned about with tuples is the string fields because it results in two operators being called. By using the string::compare you only need to call this once to determine < , ==, >Unitarian
This one is really sneaky, as b and a has switched places !(b.field1 < a.field1). If == is available, I'd use that.Rumanian
B
9

As an "example where this doesn't work", think of what happens if a.field1 > b.field1 but a.field2 < b.field2, such as:

a = (10, 20)
b = ( 5, 30)

Clearly, a is the greater value here but, when running that through your code:

if (a.field1 < b.field1)         // 10 < 5? No, move to else.
    return true;
else
    return a.field2 < b.field2;  // 20 < 30? Yes, return true.

In that circumstance, you compare based solely on field2 which is not what you want. You only want to bring field2 into play where the field1 fields are equal, so what you're looking for is something like:

if (a.field1 < b.field1)
    return true;
if (b.field1 < a.field1)
    return false;
return a.field2 < b.field2;

Or, the slightly more succinct, if you also have an operator== (I mean for your field1/2 types which are probably primitives, not your a/b types which are not):

if (a.field1 == b.field1)
    return a.field2 < b.field2;
return a.field1 < b.field1;
Bawdry answered 3/7, 2012 at 13:56 Comment(2)
And of course the second line can be written as if b.field1 < a.field1: return false if you want to use operator< only.Carrigan
@Jens: I assumed that the field1/2 types were most likely primitives but you're correct, that's not necessarily the case. Hence I've updated the answer to suit.Bawdry
A
4

No. You need to also catch (a.field1 > b.field1).

This is not a strict weak ordering, because it would give (1,2) < (2,1), but also (2,1) < (1,2).

Ahn answered 3/7, 2012 at 13:54 Comment(0)
C
2

Here's a version that relies on the logical short-circuit rule to avoid explicit branching

template<typename T>
bool operator< (T const& a, T const& b)
{
        return (
                 ( a.field1 < b.field1 ) || (( a.field1 == b.field1 ) &&
                 ( a.field2 < b.field2 ))
        );
}

This assumes that your primitive type of field1 has an operator==. It becomes tedious to type this for more than 2 fields, but you could use std::lexicographical_compare if your class obj stores the fields inside an std::array<T, N> for some type T and size N

template<typename T, int N>
struct obj
{
    std::array<T, N> field;
};

bool operator< (obj const& a, T const& b)
{
        return std::lexicographical_compare(
            a.field.begin(), a.field.end(), 
            b.field.begin(), b.field.end()
        );
}

Note that there is a draft paper N3326 that discusses adding operators == and < automatically for class types.

Carouse answered 3/7, 2012 at 14:41 Comment(2)
Why are you avoiding explicit branching? I don't think it's any slower than logical short-circuts.Lanita
@MooingDuck Haven't actually tested this, and it might even differ per platform. Just wrote it as a curiosity.Carouse
U
1

You can use variadic templates in c++11 or later

template<typename T>
bool less_than( const T& a, const T& b )
{
    return a < b;
}

template<typename T, typename... Args>
bool less_than( const T& a, const T& b, Args... args )
(
    if ( a < b )
          return true;
    else if ( b < a )
          return false;
    else
          return less_than(  args...  );
)

Then you would call as

return less_than(a.x,b.x,
                 a.y,b.y,
                 a.z,b.z);

It supports any number of fields or types as long as type has < operator. You can mix types.

Unitarian answered 11/10, 2019 at 17:46 Comment(4)
This doesn't give a valid strict weak ordering. Also, there isn't really a reason to implement this yourself when you can just use std::tie(a.x,a.y,a.z) < std::tie(b.x,b.y,b.z).Pod
Can you give example of why doesn’t do strict weak ordering?Unitarian
Well, you edited it since my comment so it seems OK now. The previous version was functionally identical to a.x < b.x || a.y < b.y || ... so it would fail e.g. for a={2,1,..}, b={1,2,...}.Pod
Okay got it. One thing is in practice I don’t use this one but a slightly modified that uses a generic Compare that returns -1,0 or 1 to avoid having to do two operator calls for string fields. I think the tuple / tie method is simpler / better if it can also take advantage of a compare for strings so strings don’t have to be iterated twice on.Unitarian
A
0

My method described below involves some macros, but still useful in many cases. Maybe something like this can be also done with inline functions.

#define CMP_LT2(a, b) ((a) < (b) ? (a) : (b))
#define CMP_GT2(a, b) ((a) > (b) ? (a) : (b))
#define CMP_LTE2(a, b) ((a) <= (b) ? (a) : (b))
#define CMP_GTE2(a, b) ((a) >= (b) ? (a) : (b))
#define CMP_EQ2(a, b) ((a) == (b))
#define CMP_NEQ2(a, b) ((a) != (b))
#define CMP_LT3(a, b, c) (CMP_EQ2(a, b) ? (c) : CMP_LT2(a, b))
#define CMP_GT3(a, b, c) (CMP_EQ2(a, b) ? (c) : CMP_GT2(a, b))
#define CMP_LTE3(a, b, c) (CMP_EQ2(a, b) ? (c) : CMP_LT2(a, b))
#define CMP_GTE3(a, b, c) (CMP_EQ2(a, b) ? (c) : CMP_GT2(a, b))
#define CMP_EQ3(a, b, c) ((a) == (b) ? (c) : false)
#define CMP_NEQ3(a, b, c) ((a) != (b) ? true : (c))

Then assume you have:

struct Point3D {
    double x;
    double y;
    double z;
};

And then you write:

struct Point3D {
    double x;
    double y;
    double z;

    bool operator<(const Point3D& other) const noexcept
    {
        return CMP_LT3(z, other.z,
               CMP_LT3(y, other.y,
               CMP_LT2(x, other.x)));
    }
};
Apograph answered 17/3, 2019 at 19:20 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.