Three-way comparison of CRTP-ed std::vectors
Asked Answered
C

2

7

In this article I've stumbled upon this obscure code (which is presented casually, like this is a totally normal C++ code):

struct Tree : std::vector<Tree> {};

Two trees are then created and compared (see the demo):

Tree tree(size_t h)
{
  Tree s;
  if (h > 0)
    s.insert(s.end(), 2, tree(h - 1));
  return s;
}

int main()
{
  size_t h = 15;
  Tree s1 = tree(h);
  Tree s2 = tree(h);
  clock_t t = clock();
  s1 < s2;
  double d = clock() - t;
  d /= CLOCKS_PER_SEC;
  std::cout << d << std::endl; // 4.1s
}

After some thinking, I've realized this comparison seems legal and will always yield false.

The idea of the article is that since C++17 implements comparisons of vectors with std::lexicographical_compare, which compares corresponding elements a and b in two sequences as both a<b and b<a (see "Possible implementation" section on cpppreference), the performance for deep structures like the Tree above will degrade quadratically on the depth. And indeed it does.

The article also says that three-way comparison does not have such a problem. But wait, that's exactly what emerged in C++20 with operator<=>. But here is the caveat.

This code does not compile in C++20:

/opt/compiler-explorer/gcc-snapshot/lib/gcc/x86_64-linux-gnu/11.0.1/../../../../include/c++/11.0.1/compare:914:35: fatal error: recursive template instantiation exceeded maximum depth of 1024
        = decltype(__detail::__synth3way(std::declval<_Tp&>(),

Here is my question. Is this compile error a bug in the compiler, or is this code actually ill-formed? If the code is ill-formed, is it so only for C++20 or for both C++17 and C++20? (The latter would imply that this code compiles under C++17 only by chance).

Cadmann answered 22/3, 2021 at 10:0 Comment(3)
Looks like compare_three_way in GCC 11 is broken, it can't even compile the example.Chapland
@Chapland No, that's correct. compare_three_way just has stricter requirements than simply checking for <=>: three_way_comparable_with also requires equality_comparable.Possessive
It’s exponential, not merely quadratic.Solorzano
P
2

The issue here is constraint recursion.

In C++17, vector<T>'s operator< had a precondition that < was defined for T, but libstdc++ didn't check this. Their implementation was basically:

template <typename T>
inline bool operator<(vector<T> const& a, vector<T> const& b) {
    return lexicographical_compare(a.begin(), a.end(), b.begin(), b.end());
}

This... just works.

But in C++20, vector<T> instead has operator<=>, which has a constraint:

template <typename T>
    // this checks either <=> or <
    requires synth_3way_comparable<T>
auto operator<=>(vector<T> const& a, vector<T> const& b) { ... }

The problem here is that: in order to figure out of Tree is three-way comparable, we have to figure out if we can call <=> on two Trees, which finds this operator<=>(vector<Tree>, vector<Tree>) candidate, which is only defined if Tree is three-way-comparable, so we have to figure out if we can call <=> on two Trees, which finds ...

So that... doesn't work, and can't work.

Unfortunately it's not even a question of: just adding == and <=> to Tree, since vector's <=> will still be a candidate and just asking the question of whether or not it is valid explodes.

As a result, this isn't a viable strategy for implementing comparisons in C++20. You won't be able to inherit from vector like this. You'll have to just add it as a member.

Possessive answered 22/3, 2021 at 14:1 Comment(9)
Similar to the other answer, this explains why the current implementation of the standard library fails, but strictly speaking, this does not imply the program is ill-formed. It is not clear to me why std::three_way_comparable_with cannot work on Tree in some other stdlib implementation.Cadmann
Just curious, why operator== on two vector<Tree> doesn't need such constraints as std::equality_comparable for the Tree. It seems standard treat operator==/std::ranges::equal slightly loose in C++20.Overshadow
@Cadmann Constraint recursion is ill-formed.Possessive
@康桓瑋 equality_comparable also requires != so that might break existing code.Possessive
Can you add a defaulted operator<=> to short-circuit the constraint? (Maybe default it outside the class? I heard someone allowed that…)Solorzano
@DavisHerring Add a default or member operator<=> for Tree is not work, but we can add a global operator<=> to make it compile, but this is pointless since s1 < s2 just call the global one once.Overshadow
@DavisHerring Problem is vector's is a non-member, so it'd always be a candidate, and attempting to instantiate that candidate blows up.Possessive
Constraint recursion is ill-formed, but is operator<=> for vectors required to have a constraint? I don't see one on in the standard eel.is/c++draft/vector.synCadmann
@Cadmann You have to look up what synth-three-way-result means.Possessive
P
1

This code is well-formed in C++17 and ill-formed in C++20.

Consider the following two 2-height tree:

struct Tree : std::vector<Tree> {};

int main() {
  Tree s1 = tree(1);
  Tree s2 = tree(1);
  s1 < s2;
}

In C++17, we just call the vector's operator<:

bool operator<(const vector<Tree>& x, const vector<Tree>& y) { 
  return std::lexicographical_compare(x.begin(), x.end(), y.begin(), y.end()); 
}

which will invoke the std::lexicographical_compare:

template <class InputIt1, class InputIt2>
bool lexicographical_compare(InputIt1 first1, InputIt1 last1,
                             InputIt2 first2, InputIt2 last2) {
  for (; (first1 != last1) && (first2 != last2); ++first1, (void)++first2) {
    if (*first1 < *first2) return true;
    if (*first2 < *first1) return false;
  }
  return (first1 == last1) && (first2 != last2);
}

In first comparison *first1 < *first2, *first1 is the left node of s1, and *first2 is the left node of s2, and their type is Tree, so we back to compare two Trees. At this time, we don't have the operator< for the Tree, so we back to call their base's operator< just like we first do, which is:

bool operator<(const vector<Tree>& x, const vector<Tree>& y) { 
  return std::lexicographical_compare(x.begin(), x.end(), y.begin(), y.end()); 
}

But this time, the base vector of the two nodes is actually empty, so in the deeper lexicographical_compare function, we just return false. And the second comparison *first2 < *first1 following the same procedure.

Until now, we complete the left node comparison, then we can ++first1, (void)++first2 to continue the right node comparison.

In C++20, the s1 < s2 rewrite as:

(s1 <=> s2) < 0

so we call the vector's operator<=>:

auto operator<=>(const vector<Tree>& x, const vector<Tree>& y) {
  return std::lexicographical_compare_three_way(
    x.begin(), x.end(), y.begin(), y.end(), std::compare_three_way()
  );
}

the compare_three_way() contrains the value_type of vector which is Tree must satisfied the std::three_way_comparable_with concept, but we don't define any operator<=> for the Tree, so we get Constraints not satisfied.

Postnatal answered 22/3, 2021 at 11:54 Comment(4)
Hmm, this explains why the current implementation of the standard library fails, but strictly speaking, this does not imply the program is ill-formed. It is not clear to me why std::three_way_comparable_with cannot work on Tree in some other stdlib implementation.Cadmann
Can you give an example of what you are not clear about?Overshadow
I don't like the phrase "we don't define any operator<=> for the Tree, so we get Constraints not satisfied". std::vector contains operator<=> member, Tree inherits this member and thus should satisfy std::three_way_comparable_with.Cadmann
Tree inherits std::vector<Tree> not std::vector, and the operator<=> of the std::vector<Tree> depend on the operator<=> of the Tree, if we don't have operator<=> for the Tree, we can not three_way_comparable_with std::vector<Tree> since it depends.Overshadow

© 2022 - 2024 — McMap. All rights reserved.