Sorting two arrays using C++23 zip view
Asked Answered
B

3

8

There is a rather typical task of sorting two arrays simultaneously, assuming that same indexed elements of the arrays form virtual pairs, which are sorted. Such questions appear at least 10 years ago: boost zip_iterator and std::sort

Now this task can be solved using range-v3 library:

#include <array>
#include <range/v3/all.hpp>

int main() {
   auto x = std::array{ 3,   2,   4,   1 };
   auto y = std::array{'A', 'B', 'C', 'D'};
   ranges::sort( ranges::views::zip( x, y ) );
   // here x = {1,2,3,4}, y={'D','B','A','C'}
}

Online demo: https://gcc.godbolt.org/z/WGo4vGsx5

In C++23 std::ranges::zip_view appears, and my expectation was that the same program can be written using the standard library only:

#include <array>
#include <ranges>
#include <algorithm>

int main() {
   auto x = std::array{ 3,   2,   4,   1 };
   auto y = std::array{'A', 'B', 'C', 'D'};
   std::ranges::sort( std::views::zip( x, y ) );
}

Unfortunately, it results in long compilation errors. E.g. in GCC:

...
/opt/compiler-explorer/gcc-trunk-20221127/include/c++/13.0.0/bits/ranges_algo.h:54:31: error: no matching function for call to '__invoke(std::ranges::less&, std::pair<int, char>&, std::pair<int&, char&>)'
   54 |           return std::__invoke(__comp,
      |                  ~~~~~~~~~~~~~^~~~~~~~
   55 |                                std::__invoke(__proj, std::forward<_TL>(__lhs)),
      |                                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   56 |                                std::__invoke(__proj, std::forward<_TR>(__rhs)));
      |                                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

Online demo: https://gcc.godbolt.org/z/47xrzM6ch

Is it just because the implementations are not mature enough yet, or zip view in C++23 will not help to sort two array?

Besiege answered 27/11, 2022 at 8:15 Comment(1)
With p2165r4 the zip_view shouldn't use a std::pair anymore (instead always a std::tuple). This seems to not be implemented yet. With std::pair I am not sure whether this was meant to work. A std::pair to references (iter_reference_t<I>) is not comparable with a std::pair to values (iter_value_t<I>) without conversion, but std::ranges::sort requires that iter_reference_t<I> and iter_value_t<I>& are comparable.Radom
P
9

At least the trunk version of libc++ (llvm) supports this:

std::ranges::sort(std::views::zip(x, y), [](auto&& a, auto&& b) {
    return std::tie(std::get<0>(a), std::get<1>(a)) < 
           std::tie(std::get<0>(b), std::get<1>(b));
});

Demo

If you use three ranges instead of two, it works without having to supply a user-defined comparator function:

auto x = std::array{ 3,   2,   4,   1 };
auto y = std::array{'A', 'B', 'C', 'D'};
auto z = std::array{"Z", "Y", "X", "W"};

std::ranges::sort(std::views::zip(x, y, z));

Demo

I assume the version that zips two ranges will not need a user-defined comparator function when fully implemented.

Panfish answered 27/11, 2022 at 9:2 Comment(2)
Sidenote, you can create a tuple from a pair directly, without tie: std::tuple{some_pair}Engle
@Engle Yes, that's a nice simplification.Panfish
E
2

Following with @Ted Lyngmo's demo, I have created my own:

std::ranges::sort( zipped_xy, []<typename T, typename U>(T a, U b) {
    std::cout << __PRETTY_FUNCTION__ << '\t' << std::get<0>(a) << ':' << std::get<0>(b) << '\n';
    return std::tuple{a} < std::tuple{b};
});

link

Running this, should give you some output like:

<lambda(T, U)> [with T = std::pair<int&, char&>; U = std::pair<int&, char&>]    2:3
<lambda(T, U)> [with T = std::pair<int&, char&>; U = std::pair<int&, char&>]    4:2
<lambda(T, U)> [with T = std::pair<int, char>; U = std::pair<int&, char&>]      4:3
<lambda(T, U)> [with T = std::pair<int&, char&>; U = std::pair<int&, char&>]    7:2
 ⋮

The problem here is that after the second iteration returning false, it will try to pass a pair<int&, char&> as a pair<int, char>, which cannot be compared with a pair<int&, char&>, which is why we needed to provide a custom comparator.


Why does it create copies of the underlying objects? I don't know*. The good news is that with P2165R4, the value type for a zip_view will always be tuple<Ts&...>. Since tuple<Ts...> and tuple<Ts&...> can be compared directly, the sort will also be done automatically.

Note*: more discussion about this can be found in the comment.

Engle answered 27/11, 2022 at 10:41 Comment(5)
"Why does it create copies of the underlying objects? I don't know.": The typical sort implementation will do insertion sort for small ranges. For this it makes sense to move the element that is to be inserted into a local variable. And the std::sortable concept requires that the value is comparable with the reference, so that this should be allowed.Radom
@Radom I understand that it might create copies of the underlying objects of the zip_view(the pair). What I meant is I don't know why it would also copy the underlying object of the underlying object of zip_view(the int and char).Engle
Because copying the iter_reference_t (the pair with the references) would not move any of the actual values that are being sorted. The insertion sort will shift the other elements in the range, overwriting the current value. So it needs to be stored somewhere.Radom
@Radom Hmm, didn't realize sort(zip_view) will actually sort the underlying range, thought it was only done on the view.Engle
The point of a "view" is that it doesn't own any elements but just refers to the elements of the original range(s).Radom
N
2

In addition to passing in a custom comparator (as in other answers), you can pass in a custom projection function to project pairs into tuples:

ranges::sort(views::zip(x, y), {}, [](auto p) { return std::tuple(p); });

which will cast pair<int&, char&> into tuple<int&, char&>, making ranges::less work properly.

Demo

However, since C++23 changed the value_type of zip_view' iterator to always tuple (before that, zip_view will especially use pair as value_type when the number of range is 2), your code is well-formed in C++23.

The only thing you have to do is wait for the compiler to implement P2165.

Namara answered 27/11, 2022 at 14:0 Comment(2)
Thanks. Even shorter workaround would be to repeat one of the input arrays e.g. : std::ranges::sort( std::views::zip( x, y, y ) );. It works in GCC, but not in Clang: gcc.godbolt.org/z/rWd86E5Kb Is it expected to work everywhere?Besiege
@Besiege I don't think this is valid code, you zipped the same sequence and sorted, I suspect this is undefined behavior. This should be due to different algorithm implementations.Levkas

© 2022 - 2024 — McMap. All rights reserved.