So to illustrate the points I tried to make in the comments to the other answer, let's write something like your transform:
A Direct Implementation
I skipped the notion of iterators and the standard library algorithms, since those are encumbered with the whole "iterator value type must be fixed" requirement and other burdens.
Instead, let's do it "functionally".
#include <algorithm>   // std::min
#include <type_traits> // std::decay_t
#include <utility>     // std::forward, std::index_sequence
#include <boost/pfr.hpp>
namespace pfr = boost::pfr;

template <typename Op, typename... T>
void transform(Op f, T&&... operands) {
    auto apply = [&]<int N>() {
        f(pfr::get<N>(std::forward<T>(operands))...);
        return 1;
    };

    constexpr auto size = std::min({pfr::tuple_size<std::decay_t<T>>::value...});
    // optionally assert that sizes match:
    //static_assert(size == std::max({pfr::tuple_size<std::decay_t<T>>::value...}));

    [=]<auto... N>(std::index_sequence<N...>) {
        return (apply.template operator()<N>() + ...);
    }(std::make_index_sequence<size>{});
}
I already generalized a bit by not fixing the arity. It's more like an n-ary zip or visitor now. To get the transform you wanted, you'd pass it an operation like
auto binary = [](auto const& a, auto& b) {
    b = a;
};
Let's demo this, highlighting mixed member types, asymmetric type combinations, as well as mixed-length structs:
struct S1 { int a; double b; long c; float d; };
struct S2 { double a; double b; double c; double d; };
struct S3 { double a; double b; };
Test cases:
#include <cstdio> // puts

int main() {
    auto n_ary = [](auto&... fields) {
        puts(__PRETTY_FUNCTION__); // GCC/Clang extension
        return (... = fields);
    };

    S1 a;
    S2 b;
    S3 c;

    // all directions
    transform(binary, a, b);
    transform(binary, b, a);
    // mixed sizes
    transform(binary, b, c);
    transform(binary, c, a);
    // why settle for binary?
    transform(n_ary, a, b);
    transform(n_ary, a, b, c);
    transform(n_ary, c, b, a);
}
See it Live On Compiler Explorer
Already, the disassembly suggests everything is getting inlined and optimized away. Literally: only the puts calls remain:
main:
sub rsp, 8
mov edi, OFFSET FLAT:.LC0
call puts
mov edi, OFFSET FLAT:.LC1
call puts
mov edi, OFFSET FLAT:.LC2
call puts
...
...
xor eax, eax
add rsp, 8
ret
giving the output
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = int; auto:13 = double]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = double]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = long int; auto:13 = double]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = float; auto:13 = double]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = int]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = double]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = long int]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = float]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = double]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = double]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = int]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = double]
main()::<lambda(auto:14& ...)> [with auto:14 = {int, double}]
main()::<lambda(auto:14& ...)> [with auto:14 = {double, double}]
main()::<lambda(auto:14& ...)> [with auto:14 = {long int, double}]
main()::<lambda(auto:14& ...)> [with auto:14 = {float, double}]
main()::<lambda(auto:14& ...)> [with auto:14 = {int, double, double}]
main()::<lambda(auto:14& ...)> [with auto:14 = {double, double, double}]
main()::<lambda(auto:14& ...)> [with auto:14 = {double, double, int}]
main()::<lambda(auto:14& ...)> [with auto:14 = {double, double, double}]
Proof Of Work
Let's do some "useful" calculations that we can check. Also, let's rename the transform function to nway_visit to reflect its more generic orientation:
auto binary = [](auto& a, auto& b) { return a *= b; };
auto n_ary = [](auto&... fields) { return (... *= fields); };
So both operations do a right-fold multiply-assigning. Given some distinctly chosen initializers
S1 a {1,2,3,4};
S2 b {2,3,4,5};
S3 c {3,4};
we want to be able to see the data flow through the data structures. So, let's optionally do some debug tracing of that:
// assumes a constexpr bool output_enabled flag; needs <iostream>, <iomanip>
// and <boost/pfr/io.hpp> for pfr::io
#define DEMO(expr)                                                        \
    void(expr);                                                           \
    if constexpr (output_enabled) {                                       \
        std::cout << "After " << std::left << std::setw(26) << #expr;     \
        std::cout << " a:" << pfr::io(a) << "\tb:" << pfr::io(b)          \
                  << "\tc:" << pfr::io(c) << "\n";                        \
    }
DEMO("initialization");
// all directions
DEMO(nway_visit(binary, a, b));
DEMO(nway_visit(binary, b, a));
// mixed sizes
DEMO(nway_visit(binary, b, c));
DEMO(nway_visit(binary, c, a));
// why settle for binary?
DEMO(nway_visit(n_ary, a, b));
DEMO(nway_visit(n_ary, a, b, c));
DEMO(nway_visit(n_ary, c, b, a));
As the cherry-on-top let's be absolutely sure that (with output disabled) the compiler cannot optimize the whole program away because there are no observable effects:
return long(c.a + c.b) % 37; // prevent whole program optimization...
The demo is Live On Compiler Explorer, once with output enabled, and once with output disabled, showing the disassembly:
main:
mov eax, 13
ret
WOW
Holy smokes. That's optimization. The whole program is statically evaluated and just returns the exit code 13. Let's see whether that's the correct exit code:
Output Enabled:
After "initialization" a:{1, 2, 3, 4} b:{2, 3, 4, 5} c:{3, 4}
After nway_visit(binary, a, b) a:{2, 6, 12, 20} b:{2, 3, 4, 5} c:{3, 4}
After nway_visit(binary, b, a) a:{2, 6, 12, 20} b:{4, 18, 48, 100} c:{3, 4}
After nway_visit(binary, b, c) a:{2, 6, 12, 20} b:{12, 72, 48, 100} c:{3, 4}
After nway_visit(binary, c, a) a:{2, 6, 12, 20} b:{12, 72, 48, 100} c:{6, 24}
After nway_visit(n_ary, a, b) a:{24, 432, 576, 2000} b:{12, 72, 48, 100} c:{6, 24}
After nway_visit(n_ary, a, b, c) a:{1728, 746496, 576, 2000} b:{12, 72, 48, 100} c:{6, 24}
After nway_visit(n_ary, c, b, a) a:{1728, 746496, 576, 2000} b:{12, 72, 48, 100} c:{124416, 1289945088}
So, the return value should be (124416 + 1289945088) modulo 37, which a pocket calculator confirms is 13.
From Here: Parallel tasks etc.
Your original motivation included getting the parallel execution options from the standard library for free. As you know, I'm skeptical of the usefulness of that.
However, little stops you from getting this behaviour from this algorithm:
boost::asio::thread_pool ctx; // or, e.g., system_executor

auto run_task = [&](auto&... fields) {
    boost::asio::post(ctx, [=] { long_running_task(fields...); });
};
Hope this is good inspiration. And thanks for making me look at PFR. It's pretty sweet.