Finding minimum element based on a transformed value
Asked Answered
E

5

16

Here is the task came to me from a code review. I want to select a minimum value from a set, based on a special kind of compare predicate. Like this:

struct Complex { ... };

float calcReduction(Complex elem);

Complex findMinValueWithPredicates(const std::vector<Complex>& values)
{
  auto it = std::min_element(values.begin(), values.end(), 
                             [](const Complex& a, const Complex& b) { 
                               return calcReduction(a) < calcReduction(b); 
                             });

  if (it == values.end()) throw std::runtime_error("");

  return *it;
}

Here I find the minimum element based on a predicate. This predicate computes a reduction of both values to float and then compares those floats. Works fine, looks neat.

Can you see the problem? Yes, for a set of N elements calcReduction() is called 2N times, while it is enough to compute it only N times - once for each element.

One way to solve this problem is to write explicit computations:

Complex findMinValueExplicit(const std::vector<Complex>& values)
{
  float minReduction = std::numeric_limits<float>::max();
  Complex minValue;

  for (Complex value : values)
  {
    float reduction = calcReduction(value);
    if (reduction < minReduction)
    {
      minReduction = reduction;
      minValue = value;
    }
  }

  if (minReduction == std::numeric_limits<float>::max()) throw std::runtime_error("");

  return minValue;
}

It works fine and we only have N calls to calcReduction(). However, it looks too verbose and the intent is not such clear, as compared to explicit call of min_element. Because when you call min_element it is really easy to guess you are going to find a minimum element, you know.

The only idea I have for now is to create my own algorithm like min_element_with_reduction, accepting a range and a reduction function. Sounds reasonable, but I wonder whether there are any ready solutions.

Any ideas on how to solve this task with clear intent and some ready solutions? Boost is welcomed. C++17 and ranges are interesting to see.

Elonore answered 16/10, 2015 at 16:12 Comment(2)
In order to simplify the code, I'd start off with checking values.empty() first. Then, init with the first element and enter the loop over the remaining elements. No, this probably isn't as obvious as your current code, but if you optimize this code for maximum speed (you have profiled this, right?) some compromises are IMHO acceptable, especially if the resulting code is properly explained in comments.Emileemilee
@UlrichEckhardt I think you are right, the only reasonable thing to do is to define my own algorithm. Other solutions look totally... complicated.Elonore
L
7

You could use boost::range library.

auto reductionLambda = [](const Complex& a) { return calcReduction(a); };
auto it = boost::range::min_element(values | boost::adaptors::transformed( 
                             std::ref(reductionLambda));

Ranges themselves should be coming to the standard C++ with C++17 as well.

Edit

As we figured out in comments, this would also make the conversion twice.

So here's something fun:

#include <boost/iterator/iterator_adaptor.hpp>
#include <boost/assign.hpp>
#include <algorithm>
#include <iostream>
#include <vector>
#include <functional>


template <class Iterator, class UnaryFunction>
class memoizing_transform_iterator
  : public boost::iterator_adaptor<
        memoizing_transform_iterator<Iterator, UnaryFunction> // Derived
      , Iterator                                              // Base
      , std::decay_t<decltype(std::declval<UnaryFunction>()(std::declval<typename Iterator::value_type>()))> // Value
      , boost::forward_traversal_tag    // CategoryOrTraversal
    >
{
 public:
    memoizing_transform_iterator() {}

    explicit memoizing_transform_iterator(Iterator iter, UnaryFunction f)
      : memoizing_transform_iterator::iterator_adaptor_(iter), fun(f) {}

    static int total;
 private:
    friend class boost::iterator_core_access;
    void increment() { ++this->base_reference(); memoized = false; }

    using MemoType = std::decay_t<decltype(std::declval<UnaryFunction>()(std::declval<typename Iterator::value_type>()))>;      

    MemoType& dereference() const 
    {
        if (!memoized) {
            ++total;
            memoized = true;
            memo = fun(*this->base());
        }
        return memo;
    }

    UnaryFunction fun;
    mutable bool memoized = false;
    mutable MemoType memo;
};


template <class Iterator, class UnaryFunction>
auto make_memoizing_transform_iterator(Iterator i, UnaryFunction&& f)
{
    return memoizing_transform_iterator<Iterator, UnaryFunction>(i, f);
}



template<class I, class U>
int memoizing_transform_iterator<I, U>::total = 0;


// THIS IS COPIED FROM LIBSTDC++
template<typename _ForwardIterator>
   _ForwardIterator
     min_el(_ForwardIterator __first, _ForwardIterator __last)
     {
       if (__first == __last)
     return __first;
       _ForwardIterator __result = __first;
       while (++__first != __last)
     if (*__first < *__result)
       __result = __first;
       return __result;
     }


int main(int argc, const char* argv[])
{
    using namespace boost::assign;

    std::vector<int> input;
    input += 2,3,4,1,5,6,7,8,9,10;


    auto transformLambda = [](const int& a) { return a*2; };


    auto begin_it = make_memoizing_transform_iterator(input.begin(), std::ref(transformLambda));
    auto end_it = make_memoizing_transform_iterator(input.end(), std::ref(transformLambda));
    std::cout << *min_el(begin_it, end_it).base() << "\n";

    std::cout <<begin_it.total;

    return 0;
}

Basically I implemented an iterator that memoizes the result of calling the transformation functor. The weird part though is that at least in online compilers, the iterators are copied before their dereferenced values are compared (thus defeating the purpose of memoizing). However when I simply copied the implementation from libstdc++, it works as expected. Perhaps you could try it out on a real machine? The live example is here.

Small edit: I tested on VS2015 and it works as expected with std::min_element.

Libertinism answered 16/10, 2015 at 16:22 Comment(9)
Or boost::transform_iterator if you don't like ranges, and/or need to write it yourself (iterators being marginally easier to write than ranges, in that they are usually a precondition).Borghese
*it will return reduced value and there is no way to get underlying Complex value. Besides, I think there will be 2N computations here too, as in my first code snippet.Elonore
@Elonore You can access the original value by using base() on the returned iterator. However, you likely right about the recomputation. Hmmm.Libertinism
Ok, I see your basic idea is the same as from @Yakk. But you have the full implementation. Thank you for this. Though I think I won't use it in real code :) P.S. I believe you forgot std::forward in make_memoizing_transform_iterator() function.Elonore
"also, it's actually incorrect syntax - boost ranges can't take lambdas..." What? Of course they can. You may need to update Boost.Apomixis
@Apomixis Well, actually, it just has the same problems with lambda's deleted copy constructor which can be solved with reference_wrapper (tested with coliru, boost version there is 1.59). I edited my answer to reflect this. Thanks!Libertinism
All non-copyable functors will have this problem (and solution), it is not specific to non-copyable lambdas. Also, the reductionLambda you've shown is not non-copyable, it should work just fine without ref; compiler errors indicating it is indicate bugs in said compiler. ;-]Apomixis
@Apomixis Indeed, the compiler actually complains about deleted assignment operator, not copy constructor, due to use of optional (I assume in default_constructible_unary_fn_wrapper). Doesn't really work without ref :)Libertinism
@Libertinism How to check if it is valid or not?Demission
B
4

Here's a solution using (already works with the range-v3 library, the implementation by the author of the upcoming Ranges TS)

#include <range/v3/all.hpp>
#include <iostream>
#include <limits>

using namespace ranges::v3;

int main()
{
    auto const expensive = [](auto x) { static int n; std::cout << n++ << " "; return x; };
    auto const v = view::closed_iota(1,10) | view::transform(expensive); 

    auto const m1 = *min_element(v);
    std::cout << "\n" << m1 << "\n";

    auto const inf = std::numeric_limits<int>::max();    
    auto const min = [](auto x, auto y) { return std::min(x, y); };

    auto const m2 = accumulate(v, inf, min);
    std::cout << "\n" << m2 << "\n";    
}

Live On Coliru with output:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 
1
19 20 21 22 23 24 25 26 27 28 
1

As you can see, using min_element takes 2N comparisons, but using accumulate only N.

Blameful answered 2/1, 2016 at 20:18 Comment(3)
You only need N-1 comparisons to solve this problem.Borghese
Sadly, I cannot compile it in VS2015 with clang.Elonore
@Elonore VS2015 will receive Update2 soon, perhaps it will work then. I have no experience with getting clang to work under VS2015.Blameful
B
3

The only thing missing is the meta-iterator.

A meta-iterator takes an iterator, and creates an iterator that contains a copy of it. It passes all operations through to the contained iterator, except when dereferenced returns a copy of the contained iterator instead.

With any care, the code used for this also works to create an iterator over size_t or int or similar torsor-likes.

template<class It, class R>
struct reduced_t {
  It it;
  R r;
  friend bool operator<( reduced_t const& lhs, reduced_t const& rhs ) {
    return lhs.r < rhs.r;
  }
};
template<class It, class F>
reduced_t<It, std::result_of_t<F(typename std::iterator_traits<It>::reference)>>
reducer( It it, F&& f ) {
  return {it, std::forward<F>(f)(*it)};
}

template<class It, class F>
It reduce( It begin, It end, F&& f ) {
  if (begin==end)
    return begin;

  return std::accumulate(
    meta_iterator(std::next(begin)), meta_iterator(end),
    reducer(begin, f),
    [&](
      auto&& reduced, // reduced_t<blah...> in C++11
      It i
    ) {
      auto r2 = reducer( i, f );
      return (std::min)(reduced, r2);
    }
  ).it;
};

f(*it) is called exactly once per iterator.

I wouldn't call this ... obvious. The trick is that we use accumulate and meta-iterators to implement min_element, then we can have accumulate operate on transformed elements (which gets called once, and returned).

You could rewrite it in stack-based programming style using primitives, but there are lots of primitives to write. Maybe post ranges-v3.


At this point, I'm imagining having some high-powered compositional programming library. If I did, we could do the following:

reducer( X, f ) can be rewritten graph( deref |then| f )(X) using order_by( get_n_t<1> ) for ordering.

The accumulate call could read accumulate( skip_first(range), g(begin(range)), get_least( order_by( get_n_t<1> ) ) ).

Not sure if that is any clearer.

Borghese answered 16/10, 2015 at 18:37 Comment(2)
Not sure I will use it, but the approach with iterator storing results of reduction looks natural. Thanks.Elonore
why wait for the high-powered compositional library, ranges-v3 already works as you can see in my answer :)Blameful
L
2

If you take a minElem as a lambda parameter you could use min_element

Complex findMinValueWithPredicates(const std::vector<Complex>& values)
{
  float minElem = std::numeric_limits<float>::max();
  auto it = std::min_element(values.begin(), values.end(),
                             [&minElem](const Complex& a, const Complex& b) {
                               float tmp = calcReduction(a);
                               if (tmp < minElem) {
                                  minElem = tmp;
                                  return true;
                               }
                               return false;
                             });

  if (it == values.end()) throw std::runtime_error("");

  return *it;
}

Edit: Why does this work when bis not used? 25.4.7.21 min_element

21 Returns: The first iterator i in the range [first,last) such that for every iterator j in the range [first,last) the following corresponding conditions hold: !(*j < *i) or comp(*j, *i) == false. Returns last if first == last.

because b should have been named smallestYet (code from cplusplus.com)

template <class ForwardIterator>
  ForwardIterator min_element ( ForwardIterator first, ForwardIterator last )
{
  if (first==last) return last;
  ForwardIterator smallest = first;

  while (++first!=last)
    if (*first<*smallest)    // or: if (comp(*first,*smallest)) for version (2)
      smallest=first;
  return smallest;
}

Which lead me to a new favourite quote:

"There are only 10 hard problems in Computer Science: cache invalidation, naming things and off-by-one errors."

  • one commented on that we might be off-by-one as we don't use b.
  • I worried that the minElem cached might not be correct.
  • And I realized that the name b should have been more meaningful as it is "dereferenced iterator to smallest element yet" or smallestYet.
  • Finally that not all understand binary when its not written with a ´b´ at the end.
Lodi answered 16/10, 2015 at 16:53 Comment(8)
This is effectively my second code snipped, wrapped in min_element. I don't think it is more readable than a range-based for.Elonore
Consider values = { std::numeric_limits<float>::max() } is there any functional difference?Lodi
Agree, your implementation will handle this case, while my won't. +1 for that.Elonore
And it is confusing to ignore b.Dovetail
Yes it is, but I could find no other std::algorithm that returned an iterator to a specific element as result of single parameter operation while running over all elements.Lodi
Does this implementation check the last item of values? In other words, does it work when values has two items only?Duncandunce
Yes surprisingly, tested it with 1, 2 and 4 items.Lodi
@ChronoTrigger, After looking at the standard and code its not unsurprising, so I've added a bit more to the answer.Lodi
A
2

Here is another option, but it is still effectively your second solution. To be honest it doesn't look clear, but someone might like it. (I use std::pair<float, Complex> to store reduction result and the value that was reduced.)

std::pair<float, Complex> result{std::numeric_limits<float>::max(), {}};
auto output_function = [&result](std::pair<float, Complex> candidate) {
    if (candidate.first < result.first)
        result = candidate;
};
std::transform(values.begin(), values.end(), 
               boost::make_function_output_iterator(output_function),
               [](Complex x) { return std::make_pair(calcReduction(x), x); });

P.S. If your calcReduction costs a lot, have you considered caching results in Complex objects? It will lead to a slightly more complicated implementation, but you'll be able to use plain std::min_element which makes your intentions clear.

Aubert answered 18/10, 2015 at 1:31 Comment(1)
Looks nice, I like this approach. Actually, the guy who originally implemented this code, also used pair, but was storing all of them in a vector, and then applying min_element.Elonore

© 2022 - 2024 — McMap. All rights reserved.