boost::adaptors::transformed followed by boost::adaptors::filtered calls function twice
I'm trying to chain a boost::adaptors::transformed (let's call it map) to a boost::adaptors::filtered (let's call it filter). The idea is to map a function fun that returns a "Maybe" (in my case, a std::pair<bool, T>) over a range and output only the results whose flag is true. My first implementation:

#define BOOST_RESULT_OF_USE_DECLTYPE // enable lambda arguments for Boost.Range
#include <boost/range/adaptor/filtered.hpp>
#include <boost/range/adaptor/transformed.hpp>
#include <iostream>
#include <utility>

struct OnlyEven
{
    typedef int argument_type;
    typedef std::pair<bool, int> result_type;
    result_type operator()(argument_type x) const
    {
        std::cout << "fun: " << x << std::endl;
        return std::make_pair(x % 2 == 0, x);
    }
} only_even;

int main(int argc, char* argv[])
{
    auto map = boost::adaptors::transformed;
    auto filter = boost::adaptors::filtered;
    int v[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    auto s = v | map(only_even) | filter([](std::pair<bool, int> x)->bool{ return x.first; });
    for (auto i : s) {}
    return 0;
}

When I run this, I get:

fun: 1
fun: 2
fun: 2
fun: 3
fun: 4
fun: 4
fun: 5
fun: 6
fun: 6
fun: 7
fun: 8
fun: 8
fun: 9
fun: 10
fun: 10

Every time the predicate is true, fun is called twice. Is this expected behavior? Am I doing something wrong, or is/was this a bug in Boost (I'm using 1.48)?

Edit: I tried this on the trunk version of Boost and it still happens.

Heydon answered 9/11, 2012 at 2:2 Comment(0)

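The first call happens inside your filter, during increment: to find the next element that satisfies the predicate, the filter has to dereference the transformed iterator, which invokes your function.

The second call happens in your range-based for loop, during dereference. The result of the first call is not cached.

For example, just advancing through the range without dereferencing: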

++++++++++boost::begin(s);

gives:

fun: 1
fun: 2
fun: 3
fun: 4
fun: 5
fun: 6
fun: 7
fun: 8
fun: 9
fun: 10

Check the implementation of filter_iterator (filtered is based on it). It doesn't do any caching.
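The two counts above (15 calls when elements are read, 10 when the range is only advanced) can be reproduced with a stripped-down model that uses only the standard library. This is a sketch of the behavior, not Boost's actual filter_iterator code; CountingOnlyEven and count_calls are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's functor: counts its own calls.
struct CountingOnlyEven {
    int* calls;
    std::pair<bool, int> operator()(int x) const {
        ++*calls;
        return std::make_pair(x % 2 == 0, x);
    }
};

// Models an uncached "transform, then filter" pipeline: every read of a
// transformed element re-invokes the function. Returns how many times the
// function ran. With read_elements == false the walk only advances
// (like ++it without *it).
int count_calls(const std::vector<int>& in, bool read_elements) {
    int calls = 0;
    CountingOnlyEven fun{&calls};
    std::size_t pos = 0;

    // Skip forward until the predicate accepts an element. Testing the
    // predicate needs the transformed value: call #1 for that element.
    auto skip = [&]() {
        while (pos < in.size() && !fun(in[pos]).first) ++pos;
    };

    skip();  // what begin() has to do
    while (pos < in.size()) {
        if (read_elements) {
            // The loop body reads the element: call #2, nothing was cached.
            std::pair<bool, int> value = fun(in[pos]);
            (void)value;
        }
        ++pos;   // increment ...
        skip();  // ... then find the next accepted element
    }
    return calls;
}
```

Calling count_calls with the values 1 to 10 returns 15 when read_elements is true and 10 when it is false, matching the two outputs shown above.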

Regarding "What if the transformation is expensive?":

filtered does not use any knowledge of where its input comes from.

Caching the result would require increasing the size of the filtered iterators. Just think about where the cached result would be stored: it would have to be copied into some member of the filtered iterator.

So, basically, there is a trade-off between the space needed for caching and the number of dereferences.
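One way to see the space side of that trade-off is to give up laziness entirely: run the function once per element, store every result, and filter the stored values. The sketch below uses only the standard library; map_once_then_filter is a hypothetical helper, not part of Boost, and it assumes the input fits comfortably in memory.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: applies `fun` exactly once per element and stores
// the results (the space cost), then filters the stored pairs without
// calling `fun` again.
template <typename Fun>
std::vector<int> map_once_then_filter(const std::vector<int>& in, Fun fun) {
    std::vector<std::pair<bool, int>> results;
    results.reserve(in.size());
    for (int x : in) {
        results.push_back(fun(x));  // the only place `fun` is invoked
    }

    std::vector<int> kept;
    for (const auto& r : results) {
        if (r.first) {
            kept.push_back(r.second);
        }
    }
    return kept;
}
```

This spends O(n) extra storage to guarantee one call per element. A per-iterator cache would spend one stored value per iterator instead and keep the pipeline lazy.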


EDIT: I have made a proof of concept of a cached_iterator, which caches the result of dereferencing and invalidates it on each advance. I have also made the corresponding range adaptor.

Here is how it is used:

auto s = v | transformed(only_even) | cached | reversed | cached | flt | flt | flt | flt | flt | flt;

Place cached in the chain wherever you want the result to be cached.

live demo

#include <boost/range/adaptor/filtered.hpp>
#include <boost/range/adaptor/transformed.hpp>
#include <boost/range/adaptor/reversed.hpp>
#include <boost/range/adaptor/map.hpp>
#include <boost/range/algorithm.hpp>

#include <iostream>
#include <ostream>
#include <utility>   // std::pair, std::make_pair

// ____________________________________________________________________________________________ //

#include <boost/iterator/iterator_adaptor.hpp>
#include <boost/range/iterator.hpp>
#include <iterator>

namespace impl
{

template<typename Iterator>
class cached_iterator : public boost::iterator_adaptor<cached_iterator<Iterator>,Iterator>
{
    typedef boost::iterator_adaptor<cached_iterator,Iterator> super;
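    // Both members are mutable so that the const dereference() can fill the cache.
    mutable bool invalidated;   // true until the current position has been dereferenced
    mutable typename std::iterator_traits<Iterator>::value_type cached;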
public:
    cached_iterator() : invalidated(true) {}
    cached_iterator(const Iterator &x) : super(x), invalidated(true) {}

    typename std::iterator_traits<Iterator>::value_type dereference() const
    {
        if(invalidated)
        {
            cached = *(this->base());
            invalidated=false;
            return cached;
        }
        else
        {
            return cached;
        }
    }
    void increment()
    {
        invalidated=true;
        ++(this->base_reference());
    }
    void decrement()
    {
        invalidated=true;
        --(this->base_reference());
    }
    void advance(typename super::difference_type n)
    {
        invalidated=true;
        (this->base_reference())+=n;
    }
};

template<typename Iterator> cached_iterator<Iterator> make_cached_iterator(Iterator it)
{
    return cached_iterator<Iterator>(it);
}

template< class R >
struct cached_range : public boost::iterator_range<cached_iterator<typename boost::range_iterator<R>::type> >
{
private:
    typedef boost::iterator_range<cached_iterator<typename boost::range_iterator<R>::type> > base;
public:
    typedef R source_range_type;
    cached_range( R& r )
        : base( make_cached_iterator(boost::begin(r)), make_cached_iterator(boost::end(r)) )
    { }
};

template<typename InputRange>
inline cached_range<const InputRange> cache(const InputRange& rng)
{
    return cached_range<const InputRange>(rng);
}

template<typename InputRange>
inline cached_range<InputRange> cache(InputRange& rng)
{
    return cached_range<InputRange>(rng);
}

struct cache_forwarder{};

cache_forwarder cached;

template< class InputRange >
inline cached_range<const InputRange>
operator|( const InputRange& r, cache_forwarder )
{
    return cache(r);
}

template< class InputRange >
inline cached_range<InputRange>
operator|( InputRange& r, cache_forwarder )
{
    return cache(r);
}

} // namespace impl

// ____________________________________________________________________________________________ //


struct OnlyEven
{
    typedef int argument_type;
    typedef std::pair<bool, int> result_type;
    result_type operator()(argument_type x) const
    {
        std::cout << "fun: " << x << std::endl;
        return std::make_pair(x % 2 == 0, x);
    }
} only_even;

int main()
{
    using namespace impl;
    using namespace boost::adaptors;
    using namespace std;

    int v[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    auto flt =  filtered([](std::pair<bool, int> x)->bool{ return x.first; });

    auto s = v | transformed(only_even) | cached | reversed | cached | flt | flt | flt | flt | flt | flt;

    boost::copy(s | map_values, ostream_iterator<int>(cout,"\n") );
    return 0;
}

Output is:

fun: 10
10
fun: 9
fun: 8
8
fun: 7
fun: 6
6
fun: 5
fun: 4
4
fun: 3
fun: 2
2
fun: 1
President answered 9/11, 2012 at 2:21 Comment(2)
Yes, I just figured that out after looking through the code. Does that make sense for transformed, calling fun twice? Thinking about other languages (Python, Haskell, etc.), it doesn't make sense. What if the transformation is expensive? – Heydon
In that case, you may use the "cached" adaptor as shown in the answer. – President
