Lazy transform in C++
Asked Answered
M

5

11

I have the following Python snippet that I would like to reproduce using C++:

from itertools import count, imap

source = count(1)
pipe1 = imap(lambda x: 2 * x, source)
pipe2 = imap(lambda x: x + 1, pipe1)
sink = imap(lambda x: 3 * x, pipe2)
for i in sink:
    print i

I've heard of Boost Phoenix, but I couldn't find an example of a lazy transform behaving in the same way as Python's imap.

Edit: to clarify my question, the idea is not only to apply functions in sequence using a for, but rather to be able to use algorithms like std::transform on infinite generators. The way the functions are composed (in a more functional language like dialect) is also important, as the next step is function composition.

Update: thanks bradgonesurfing, David Brown, and Xeo for the amazing answers! I chose Xeo's because it's the most concise and it gets me right where I wanted to be, but David's was very important into getting the concepts through. Also, bradgonesurfing's tipped Boost::Range :).

Max answered 30/10, 2012 at 17:14 Comment(6)
I got this far and realised it wasn't going to happen easily: Python... reproduce in C++... itertools.Trencher
Nice; this can certainly be done in C++, though it won't be quite as short.Rachellrachelle
Small comment on your original code - it would be easier read, in my opinion, with generator expressions rather than imap, since you're only taking values from one iterator at a time. pipe1 = (2*x for x in source).Predicable
Moreover, your code would be better (simpler, faster, etc) as a single expression, rather than lots of nested iterators. sink = (3 * (2*x + 1) for x in count(1))Predicable
Yes, generator expressions would be simpler to read, @poorsod (and more 'Pythonic'). This is just a simple example to illustrate what I want to accomplish in C++. Please note that the resulting 'pipeline' would be (way) more complex :).Max
Yes, I know it won't be as short, @KerrekSB. I already accomplished writing an infinite number generator (that can be generalized for any infinite generator) - now I need to be able to use it.Max
C
14

Employing Boost.Range:

int main(){
  auto map = boost::adaptors::transformed; // shorten the name
  auto sink = generate(1) | map([](int x){ return 2*x; })
                          | map([](int x){ return x+1; })
                          | map([](int x){ return 3*x; });
  for(auto i : sink)
    std::cout << i << "\n";
}

Live example including the generate function.

Cynara answered 30/10, 2012 at 18:0 Comment(7)
That's beautiful, Xeo! And short! What are you using to compile the example, though? I'm getting errors with gcc 4.6.1 and boost 1.51 (using -std=c++0x).Max
I replaced using generate_range = boost::iterator_range<generate_iterator>; with typedef boost::iterator_range<generate_iterator>; and it works :). What kind of dialect is the first one, Xeo?Max
@bruno: C++11 dialect. ;) GCC 4.6 didn't implement using aliases, it seems. And I used GCC 4.7.2, as can be seen in the example link.Cynara
This is very cool with one small failing. Its pretty hard to figure out the type of sink. Thus if you want to return it from a function you need to declare the type. I came across a polymorphic range/iterator library once that erases the complex type. There was a small performance hit.Atomism
@brad: If you're using normal functors, the trailing-return-type works wonders here. Sadly, lambdas aren't allowed inside of decltype, so it's not applicable here. What you can use, though, is boost::any_range.Cynara
Yes boost::any_range is exactly what i was talking about. It wasn't in the lib last time i used boost and had to roll my own. Good to know.Atomism
With Boost.Lambda it becomes even prettier:" | map( x * 2 ) | map( x + 1 ) ". liveworkspace.org/code/d66be4d7b00c78af809780414fb09f08Blighter
V
5

I think the most idiomatic way to do this in C++ is with iterators. Here is a basic iterator class that takes an iterator and applies a function to its result:

template<class Iterator, class Function>
class LazyIterMap
{
private:
    Iterator i;
    Function f;
public:
    LazyIterMap(Iterator i, Function f) : i(i), f(f) {}
    decltype(f(*i)) operator* () { return f(*i); }
    void operator++ () { ++i; }
};

template<class Iterator, class Function>
LazyIterMap<Iterator, Function> makeLazyIterMap(Iterator i, Function f)
{
    return LazyIterMap<Iterator, Function>(i, f);
}

This is just a basic example and is still incomplete as it has no way to check if you've reached the end of the iterable sequence.

Here's a recreation of your example python code (also defining a simple infinite counter class).

#include <iostream>

class Counter
{
public:
    Counter (int start) : value(start) {}
    int operator* () { return value; }
    void operator++ () { ++value; }
private:
    int value;
};

int main(int argc, char const *argv[])
{
    Counter source(0);
    auto pipe1 = makeLazyIterMap(source, [](int n) { return 2 * n; });
    auto pipe2 = makeLazyIterMap(pipe1, [](int n) { return n + 1; });
    auto sink = makeLazyIterMap(pipe2, [](int n) { return 3 * n; });
    for (int i = 0; i < 10; ++i, ++sink)
    {
        std::cout << *sink << std::endl;
    }
}

Apart from the class definitions (which are just reproducing what the python library functions do), the code is about as long as the python version.

Vmail answered 30/10, 2012 at 17:54 Comment(4)
Apart from your exapmle typo, this looks very interesting (and simple). I wonder if a (possibly Boost) library exists that allow me to do this (and possibly much more :)). I'm looking into Boost::Rangex right nowm, as bradgonesurfing suggested.Max
@brunonery boost's transform iterator is probably worth looking into.Vmail
I looked into transform iterators and they are very promising. However, the fact that you have to specify the input iterator type confuses me. How would you specify a transform_iterator that works on another transform_iterator? Like rewriting this example using transform_iterator :)Max
@brunonery look at the example on the boost page, it very similar to this. In my code you can almost just replace makeLazyIterMap with boost::make_transform_iterator. However my counter class isn't compatable with boosts standard iterators it seems. If you replace it with one of boosts ranges it should work.Vmail
A
2

I think the boost::rangex library is what you are looking for. It should work nicely with the new c++lambda syntax.

Atomism answered 30/10, 2012 at 17:28 Comment(7)
Can one have infinite ranges with boost::rangex?Max
boost.org/doc/libs/1_51_0/libs/range/doc/html/range/reference/…Atomism
Yes I think you can as they are based on iterators. A range just has a begin and an end iterator. No reason your end cannot be infinity for whatever type you are generatingAtomism
I'm drooling right now, bradgonesurfing. I'll take a deeper look at this. The syntax with pipes look amazing :)Max
The helper you want is boost.org/doc/libs/1_47_0/libs/iterator/doc/…Atomism
It specifically mentions lazy generatorsAtomism
And if you want to look at something real nice/scary pfultz2.github.com/LinqAtomism
T
1
int pipe1(int val) {
    return 2*val;
}

int pipe2(int val) {
    return val+1;
}

int sink(int val) {
    return val*3;
}

for(int i=0; i < SOME_MAX; ++i)
{
    cout << sink(pipe2(pipe1(i))) << endl;
}

I know, it's not quite what you were expecting, but it certainly evaluates at the time you want it to, although not with an iterator iterface. A very related article is this:

Component programming in D

Edit 6/Nov/12:

An alternative, still sticking to bare C++, is to use function pointers and construct your own piping for the above functions (vector of function pointers from SO q: How can I store function pointer in vector?):

typedef std::vector<int (*)(int)> funcVec;
int runPipe(funcVec funcs, int sinkVal) {
    int running = sinkVal;
    for(funcVec::iterator it = funcs.begin(); it != funcs.end(); ++it) {
        running = (*(*it))(running); // not sure of the braces and asterisks here
    }
    return running;
}

This is intended to run through all the functions in a vector of such and return the resulting value. Then you can:

funcVec funcs;
funcs.pushback(&pipe1);
funcs.pushback(&pipe2);
funcs.pushback(&sink);

for(int i=0; i < SOME_MAX; ++i)
{
    cout << runPipe(funcs, i) << endl;
}

Of course you could also construct a wrapper for that via a struct (I would use a closure if C++ did them...):

struct pipeWork {
     funcVec funcs;
     int run(int i);
};

int pipeWork::run(int i) {
    //... guts as runPipe, or keep it separate and call:
    return runPipe(funcs, i);
}

// later...
pipeWork kitchen;
kitchen.funcs = someFuncs;
int (*foo) = &kitchen.run();

cout << foo(5) << endl;

Or something like that. Caveat: No idea what this will do if the pointers are passed between threads.

Extra caveat: If you want to do this with varying function interfaces, you will end up having to have a load of void *(void *)(void *) functions so that they can take whatever and emit whatever, or lots of templating to fix the kind of pipe you have. I suppose ideally you'd construct different kinds of pipe for different interfaces between functions, so that a | b | c works even when they are passing different types between them. But I'm going to guess that that's largely what the Boost stuff is doing.

Trencher answered 30/10, 2012 at 17:19 Comment(5)
Specially because I cannot use an infinite generator there.Max
@Vlad: it is lazy in the sense that you call the functions when you are about to push the value out, rather than generating lists of intermediate values; the memory requirement is just one value at a time. However, it is not lazy in the sense that evaluation can be separated from the declaration.Trencher
That's interesting, PhilH - but D is not an option, unfortunately. Is there something as components for C++?Max
@brunonery: that's what I've been hoping for for some time.Trencher
@brunonery: the edit should permit infinite generators, passing the compound function around etc.Trencher
N
-4

Depending on the simplicity of the functions :

#define pipe1(x) 2*x
#define pipe2(x) pipe1(x)+1

#define sink(x) pipe2(x)*3

int j = 1
while( ++j > 0 )
{
    std::cout << sink(j) << std::endl;
}
Negativism answered 30/10, 2012 at 17:29 Comment(2)
This looks very close to PhilH answer above, but 1. you don't even call the pipeline anywhere and 2. ++j, when j=1, will always be >0.Max
1. OK typo error, you shoud read cout << sink(j) << end; 2. I know it is an infinite loop, it was just to recreate the count(1) which don't end neither. The solution is close to PhilH's one, but I think preprocessor directives closer to Python lambda expression than function IMHONegativism

© 2022 - 2024 — McMap. All rights reserved.