Why doesnt my cartesian product function work?
Asked Answered
F

1

7

Consider the following function, whose output is supposed to be the cartesian product of a sequence of iterables:

def cart(*iterables):
    out = ((e,) for e in iterables[0])
    for iterable in iterables[1:]:
        out = (e1 + (e2,) for e1 in out for e2 in iterable)
    return out

Works fine when generator comprehensions are replaced with list comprehensions. Also works when there are only 2 iterables. But when I try

print(list(cart([1, 2, 3], 'ab', [4, 5])))

I get

[(1, 4, 4), (1, 4, 5), (1, 5, 4), (1, 5, 5),
 (2, 4, 4), (2, 4, 5), (2, 5, 4), (2, 5, 5),
 (3, 4, 4), (3, 4, 5), (3, 5, 4), (3, 5, 5)]

Why this and not the cartesian product?

Footpad answered 13/7, 2017 at 10:20 Comment(2)
You could store intermediate results in memory (like the list approach that works) and not defer their evaluation with that gen. exp. whose values are repeatedly changing across iterations.Colemancolemanite
I know that this question is about implementing the algorithm for Cartesian product in Python, but just in case someone ends up here searching how to do a Cartesian product in Python, note that this is already implemented in itertools.product.Fraternize
K
8

You are creating generator expressions, that are not iterated over until the next iteration of the for iterable in iterables[1:]: loop. They are using closures, which are looked up at runtime.

Generator expressions are essentially small functions in this regard, they create their own scope, and any names from the parent scope need to be treated as closures for this to work. The 'function' is executed when you iterate, and only then the closure is needed and resolved to the current value of the variable referenced.

So you create one generator expression like this:

(e1 + (e2,) for e1 in out for e2 in iterable)

where iterable is a closure taken from the parent scope (your function locals). But the lookup is not done until the next iteration when you loop, at which point iterable is the next element in the sequence.

So for your input of [1, 2, 3], 'ab', [4, 5], you create a generator expression when iterable = 'ab' but by the time you actually iterate, the for loop has assigned a new value and is now iterable = [4, 5]. When you finally iterate over the final (chained) generator, only the very last assignment to iterable counts.

You are effectively creating a product over iterables[0], iterables[-1] * len(iterables) - 1; iterables[1] through to iterables[-2] are skipped over entirely, all replaced by iterables[-1].

You could use a generator function to avoid the closure issue, passing in iterable to be bound to a local:

def gen_step(out, iterable):
    for e1 in out:
        for e2 in iterable:
            yield e1 + (e2,)

def cart(*iterables):
    out = ((e,) for e in iterables[0])
    for iterable in iterables[1:]:
        out = gen_step(out, iterable)
    return out

You could do the same with a lambda returning the generator expression:

def cart(*iterables):
    out = ((e,) for e in iterables[0])
    for iterable in iterables[1:]:
        out = (lambda it=iterable: (e1 + (e2,) for e1 in out for e2 in it))()
    return out
Kingfisher answered 13/7, 2017 at 10:30 Comment(1)
The alternatives are still lazy. Nice.Colemancolemanite

© 2022 - 2024 — McMap. All rights reserved.