Comprehension for flattening a sequence of sequences? [duplicate]

If I have a sequence of sequences (maybe a list of tuples) I can use itertools.chain() to flatten it. But sometimes I feel like I would rather write it as a comprehension. I just can't figure out how to do it. Here's a very contrived case:

Let's say I want to swap the elements of every pair in a sequence. I use a string as a sequence here:

>>> from itertools import chain
>>> seq = '012345'
>>> swapped_pairs = zip(seq[1::2], seq[::2])
>>> swapped_pairs
[('1', '0'), ('3', '2'), ('5', '4')]
>>> "".join(chain(*swapped_pairs))
'103254'

I use zip on the even and odd slices of the sequence to swap the pairs. But I end up with a list of tuples that now need to be flattened. So I use chain(). Is there a way I could express it with a comprehension instead?

If you want to post your own solution to the underlying problem of swapping the elements of the pairs, go ahead; I'll up-vote anything that teaches me something new. But I will only accept an answer that addresses my actual question, even if that answer is "No, you can't."

Arronarrondissement answered 19/1, 2009 at 10:47 Comment(1)
related: Flattening a shallow list in Python – Emersed

With a comprehension? Well...

>>> seq = '012345'
>>> swapped_pairs = zip(seq[1::2], seq[::2])
>>> ''.join(item for pair in swapped_pairs for item in pair)
'103254'
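
As a standalone sketch of the same idea (plain script form rather than a REPL session; note that on Python 3 zip() returns a one-shot iterator instead of the list shown above, but the join still works):

seq = '012345'
swapped_pairs = zip(seq[1::2], seq[::2])

# The clause order in the comprehension mirrors the equivalent nested loops:
#     for pair in swapped_pairs:
#         for item in pair:
#             ...use item...
print(''.join(item for pair in swapped_pairs for item in pair))  # 103254
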
Hesta answered 19/1, 2009 at 10:58 Comment(0)

The quickest way I've found is to start with an empty list and extend it:

In [1]: a = [['abc', 'def'], ['ghi'], ['xzy']]

In [2]: result = []

In [3]: extend = result.extend

In [4]: for l in a:
   ...:     extend(l)
   ...: 

In [5]: result
Out[5]: ['abc', 'def', 'ghi', 'xzy']
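
For reuse, the same idiom can be wrapped in a small helper; flatten_extend is just an illustrative name, not something from the answer or the standard library:

def flatten_extend(nested):
    """Flatten one level of nesting by repeatedly calling list.extend."""
    result = []
    extend = result.extend  # bind the method once, outside the loop
    for sub in nested:
        extend(sub)
    return result

print(flatten_extend([['abc', 'def'], ['ghi'], ['xzy']]))
# ['abc', 'def', 'ghi', 'xzy']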

This is over twice as fast as the list comprehension from Alex Martelli's answer to: Making a flat list out of list of lists in Python

$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 3: 86.3 usec per loop

$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99'  'b = []' 'extend = b.extend' 'for sub in l:' '    extend(sub)'
10000 loops, best of 3: 36.6 usec per loop

I came up with this because I had a hunch that, behind the scenes, extend allocates the right amount of memory for the list and probably uses some low-level code to move the items in. I have no idea whether that's true, but who cares: it's faster.

By the way, it's only a constant-factor speedup (both versions scale linearly with the number of items):

$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]'  'b = []' 'extend = b.extend' 'for sub in l:' '    extend(sub)'
1000000 loops, best of 3: 0.844 usec per loop

$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]' '[item for sublist in l for item in sublist]'
1000000 loops, best of 3: 1.56 usec per loop

You can also use map(result.extend, a), but this is slower, as it builds its own list of Nones.
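
For completeness, here is a sketch of that map-based variant; on Python 3, map() is lazy, so it has to be forced (e.g. with list()), which also materialises the throwaway list of Nones mentioned above:

a = [['abc', 'def'], ['ghi'], ['xzy']]
result = []
list(map(result.extend, a))  # list() only forces the lazy map; its own value is [None, None, None]
print(result)                # ['abc', 'def', 'ghi', 'xzy']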

The extend approach also gives you some of the benefits of not using functional programming, i.e.:

  • you can extend an existing list instead of creating an empty one,
  • you can still understand the code at a glance, minutes, days or even months later.

By the way, it's probably best to avoid list comprehensions. Small ones aren't too bad, but in general list comprehensions don't actually save you much typing, are often harder to understand, and are very hard to change or refactor (ever seen a three-level list comprehension?). The Google coding guidelines advise against them except in simple cases. My opinion is that they are only useful in throw-away code, i.e. code where the author doesn't care about readability, or code that is known never to require future maintenance.

Compare these two ways of writing the same thing:

result = [item for sublist in l for item in sublist]

with this:

result = []
for sublist in l:
    for item in sublist:
        result.append(item)

YMMV, but the first one stopped me in my tracks and I had to think about it. In the second, the nesting is made obvious by the indentation.

Pasture answered 16/3, 2011 at 18:44 Comment(2)
(No offence to Alex, who is a Python super-hero.) – Pasture
It is a matter of familiarity; e.g., the list comprehension and the explicit nested loop look equally readable to me. – Emersed

You could use reduce to achieve your goal:

In [6]: import operator
In [7]: a = [(1, 2), (2,3), (4,5)]
In [8]: reduce(operator.add, a, ())
Out[8]: (1, 2, 2, 3, 4, 5)

This returns a tuple instead of a list, because the elements of your original list are tuples that get concatenated. But you can easily build a list from it, and the join method accepts tuples, too.
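
Applied to the question's string pairs, a quick sketch of the same reduce (reduce is a builtin on Python 2; on Python 3 it lives in functools, and the import works on both):

from functools import reduce
import operator

seq = '012345'
swapped_pairs = list(zip(seq[1::2], seq[::2]))
flat = reduce(operator.add, swapped_pairs, ())  # ('1', '0', '3', '2', '5', '4')
print(''.join(flat))                            # 103254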

A list comprehension is, by the way, not the right tool for this. Basically, a list comprehension builds a new list by describing what each element of that list should look like. Here you want to reduce a list of elements to a single value.

Mew answered 19/1, 2009 at 10:54 Comment(3)
-1: the overhead of creating a new tuple on each iteration would be too slow on a big list. Also, reduce(operator.add, X, Y) is unreadable; use sum(X, Y) instead. – Hesta
For me, sum() too strongly implies an arithmetic operation. So does operator.add(), but that's the function we have to use to represent X + Y, so I find it easier to read than sum(). And speed wasn't in the requirements. – Mew
@heikogerlach: Do you really think reduce(operator.add, X, Y) is more readable than sum(X, Y)? reduce is almost always unreadable; it was removed from the builtins in Python 3.0 for that very reason. A for loop is almost always more readable. – Hesta
>>> a = [(1, 2), (3, 4), (5, 6)]
>>> reduce(tuple.__add__, a)
(1, 2, 3, 4, 5, 6)

Or, to be agnostic about the type of inner sequences (as long as they are all the same):

>>> reduce(a[0].__class__.__add__, a)
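
A quick check of the type-agnostic form against a list of lists (sketch only; on Python 3 reduce must be imported from functools, while in the Python 2 of this answer it is a builtin):

from functools import reduce

a = [[1, 2], [3, 4], [5, 6]]
print(reduce(a[0].__class__.__add__, a))  # [1, 2, 3, 4, 5, 6]
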
Broddy answered 19/1, 2009 at 14:34 Comment(0)
