Is there a functional way to do this?

def flattenList(toFlatten):
 final=[]
 for el in toFlatten:
  if isinstance(el, list):
   final.extend(flattenList(el))
  else:
   final.append(el)
 return final

When I don't know how deeply the lists will nest, this is the only way I can think to do this.
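
For example, I want arbitrarily deep nesting to come out as one flat list:

    >>> flattenList([1, 2, [3, [4, 5]], 6])
    [1, 2, 3, 4, 5, 6]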

Idler answered 18/3, 2010 at 16:9 Comment(2)
Try using four spaces instead of one for your indentation. It is more readable and conforms to the style guide most Python code follows, PEP 8 (python.org/dev/peps/pep-0008).Ordinand
Related questions: "Flatten (an irregular) list of lists in Python" #2158895 (links to other questions and good answers) "Flattening a shallow list in python" #406621 (benchmarks)Rochdale

Here's another option (though there may be something cleaner than type-checking, like testing if something is iterable and hence not an "atom"):

from functools import reduce  # reduce is not a builtin in Python 3

def flatten(lst):
    if not isinstance(lst, list):
        return [lst]  # an atom: wrap it so the caller can concatenate
    else:
        # flatten each element, then concatenate the flattened pieces
        return reduce(lambda x, y: x + y, [flatten(x) for x in lst], [])

It's based on a Scheme-like approach.
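
(If you did want the iterable test instead of type-checking, one sketch, assuming Python 3's collections.abc, is below; note that strings are iterable too, so they need special-casing, as the comments point out:)

    from collections.abc import Iterable

    def is_atom(x):
        # Treat strings as atoms even though they are iterable;
        # otherwise a recursive flatten would recurse into them forever.
        return isinstance(x, str) or not isinstance(x, Iterable)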

Coccyx answered 18/3, 2010 at 17:7 Comment(17)
try: return reduce(...); except TypeError: return [lst]Heald
@mitmatt, 1) The function lambda x, y: x + y is called operator.add. 2) Your algorithm is so unnecessarily O(n * m)ish, when the linear algorithm is more obvious.Ordinand
@Debilski, Normally exception handling would be better form, but recursive flattening is tricky. Try thinking about what will happen if you have a string in there!Ordinand
@Mike Graham: True. That wouldn’t be nice.Heald
@Mike, I'm not sure what your m and n refer to, but the code I provided is linear in the number of atoms in the deep-list argument: each atom is listified exactly once, and the reduce will only call + once for each of its recursively-flattened args (meaning only one append per atom). Calling operator.add means another import statement, but it's certainly a valid alternative!Coccyx
@mitmatt, I posted an answer that hopefully explains the quadratic nature of your algorithm.Ordinand
Easier yet: replace reduce(lambda x,y:x+y, with sum(.Heald
@Mike, your post seems to have a mistake: concatenating linked lists doesn't require copying the elements, but rather just a single pointer operation. Each of those + operations is constant time.Coccyx
@Debilski, This is still quadratic; as long as the result is built by repeated concatenation, that's unavoidable. Also, sum defaults to starting with 0, which isn't appropriate in this case; you'd have to pass a start param.Ordinand
@mitmatt, Python lists aren't linked lists. They are mutable array-like objects that copy when you use +.Ordinand
@Mike Graham: The start param would be there, if following the replacement rule I gave. But I did not mean that it would generally be faster, it was just another option to calling operator.add.Heald
Though the sum is faster than reduce + lambda.Heald
@Mike, that's actually implementation-dependent and not specified by the Python language, but I think you'll find that the CPython implementation provides (amortized) constant-time appends: wiki.python.org/moin/TimeComplexityCoccyx
@Debilski, Using reduce or sum is slow either way, because both are O(n^2); the right algorithm to use is O(n).Ordinand
My mistake: the appropriate function in that reference is probably "extend", which does indeed require more operations. From a data structures perspective, that's not entirely necessary (e.g. [1]), but it does seem to be the case with CPython. [1] en.wikipedia.org/wiki/Deque#ImplementationsCoccyx
@mitmatt, collections.deque is implemented using a doubly-linked list. list is implemented using a resizing array, and supports the appropriate operations (like O(1) indexing). In any event, even if list were a doubly-linked list (which it isn't!), it isn't clear whether the right thing to do with concatenation is copying or not, given that it is mutable.Ordinand
@mitmatt, anything that doesn't have algorithmic complexities identical to list in CPython for its list and that is trying to claim to be Python should have its developers shot.Ordinand
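
To spell out the sum replacement from these comments (still quadratic, as noted, and the start param is required because sum defaults to 0):

    # hypothetical drop-in for the reduce call above:
    sum([flatten(x) for x in lst], [])
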
  1. You should avoid typechecking in Python. In this case, this means avoiding arbitrarily-nested structures where you distinguish by type. You can build your own node type which you can traverse by methods other than typechecking, like looking at a specific attribute.

  2. For flattening one level or exactly n levels, look at itertools.chain.from_iterable (see the first sketch after this list).

  3. I don't know what you mean by "functional". This code is pretty functional: it uses recursion (not to its credit!) and it doesn't mutate its argument. (Strictly speaking, it does use mutable state for building a list, but that is just how you do it in Python.)

  4. I suppose one more functional attribute would be lazy evaluation. You could implement it like this:

    def flatten(toFlatten):
        for item in toFlatten:
            if isinstance(item, list):  # Ewww, typechecking
                for subitem in flatten(item):  # they are considering adding
                    yield subitem              # "yield from" to the language
                                               # to give this pattern syntax
            else:
                yield item
    
  5. Recursion is very limited in Python (at least, in all its major implementations) and should generally be avoided for arbitrary depth. It is quite possible to rewrite this (and all recursive code) to use iteration, as sketched below, which will make it more scalable (and less functional, which is a good thing in Python, since it is not especially suited to FP).
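
A sketch of the one-level flattening from item 2 (chain.from_iterable is in the standard library; the input here is illustrative):

    from itertools import chain

    # flattens exactly one level of nesting
    list(chain.from_iterable([[1, 2], [3, 4], [5]]))
    # => [1, 2, 3, 4, 5]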

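And a sketch of the iterative rewrite from item 5; the name and the explicit stack are mine, not from the original:

    def flatten_iterative(toFlatten):
        # An explicit stack of iterators replaces the call stack,
        # so arbitrarily deep nesting cannot overflow it.
        stack = [iter(toFlatten)]
        while stack:
            for item in stack[-1]:
                if isinstance(item, list):
                    stack.append(iter(item))
                    break
                else:
                    yield item
            else:
                # inner loop exhausted without finding a sublist
                stack.pop()
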
Ordinand answered 18/3, 2010 at 16:55 Comment(5)
@Mike Graham: I want to be able to pass in lists containing lists containing lists, etc., and flatten them all the way down to one single result. I want [1,2,[3,4,5,6], 7,8,[9,10,[11,12,[13,[14,15],16],17],20]] to come out as [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 20]. If I knew in advance how deeply the lists nested, this would suffice: def mergeLists(seq): return reduce(lambda x,y: x+y, seq)Idler
1) Stop wanting that. Define your structure differently. 2) Your reduce strategy is multiplicative in big-O order; the linear algorithms are obvious.Ordinand
A more general answer: #2158895Rochdale
I like that answer less than this bad answer. Typechecking sucks, but if you're using type to indicate data, it should be a very specific type. For example, I should be free to make my own string-ish type that iterates over length-1 sequences of itself and does not subclass basestring. If I were going to use type to indicate this information, not only would I limit it to exactly list, I'd probably subclass list so that I could typecheck for exactly what I want.Ordinand
It's worth noting nested lists like [1,2,[3,4,5,6], 7,8,[9,10,[11,12,[13,[14,15],16],17],20]] are really trees: Node([Leaf(1),Leaf(2),Node([Leaf(3),Leaf(4)... etc.Homelike

This answer explains why you do not want to use reduce for this in Python.

Consider the snippet

reduce(operator.add, [[1], [2], [3], [4], [5]])

How much work does this have to do?

[1] + [2] => [1, 2]
[1, 2] + [3] => makes a new list, copying over 1, then 2, then appending 3: [1, 2, 3]
[1, 2, 3] + [4] => has to copy the 1, 2, and 3, then put 4 in the new list
[1, 2, 3, 4] + [5] => the amount that has to be copied grows each time!

This quadratic behavior is completely avoidable: the original solution (and any number of other solutions) does not form these intermediate copying steps.
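
By contrast, a linear accumulation appends each atom exactly once (a sketch of the point, not code from the answer):

    result = []
    for part in [[1], [2], [3], [4], [5]]:
        result.extend(part)  # each element is copied into place once, O(n) total
    # result == [1, 2, 3, 4, 5]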

Ordinand answered 18/3, 2010 at 17:54 Comment(0)

Under the doc for itertools, there's a flatten() function
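
The recipe there is a one-level flatten; it is roughly:

    from itertools import chain

    def flatten(listOfLists):
        "Flatten one level of nesting"
        return chain.from_iterable(listOfLists)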

Forland answered 18/3, 2010 at 17:11 Comment(3)
Not on that page there isn't.Appointed
@Andrew Aylett, It is a recipe, not in the module itself. It's on that page.Ordinand
@Mike: Admit it, you edited the documentation after I posted my comment. Not sure how I missed that -- it didn't come up when I searched the page at work :P.Appointed