Is there a functional way to do this?

def flattenList(toFlatten):
 final=[]
 for el in toFlatten:
  if isinstance(el, list):
   final.extend(flattenList(el))
  else:
   final.append(el)
 return final

When I don't know how deeply the lists will nest, this is the only way I can think to do this.
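
For example, I want arbitrarily deep nesting to come out as one flat list:

    >>> flattenList([1, 2, [3, [4, 5]], 6])
    [1, 2, 3, 4, 5, 6]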

Idler answered 18/3, 2010 at 16:9 Comment(2)
Try using four spaces instead of one for your indentation. It is more readable and conforms to the style guide most Python code follows, PEP 8 (python.org/dev/peps/pep-0008).Ordinand
Related questions: "Flatten (an irregular) list of lists in Python" #2158895 (links to other questions and good answers) "Flattening a shallow list in python" #406621 (benchmarks)Rochdale

Here's another option (though there may be something cleaner than type-checking, like testing if something is iterable and hence not an "atom"):

from functools import reduce  # reduce is not a builtin in Python 3

def flatten(lst):
    if not isinstance(lst, list):
        return [lst]  # an atom: wrap it so the caller can concatenate
    else:
        # flatten each element, then concatenate the flattened pieces
        return reduce(lambda x, y: x + y, [flatten(x) for x in lst], [])

It's based on a Scheme-like approach.
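
(If you did want the iterable test instead of type-checking, one sketch, assuming Python 3's collections.abc, is below; note that strings are iterable too, so they need special-casing, as the comments point out:)

    from collections.abc import Iterable

    def is_atom(x):
        # Treat strings as atoms even though they are iterable;
        # otherwise a recursive flatten would recurse into them forever.
        return isinstance(x, str) or not isinstance(x, Iterable)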

Coccyx answered 18/3, 2010 at 17:7 Comment(17)
try: return reduce(...); except TypeError: return [lst]Heald
@mitmatt, 1) The function lambda x, y: x + y is called operator.add. 2) Your algorithm is so unnecessarily O(n * m)ish, when the linear algorithm is more obvious.Ordinand
@Debilski, Normally exception handling would be better form, but recursive flattening is tricky. Try thinking about what will happen if you have a string in there!Ordinand
@Mike Graham: True. That wouldn’t be nice.Heald
@Mike, I'm not sure what your m and n refer to, but the code I provided is linear in the number of atoms in the deep-list argument: each atom is listified exactly once, and the reduce will only call + once for each of its recursively-flattened args (meaning only one append per atom). Calling operator.add means another import statement, but it's certainly a valid alternative!Coccyx
@mitmatt, I posted an answer that hopefully explains the quadratic nature of your algorithm.Ordinand
Easier yet: replace reduce(lambda x,y:x+y, with sum(.Heald
@Mike, your post seems to have a mistake: concatenating linked lists doesn't require copying the elements, but rather just a single pointer operation. Each of those + operations is constant time.Coccyx
@Debilski, This is still quadratic; as long as the result is built by repeated concatenation, that's unavoidable. Also, sum defaults to starting with 0, which isn't appropriate in this case; you'd have to pass a start param.Ordinand
@mitmatt, Python lists aren't linked lists. They are mutable array-like objects that copy when you use +.Ordinand
@Mike Graham: The start param would be there, if following the replacement rule I gave. But I did not mean that it would generally be faster, it was just another option to calling operator.add.Heald
Though the sum is faster than reduce + lambda.Heald
@Mike, that's actually implementation-dependent and not specified by the Python language, but I think you'll find that the CPython implementation provides (amortized) constant-time appends: wiki.python.org/moin/TimeComplexityCoccyx
@Debilski, Using reduce or sum is slow either way, because both are O(n^2); the right algorithm to use is O(n).Ordinand
My mistake: the appropriate function in that reference is probably "extend", which does indeed require more operations. From a data structures perspective, that's not entirely necessary (e.g. [1]), but it does seem to be the case with CPython. [1] en.wikipedia.org/wiki/Deque#ImplementationsCoccyx
@mitmatt, collections.deque is implemented using a doubly-linked list. list is implemented using a resizing array, and supports the appropriate operations (like O(1) indexing). In any event, even if list were a doubly-linked list (which it isn't!), it isn't clear whether the right thing to do with concatenation is copying or not, given that it is mutable.Ordinand
@mitmatt, anything that doesn't have algorithmic complexities identical to list in CPython for its list and that is trying to claim to be Python should have its developers shot.Ordinand
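
To spell out the sum replacement from these comments (still quadratic, as noted, and the start param is required because sum defaults to 0):

    # hypothetical drop-in for the reduce call above:
    sum([flatten(x) for x in lst], [])
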
  1. You should avoid typechecking in Python. In this case, this means avoiding arbitrarily-nested structures where you distinguish by type. You can build your own node type which you can traverse by methods other than typechecking, like looking at a specific attribute.

  2. For flattening one level or exactly n levels, look at itertools.chain.from_iterable (see the first sketch after this list).

  3. I don't know what you mean by "functional". This code is pretty functional: it uses recursion (not to its credit!) and it doesn't mutate its argument. (Strictly speaking, it does use mutable state for building a list, but that is just how you do it in Python.)

  4. I suppose one more functional attribute would be lazy evaluation. You could implement it like this:

    def flatten(toFlatten):
        for item in toFlatten:
            if isinstance(item, list):  # Ewww, typechecking
                for subitem in flatten(item):  # they are considering adding
                    yield subitem              # "yield from" to the language
                                               # to give this pattern syntax
            else:
                yield item
    
  5. Recursion is very limited in Python (at least, in all its major implementations) and should generally be avoided for arbitrary depth. It is quite possible to rewrite this (and all recursive code) to use iteration, as sketched below, which will make it more scalable (and less functional, which is a good thing in Python, since it is not especially suited to FP).
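
A sketch of the one-level flattening from item 2 (chain.from_iterable is in the standard library; the input here is illustrative):

    from itertools import chain

    # flattens exactly one level of nesting
    list(chain.from_iterable([[1, 2], [3, 4], [5]]))
    # => [1, 2, 3, 4, 5]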

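And a sketch of the iterative rewrite from item 5; the name and the explicit stack are mine, not from the original:

    def flatten_iterative(toFlatten):
        # An explicit stack of iterators replaces the call stack,
        # so arbitrarily deep nesting cannot overflow it.
        stack = [iter(toFlatten)]
        while stack:
            for item in stack[-1]:
                if isinstance(item, list):
                    stack.append(iter(item))
                    break
                else:
                    yield item
            else:
                # inner loop exhausted without finding a sublist
                stack.pop()
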
Ordinand answered 18/3, 2010 at 16:55 Comment(5)
@Mike Graham: I want to be able to pass in lists containing lists containing lists, etc., and flatten them all the way down to one single result. I want [1,2,[3,4,5,6], 7,8,[9,10,[11,12,[13,[14,15],16],17],20]] to come out as [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 20]. If I knew in advance how deeply the lists nested, this would suffice: def mergeLists(seq): return reduce(lambda x,y: x+y, seq)Idler
1) Stop wanting that. Define your structure differently. 2) Your reduce strategy is multiplicative in big-O order; the linear algorithms are obvious.Ordinand
A more general answer: #2158895Rochdale
I like that answer less than this bad answer. Typechecking sucks, but if you're using type to indicate data, it should be a very specific type. For example, I should be free to make my own string-ish type that iterates over length-1 sequences of itself and does not subclass basestring. If I were going to use type to indicate this information, not only would I limit it to exactly list, I'd probably subclass list so that I could typecheck for exactly what I want.Ordinand
It's worth noting nested lists like [1,2,[3,4,5,6], 7,8,[9,10,[11,12,[13,[14,15],16],17],20]] are really trees: Node([Leaf(1),Leaf(2),Node([Leaf(3),Leaf(4)... etc.Homelike

This answer explains why you do not want to use reduce for this in Python.

Consider the snippet

reduce(operator.add, [[1], [2], [3], [4], [5]])

How much work does this have to do?

[1] + [2] => [1, 2]
[1, 2] + [3] => makes a new list, copying over 1, then 2, then appending 3: [1, 2, 3]
[1, 2, 3] + [4] => has to copy the 1, 2, and 3, then put 4 in the new list
[1, 2, 3, 4] + [5] => the amount that has to be copied grows each time!

This quadratic behavior is completely avoidable: the original solution (and any number of other solutions) does not form these intermediate copying steps.
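
By contrast, a linear accumulation appends each atom exactly once (a sketch of the point, not code from the answer):

    result = []
    for part in [[1], [2], [3], [4], [5]]:
        result.extend(part)  # each element is copied into place once, O(n) total
    # result == [1, 2, 3, 4, 5]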

Ordinand answered 18/3, 2010 at 17:54 Comment(0)

Under the doc for itertools, there's a flatten() function
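
The recipe there is a one-level flatten; it is roughly:

    from itertools import chain

    def flatten(listOfLists):
        "Flatten one level of nesting"
        return chain.from_iterable(listOfLists)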

Forland answered 18/3, 2010 at 17:11 Comment(3)
Not on that page there isn't.Appointed
@Andrew Aylett, It is a recipe, not in the module itself. It's on that page.Ordinand
@Mike: Admit it, you edited the documentation after I posted my comment. Not sure how I missed that -- it didn't come up when I searched the page at work :P.Appointed