The quickest way I've found is to start with an empty list and extend it:
In [1]: a = [['abc', 'def'], ['ghi'],['xzy']]
In [2]: result = []
In [3]: extend = result.extend
In [4]: for l in a:
   ...:     extend(l)
   ...:
In [5]: result
Out[5]: ['abc', 'def', 'ghi', 'xzy']
This is over twice as fast as the list comprehension on the example from Alex Martelli's answer to "Making a flat list out of list of lists in Python":
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 3: 86.3 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'b = []' 'extend = b.extend' 'for sub in l:' ' extend(sub)'
10000 loops, best of 3: 36.6 usec per loop
I came up with this because I had a hunch that, behind the scenes, extend allocates the right amount of memory for the list and probably uses some low-level code to move the items in. I have no idea if this is true, but who cares, it is faster.
By the way, the speedup is only a constant factor (roughly 2x) that does not grow with the input; the same ratio shows up on a short list:
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]' 'b = []' 'extend = b.extend' 'for sub in l:' ' extend(sub)'
1000000 loops, best of 3: 0.844 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]' '[item for sublist in l for item in sublist]'
1000000 loops, best of 3: 1.56 usec per loop
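For anyone who wants to repeat the comparison from a script instead of the shell, here is a minimal sketch using the standard timeit module. The function names flatten_comprehension and flatten_extend are made up for illustration, and the numbers you get will depend on your machine and Python version, so no expected timings are shown.

```python
import timeit


def flatten_comprehension(nested):
    # The nested list comprehension from the linked answer.
    return [item for sublist in nested for item in sublist]


def flatten_extend(nested):
    # The approach from this answer: one extend() call per sublist.
    result = []
    extend = result.extend  # look the bound method up once, outside the loop
    for sublist in nested:
        extend(sublist)
    return result


if __name__ == "__main__":
    data = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99
    loops = 1000
    for func in (flatten_comprehension, flatten_extend):
        seconds = timeit.timeit(lambda: func(data), number=loops)
        print(f"{func.__name__}: {seconds / loops * 1e6:.1f} usec per call")
```

Both functions return the same list, so only the timing differs.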
You can also use map(result.extend, a), but this is slower where map returns a list (as in Python 2), because it builds its own list of Nones, one for each extend call.
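A small sketch of what the map version does on Python 3, where map returns a lazy iterator rather than a list. Nothing is extended until the iterator is consumed, and consuming it with list() still builds the throwaway list of None values, since list.extend returns None. The variable names lazy and discarded are only for illustration.

```python
a = [['abc', 'def'], ['ghi'], ['xzy']]

result = []
# On Python 3, map() is lazy: no extend() call has run yet.
lazy = map(result.extend, a)
print(result)     # []

# Consuming the iterator runs the calls and collects their return values.
discarded = list(lazy)
print(discarded)  # [None, None, None]
print(result)     # ['abc', 'def', 'ghi', 'xzy']
```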
The explicit loop also gives you some of the benefits of not using functional programming, i.e.:
- you can extend an existing list instead of creating an empty one,
- you can still understand the code at a glance, minutes, days or even months later.
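The first point can be sketched as below: because extend works in place, the same loop can add to a list that already has items. The helper name flatten_into is hypothetical, not something from the standard library.

```python
def flatten_into(target, nested):
    """Append every item of every sublist in `nested` to `target`, in place."""
    extend = target.extend
    for sublist in nested:
        extend(sublist)
    return target


seen = ['start']
flatten_into(seen, [['abc', 'def'], ['ghi']])
flatten_into(seen, [['xzy']])
print(seen)  # ['start', 'abc', 'def', 'ghi', 'xzy']
```

A list comprehension would need a second step (result += [...]) to get the same effect.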
By the way, it's probably best to avoid list comprehensions. Small ones aren't too bad, but in general list comprehensions don't save you much typing, are often harder to understand, and are very hard to change or refactor (ever seen a three-level list comprehension?). Google's coding guidelines advise against them except in simple cases. My opinion is that they are only useful in 'throw-away' code, i.e. code where the author doesn't care about readability, or code that is known never to require future maintenance.
Compare these two ways of writing the same thing:
result = [item for sublist in l for item in sublist]
with this:
result = []
for sublist in l:
    for item in sublist:
        result.append(item)
YMMV, but the first one stopped me in my tracks and I had to think about it. In the second, the nesting is made obvious by the indentation.