Transform "list of tuples" into a flat list or a matrix

Asked 17/5, 2012 at 9:16 Answered 18/4, 2023 at 20:18

115

With SQLite, a select .. from command returns its results in output, which prints:

>>> print output
[(12.2817, 12.2817), (0, 0), (8.52, 8.52)]

It seems to be a list of tuples. I would like to either convert output to a simple list:

[12.2817, 12.2817, 0, 0, 8.52, 8.52]

or a 3x2 matrix:

12.2817  12.2817
0        0
8.52     8.52

to be read via output[i][j]

The flatten command does not do the job for the 1st option, and I have no idea for the second one...

A fast solution would be appreciated, as the real data is much bigger.

Endocrine answered 17/5, 2012 at 9:16 Comment(4)

[(12.2817, 12.2817), (0, 0), (8.52, 8.52)] is already a 3x2 matrix !? or did i miss something ? – Prunelle 17/5, 2012 at 9:24
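To illustrate the point made in this comment, here is a minimal sketch using the data from the question. It only shows that a list of tuples already supports output[i][j] indexing; the names n_rows, n_cols, last and matrix are made up for the example.

```python
# Data copied from the question: each tuple is one row.
output = [(12.2817, 12.2817), (0, 0), (8.52, 8.52)]

n_rows = len(output)       # number of rows (tuples)
n_cols = len(output[0])    # number of columns (items per tuple)
last = output[2][1]        # row 2, column 1

# Tuples are immutable; if the rows must be editable, copy them into lists.
matrix = [list(row) for row in output]
matrix[1][0] = 1.5         # fine on a list, would raise TypeError on a tuple
```

So the second option in the question needs no conversion at all unless the rows have to be modified in place.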

See this question – Obvert 17/5, 2012 at 9:26

for the flatten function check itertools module recipes there is already a flatten function example: docs.python.org/library/itertools.html#recipes – Prunelle 17/5, 2012 at 9:30

[item for sublist in output for item in sublist] works perfectly and has the advantage that your inner tuples could also be lists; more generally any combination of inner and outer iterable works – Keg 5/5, 2013 at 18:59

161

By far the fastest (and shortest) solution posted:

list(sum(output, ()))

About 50% faster than the itertools solution, and about 70% faster than the map solution.
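A short sketch of why the empty tuple is needed, using the data from the question. sum(iterable, start) adds every item onto start, so with start=() each addition is a tuple concatenation. Every concatenation builds a new tuple, which is why this approach can become slow on long inputs.

```python
output = [(12.2817, 12.2817), (0, 0), (8.52, 8.52)]

# sum(output, ()) is equivalent to () + output[0] + output[1] + output[2]
step_by_step = () + output[0] + output[1] + output[2]
flat = list(sum(output, ()))

# Without a start value, sum begins from the integer 0,
# and adding a tuple to an int raises TypeError.
try:
    sum(output)
    default_start_failed = False
except TypeError:
    default_start_failed = True
```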

Obvert answered 17/5, 2012 at 13:22 Comment(6)

@Joel nice, but I wonder how it works? list(output[0]+output[1]+output[2]) gives the desired result but list(sum(output)) not. Why? What "magic" does the () do? – Keg 5/5, 2013 at 18:39

Ok, I should have read the manual. It seems sum(sequence[, start]) adds start, which defaults to 0, rather than just starting from sequence[0] (if it exists) and then adding the rest of the elements. Sorry for bothering you. – Keg 5/5, 2013 at 18:44

This is a well-known anti-pattern: don't use sum to concatenate sequences, it results in a quadratic time algorithm. Indeed, the sum function will complain if you try to do this with strings! – Saline 11/4, 2018 at 22:42

@juanpa.arrivillaga: agreed. There are very few use cases where this would be preferable. – Obvert 11/4, 2018 at 23:19

Yes, fast but completely obtuse. You'd have to leave a comment as to what it is actually doing :( – Mincemeat 22/5, 2018 at 21:30

See my answer for a comparison of this to other techniques which are faster and more Pythonic IMO. – Beadle 11/2, 2019 at 21:52

A list comprehension works with any iterable types and was faster than the other methods timed below.

flattened = [item for sublist in l for item in sublist]

l is the list to flatten (called output in the OP's case)

timeit tests:

l = list(zip(range(99), range(99)))  # list of tuples to flatten

List comprehension

[item for sublist in l for item in sublist]

timeit result = 7.67 µs ± 129 ns per loop

List extend() method

flattened = []
list(flattened.extend(item) for item in l)

timeit result = 11 µs ± 433 ns per loop

sum()

list(sum(l, ()))

timeit result = 24.2 µs ± 269 ns per loop

Beadle answered 11/7, 2018 at 17:14 Comment(3)

I had to use this on a large dataset, the list comprehension method was by far the fastest! – Skyros 16/8, 2018 at 6:36

I made a small change to the .extend solution and it now performs a bit better. Check it out with your timeit to compare – Avunculate 20/11, 2018 at 17:0

This is very confusing, and I don't understand the syntax here at all. The general syntax for a list comprehension is expression for item in list, like x*2 for x in listONumbers. So for flattening you would expect an expression like num for num in sublist for sublist in list, not num for sublist in list for num in sublist. How is the comprehension broken down here? – Eyrir 10/3, 2023 at 1:6
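The ordering asked about in the last comment can be sketched as follows: the for clauses of a comprehension are read left to right in the same order as the equivalent nested loops, and only the result expression moves to the front. The variable names below are illustrative.

```python
l = [(1, 2), (3, 4), (5, 6)]

# The comprehension from the answer ...
flattened = [item for sublist in l for item in sublist]

# ... is equivalent to these nested loops, written in the same clause order:
expanded = []
for sublist in l:          # first "for" clause = outer loop
    for item in sublist:   # second "for" clause = inner loop
        expanded.append(item)
```

Writing the clauses the other way round fails because sublist would be used before the loop that defines it.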

In Python 2.7 and all versions of Python 3, you can use itertools.chain to flatten a list of iterables, either with the * unpacking syntax or with the chain.from_iterable class method.

>>> t = [ (1,2), (3,4), (5,6) ]
>>> t
[(1, 2), (3, 4), (5, 6)]
>>> import itertools
>>> list(itertools.chain(*t))
[1, 2, 3, 4, 5, 6]
>>> list(itertools.chain.from_iterable(t))
[1, 2, 3, 4, 5, 6]
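One practical difference between the two forms, as a hedged sketch: chain(*t) has to unpack the whole outer iterable before it starts, while chain.from_iterable(t) pulls rows one at a time. The endless rows() generator below is a made-up stand-in for a large or streaming source.

```python
from itertools import chain, islice

def rows():
    """Hypothetical endless source of 2-tuples."""
    n = 0
    while True:
        yield (n, n)
        n += 1

# from_iterable consumes rows lazily, so taking the first six items terminates.
first_six = list(islice(chain.from_iterable(rows()), 6))

# chain(*rows()) would never return here, because argument unpacking
# tries to exhaust the generator before chain is even called.
```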

Zimmer answered 5/2, 2016 at 16:1 Comment(0)

Update: flattening using extend, but without a comprehension and without wrapping the result in list(...) (fastest)

After seeing another answer here that gives a faster solution via a list comprehension with two for clauses, I made a small tweak, and it now performs better. First, the call to list(...) was taking a large share of the time; then replacing the list comprehension with a plain loop shaved off a bit more.

The new solution is:

l = []
for row in output: l.extend(row)

The older version with list(...) replaced by [...] (a bit slower, but not by much):

[l.extend(row) for row in output]

Older (slower):

Flattening with list comprehension

l = []
list(l.extend(row) for row in output)

Some timeits for the new extend loop, and the improvement gained just by replacing list(...) with [...]:

import timeit
t = timeit.timeit
o = "output=list(zip(range(10000000), range(10000000))); l=[]"
steps_ext = "for row in output: l.extend(row)"
steps_ext_old = "list(l.extend(row) for row in output)"
steps_ext_remove_list = "[l.extend(row) for row in output]"
steps_com = "[item for sublist in output for item in sublist]"

print(f"{steps_ext}\n>>>{t(steps_ext, setup=o, number=10)}")
print(f"{steps_ext_remove_list}\n>>>{t(steps_ext_remove_list, setup=o, number=10)}")
print(f"{steps_com}\n>>>{t(steps_com, setup=o, number=10)}")
print(f"{steps_ext_old}\n>>>{t(steps_ext_old, setup=o, number=10)}")

timeit results:

for row in output: l.extend(row)                  
>>> 7.022608777000187

[l.extend(row) for row in output]
>>> 9.155910597999991

[item for sublist in output for item in sublist]
>>> 9.920002304000036

list(l.extend(row) for row in output)
>>> 10.703829122000116
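A smaller, self-contained variant of this comparison is sketched below. It is not the harness that produced the numbers above: the input size, the function names and the added chain.from_iterable candidate are choices made for the example. Each candidate builds a fresh list on every call, whereas a list created once in a timeit setup string keeps growing across repetitions.

```python
import timeit
from itertools import chain

# Small input so the sketch runs quickly; scale up to taste.
output = list(zip(range(1000), range(1000)))

def with_extend():
    flat = []
    for row in output:
        flat.extend(row)
    return flat

def with_comprehension():
    return [item for row in output for item in row]

def with_chain():
    return list(chain.from_iterable(output))

candidates = {
    "extend loop": with_extend,
    "comprehension": with_comprehension,
    "chain.from_iterable": with_chain,
}

# Check that every candidate produces the same flat list before timing it.
results = {name: fn() for name, fn in candidates.items()}
timings = {name: timeit.timeit(fn, number=100) for name, fn in candidates.items()}

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds:.4f} s for 100 runs")
```

Absolute and relative timings depend on the Python version, the machine and the input shape, so it is worth rerunning on data that resembles the real query results.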

Avunculate answered 13/9, 2017 at 15:10 Comment(0)

>>> flat_list = []
>>> nested_list = [(1, 2, 4), (0, 9)]
>>> for a_tuple in nested_list:
...     flat_list.extend(list(a_tuple))
... 
>>> flat_list
[1, 2, 4, 0, 9]
>>>

You can easily go from a list of tuples to a single flat list, as shown above.

Improper answered 17/5, 2012 at 9:36 Comment(0)

Use itertools.chain:

>>> import itertools
>>> list(itertools.chain.from_iterable([(12.2817, 12.2817), (0, 0), (8.52, 8.52)]))
[12.2817, 12.2817, 0, 0, 8.52, 8.52]

Mendy answered 17/5, 2012 at 11:10 Comment(0)

Or you can flatten the list like this:

from functools import reduce  # built in on Python 2, must be imported on Python 3

reduce(lambda x,y:x+y, map(list, output))

Zama answered 17/5, 2012 at 11:5 Comment(1)

reduce(lambda x,y:x+y, output) seems to work directly converting to a long tuple (which can be converted to a list). Why use map(list, output) inside the reduce() call? Maybe It's more in line with the fact that tuples are immutable, lists are mutable. – Leid 20/3, 2019 at 14:47

This is what numpy was made for, from both a data-structure and a speed perspective.

import numpy as np

output = [(12.2817, 12.2817), (0, 0), (8.52, 8.52)]
output_ary = np.array(output)   # this is your matrix 
output_vec = output_ary.ravel() # this is your 1d-array

Chas answered 20/9, 2018 at 16:10 Comment(0)

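For arbitrarily nested lists (just in case):

def flatten(lst):
    result = []
    for element in lst:
        # Recurse into nested iterables, but not into strings or bytes:
        # on Python 3 they have __iter__ too and would recurse forever.
        if hasattr(element, '__iter__') and not isinstance(element, (str, bytes)):
            result.extend(flatten(element))
        else:
            result.append(element)
    return result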

>>> flatten(output)
[12.2817, 12.2817, 0, 0, 8.52, 8.52]

Argentic answered 17/5, 2012 at 10:3 Comment(0)

def flatten_tuple_list(tuples):
    return list(sum(tuples, ()))


tuples = [(5, 6), (6, 7, 8, 9), (3,)]
print(flatten_tuple_list(tuples))

Rickyrico answered 5/2, 2021 at 16:58 Comment(2)

Thank you for contributing an answer. Would you kindly edit your answer to include an explanation of your code? That will help future readers better understand what is going on, and especially those members of the community who are new to the language and struggling to understand the concepts. That’s especially useful here, where your answer is competing for attention with nine other answers. What distinguishes yours? When might this be preferred over well-established answers above? – Cephalad 6/2, 2021 at 1:2

Ok sure I will do that – Rickyrico 7/2, 2021 at 4:54

The question mentions that the list of tuples (output) is returned by a SQLite select .. from command.

Instead of flattening the returned output, you could change how the sqlite3 connection returns rows by setting row_factory, so that it returns a matrix (a list of lists) of numeric values instead of a list of tuples:

import sqlite3 as db

conn = db.connect('...')
conn.row_factory = lambda cursor, row: list(row) # This will convert the tuple to list.
c = conn.cursor()
output = c.execute('SELECT ... FROM ...').fetchall()
print(output)
# Should print [[12.2817, 12.2817], [0, 0], [8.52, 8.52]]
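A runnable version of the same idea, as a sketch against an in-memory database. The table name readings and its columns a and b are made up for the example. The second factory, which returns row[0], is an extra assumption beyond the answer: it yields a flat list directly when the query selects a single column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the values from the question.
conn.execute("CREATE TABLE readings (a REAL, b REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(12.2817, 12.2817), (0, 0), (8.52, 8.52)],
)

# Each fetched row is passed through row_factory; here a tuple becomes a list.
conn.row_factory = lambda cursor, row: list(row)
matrix = conn.execute("SELECT a, b FROM readings ORDER BY rowid").fetchall()

# For a single-column query, returning row[0] gives a flat list with no post-processing.
conn.row_factory = lambda cursor, row: row[0]
column_a = conn.execute("SELECT a FROM readings ORDER BY rowid").fetchall()

conn.close()
```

The factory is picked up by cursors created after it is set, so it should be assigned before the query is executed.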

Shriver answered 18/4, 2023 at 20:18 Comment(0)
