It works because addition is overloaded (on tuples) to return the concatenated tuple:
>>> () + ('hello',) + ('these', 'are') + ('my', 'tuples!')
('hello', 'these', 'are', 'my', 'tuples!')
That's basically what sum is doing: you give it an initial value of an empty tuple, and it then adds each tuple to that.
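As a rough illustration, the behavior of sum with an empty-tuple start value can be sketched as a plain loop. The name sum_like is made up for this example, and the real built-in is implemented in C, so treat this as a simplified model rather than the actual implementation:

```python
def sum_like(tuples, start=()):
    """Hypothetical helper: a simplified stand-in for sum(tuples, start)."""
    result = start
    for tup in tuples:
        # Each + builds a brand-new tuple out of result and tup.
        result = result + tup
    return result


tuples = (('hello',), ('these', 'are'), ('my', 'tuples!'))
assert sum_like(tuples) == sum(tuples, ())
print(sum_like(tuples))  # ('hello', 'these', 'are', 'my', 'tuples!')
```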
However, this is generally a bad idea because adding tuples creates a new tuple every time, so you create several intermediate tuples only to copy them into the final concatenated tuple:
()
('hello',)
('hello', 'these', 'are')
('hello', 'these', 'are', 'my', 'tuples!')
This implementation has quadratic runtime behavior, because every intermediate tuple copies all the elements collected so far. You can avoid that by not creating the intermediate tuples.
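To make the quadratic growth concrete, here is a small sketch that counts how many elements end up in the intermediate and final tuples. It assumes every concatenation copies all elements of its result; elements_copied_by_sum is a hypothetical helper, and it models the cost rather than measuring what the interpreter really does:

```python
def elements_copied_by_sum(tuples):
    """Hypothetical helper: count elements written into each new tuple."""
    copied = 0
    result = ()
    for tup in tuples:
        result = result + tup
        copied += len(result)  # the new tuple holds every element so far
    return copied


for n in (10, 100, 1000):
    ones = tuple((1,) for _ in range(n))
    print(n, elements_copied_by_sum(ones))
# 10 55
# 100 5050
# 1000 500500
```

For n one-element tuples the count is n * (n + 1) / 2, so ten times as many tuples means roughly a hundred times as much copying.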
>>> tuples = (('hello',), ('these', 'are'), ('my', 'tuples!'))
Using nested generator expressions:
>>> tuple(tuple_item for tup in tuples for tuple_item in tup)
('hello', 'these', 'are', 'my', 'tuples!')
Or using a generator function:
def flatten(it):
    for seq in it:
        for item in seq:
            yield item
>>> tuple(flatten(tuples))
('hello', 'these', 'are', 'my', 'tuples!')
Or using itertools.chain.from_iterable:
>>> import itertools
>>> tuple(itertools.chain.from_iterable(tuples))
('hello', 'these', 'are', 'my', 'tuples!')
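As a quick sanity check, the three alternatives can be compared against the sum result in one script. This is only a sketch; the results dictionary and its labels are made up for illustration:

```python
import itertools


def flatten(it):
    for seq in it:
        for item in seq:
            yield item


tuples = (('hello',), ('these', 'are'), ('my', 'tuples!'))
expected = sum(tuples, ())

results = {
    'generator expression': tuple(item for tup in tuples for item in tup),
    'generator function': tuple(flatten(tuples)),
    'chain.from_iterable': tuple(itertools.chain.from_iterable(tuples)),
}

# Every approach should produce the same flat tuple as sum(tuples, ()).
for name, result in results.items():
    assert result == expected, name

print(expected)  # ('hello', 'these', 'are', 'my', 'tuples!')
```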
And if you're interested in how these perform (using my simple_benchmark package):
import itertools
import simple_benchmark

def flatten(it):
    for seq in it:
        for item in seq:
            yield item

def sum_approach(tuples):
    return sum(tuples, ())

def generator_expression_approach(tuples):
    return tuple(tuple_item for tup in tuples for tuple_item in tup)

def generator_function_approach(tuples):
    return tuple(flatten(tuples))

def itertools_approach(tuples):
    return tuple(itertools.chain.from_iterable(tuples))

funcs = [sum_approach, generator_expression_approach, generator_function_approach, itertools_approach]
# Keys are the x-axis labels (2**i); each value holds 2**i - 1 one-element tuples.
arguments = {(2**i): tuple((1,) for _ in range(1, 2**i)) for i in range(1, 13)}
b = simple_benchmark.benchmark(funcs, arguments, argument_name='number of tuples to concatenate')
b.plot()
(Python 3.7.2 64bit, Windows 10 64bit)
So while the sum approach is very fast if you concatenate only a few tuples, it will be really slow if you try to concatenate lots of tuples. The fastest of the tested approaches for many tuples is itertools.chain.from_iterable.
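If you don't have simple_benchmark installed, a rough comparison can be sketched with the standard-library timeit module. This is not the benchmark that produced the results above: best_time is a hypothetical helper, the sizes are picked arbitrarily, and the absolute numbers will differ between machines and Python versions:

```python
import itertools
import timeit


def sum_approach(tuples):
    return sum(tuples, ())


def itertools_approach(tuples):
    return tuple(itertools.chain.from_iterable(tuples))


def best_time(func, tuples, repeat=3, number=3):
    # Hypothetical helper: best average seconds per call over `repeat` runs.
    timings = timeit.repeat(lambda: func(tuples), repeat=repeat, number=number)
    return min(timings) / number


for n in (10, 100, 1000, 2000):
    data = tuple((1,) for _ in range(n))
    t_sum = best_time(sum_approach, data)
    t_chain = best_time(itertools_approach, data)
    print(f"{n:>5} tuples: sum {t_sum:.2e} s, chain.from_iterable {t_chain:.2e} s")
```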
tuple(chain(*tuples))
– Uranochain
like that as it's even more inefficient than sum (unless the collection of tuples is very small). Use chain.from_iterable instead. – Chitter