Why is there no tuple comprehension in Python?
I

13

533

As we all know, there's list comprehension, like

[i for i in [1, 2, 3, 4]]

and there is dictionary comprehension, like

{i:j for i, j in {1: 'a', 2: 'b'}.items()}

but

(i for i in (1, 2, 3))

produces a generator, not a tuple. Why is that?

My guess is that a tuple is immutable, but this does not seem to be the answer.

Inelegancy answered 5/6, 2013 at 12:44 Comment(3)
There's also a set comprehension -- which looks a lot like a dict comprehension...Legist
Just for the sake of posterity, there is a discussion about this going on in the Python ChatExorbitance
Apparently there is. https://mcmap.net/q/74875/-can-i-turn-a-generator-object-into-a-tuple-without-using-quot-tuple-quot-duplicate Natatorial
E
780

You can use a generator expression:

tuple(i for i in (1, 2, 3))

but parentheses were already taken for … generator expressions.
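To make that concrete, here is a small sketch (the variable names are just illustrative, not from the answer) showing that parentheses around a comprehension-style expression give a generator, that tuple() consumes it, and that parentheses on their own never make a tuple:

```python
# Parentheses around a "comprehension" build a generator expression, not a tuple.
gen = (i for i in (1, 2, 3))
gen_type = type(gen).__name__

# Passing the generator expression to tuple() consumes it and builds the tuple.
as_tuple = tuple(i for i in (1, 2, 3))

# Parentheses alone never make a tuple; the comma does.
not_a_tuple = (1)   # just the int 1
one_tuple = (1,)    # a one-element tuple

print(gen_type, as_tuple, type(not_a_tuple).__name__, one_tuple)
```

Since (expr) already means "grouping" and (expr for ...) already means "generator expression", there is no parenthesized spelling left over for a tuple comprehension.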

Erubescent answered 5/6, 2013 at 12:46 Comment(10)
By this argument, we could say a list-comprehension is unnecessary too: list(i for i in (1,2,3)). I really think it's simply because there isn't a clean syntax for it (or at least nobody has thought of one)Legist
A list or set or dict comprehension is just syntactic sugar to use a generator expression that outputs a specific type. list(i for i in (1, 2, 3)) is a generator expression that outputs a list, set(i for i in (1, 2, 3)) outputs a set. Does that mean the comprehension syntax is not needed? Perhaps not, but it is awfully handy. For the rare cases you need a tuple instead, the generator expression will do, is clear, and doesn't require the invention of another brace or bracket.Erubescent
The answer is obviously because tuple syntax and parentheses are ambiguousRevenue
@MartijnPieters Wanted to point out that list(generator expression) isn't exactly the same as [generator expression]: thecodingforums.com/threads/…Bollard
@JKillian: the difference exists but is too subtle for the vast majority to have to care about. Playing with an iterator like that in the expression without handling the StopIteration exception is going to be rare enough. :-)Erubescent
@MartijnPieters Good point, although I stumbled across it the other day while trying to write my own zip function as an exercise to learn Python. Needless to say, it confused the heck out of me why tuple(gen exp) failed but tuple([gen exp]) worked perfectly. Anyways, good to have it noted here that StopIteration will be swallowed by generator expressions/comprehensions but will propagate out of list/set/dictionary comprehensions.Bollard
The difference between using a comprehension and using a constructor+generator is more than subtle if you care about performance. Comprehensions result in faster construction compared to using a generator passed to a constructor. In the latter case you are creating and executing functions and functions are expensive in Python. [thing for thing in things] constructs a list much faster than list(thing for thing in things). A tuple comprehension would not be useless; tuple(thing for thing in things) has latency issues and tuple([thing for thing in things]) could have memory issues.Florey
@MartijnPieters, Can you potentially reword A list or set or dict comprehension is just syntactic sugar to use a generator expression? It's causing confusion by people seeing these as equivalent means to an end. It's not technically syntactic sugar as the processes are actually different, even if the end product is the same.Greyso
@jpp: that's in a comment, not in my answer. Comments are generally not editable. Technically I can edit mine still, but only because I am a moderator. And I stand by my comment, as the syntax is very close. Decorators are also syntactic sugar, and their implementation differs in important ways from the syntax they replaced, so this is not an isolated example. I am not convinced that one example of confusion equals general confusion.Erubescent
Related: Are list comprehensions syntactic sugar for list(generator expression) in Python 3?Greyso
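The StopIteration difference discussed in these comments can be reproduced with a rough sketch like the one below. The helper names (take_with_genexpr, take_with_listcomp, outcome) are made up for illustration. Note that the behaviour depends on the Python version: on Python 3.7 and later (PEP 479), a StopIteration raised inside a generator expression is turned into a RuntimeError, whereas in the older versions these comments refer to it was silently treated as the end of the generator.

```python
def take_with_genexpr(values, n):
    it = iter(values)
    # next(it) raises StopIteration inside the generator's frame once `it` is empty.
    return tuple(next(it) for _ in range(n))


def take_with_listcomp(values, n):
    it = iter(values)
    # Here the same StopIteration escapes the list comprehension unchanged.
    return tuple([next(it) for _ in range(n)])


def outcome(func):
    # Ask for more items than exist and report what happens.
    try:
        return func([1, 2, 3], 5)
    except Exception as exc:
        return type(exc).__name__


genexpr_outcome = outcome(take_with_genexpr)
listcomp_outcome = outcome(take_with_listcomp)
print(genexpr_outcome, listcomp_outcome)
```

Either way, the two spellings are not interchangeable when the expression itself can raise StopIteration.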
L
108

Raymond Hettinger (one of the Python core developers) had this to say about tuples in a recent tweet:

#python tip: Generally, lists are for looping; tuples for structs. Lists are homogeneous; tuples heterogeneous. Lists for variable length.

This (to me) supports the idea that if the items in a sequence are related enough to be generated by a, well, generator, then it should be a list. Although a tuple is iterable and seems like simply an immutable list, it's really the Python equivalent of a C struct:

struct foo {
    int a;
    char b;
    float c;
};

struct foo x = { 3, 'g', 5.9 };

becomes in Python

x = (3, 'g', 5.9)
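As a rough illustration of the "tuple as struct" idea, the sketch below unpacks the tuple by position and then uses collections.namedtuple to give the fields names. The type name Foo is my own choice for the example, mirroring the C struct above; it is not something the answer defines.

```python
from collections import namedtuple

# A plain tuple used as a record: position carries the meaning.
x = (3, 'g', 5.9)
a, b, c = x  # unpack by position, like reading struct fields

# namedtuple gives the fields names, closer to the C struct.
Foo = namedtuple('Foo', ['a', 'b', 'c'])  # hypothetical record type
y = Foo(a=3, b='g', c=5.9)

print(a, b, c)
print(y.a, y.b, y.c)
print(y == x)  # a namedtuple is still a tuple and compares equal to one
```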
Literary answered 5/6, 2013 at 13:28 Comment(6)
The immutability property can be important though and often a good reason to use a tuple when you would normally use a list. For example, if you have a list of 5 numbers that you want to use as a key to a dict, then tuple is the way to go.Selfrenunciation
That's a nice tip from Raymond Hettinger. I would still say there is a use case for using the tuple constructor with a generator, such as unpacking another structure, perhaps larger, into a smaller one by iterating over the attrs that you are interested in converting to a tuple record.Transubstantiate
@Transubstantiate You can probably just use operator.itemgetter in that case.Literary
@chepner, I see. That is pretty close to what I mean. It does return a callable so if I only need to do it once I don't see much of a win vs just using tuple(obj[item] for item in items) directly. In my case I was embedding this into a list comprehension to make a list of tuple records. If I need to do this repeatedly throughout the code then itemgetter looks great. Perhaps itemgetter would be more idiomatic either way?Transubstantiate
I see the relationship between frozenset and set analogous to that of tuple and list. It's less about heterogeneity and more about immutability - frozensets and tuples can be keys to dictionaries, lists and sets cannot due to their mutability.Connective
There's also a common case where you use a generator to produce a struct-like thing: where you're processing text records such as CSV. This is often written line_values = tuple(int(x.strip()) for x in line.split(',')). As others have noted, using the tuple constructor here instead of a comprehension has performance implications, and parsing large datasets of this type is a case where you really care about performance.Lexis
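The two ideas from these comments, pulling selected fields into a tuple record and using tuples as dict keys, might look roughly like this. The record contents and field names are invented for the example.

```python
from operator import itemgetter

record = {'id': 7, 'name': 'Ada', 'email': 'ada@example.com', 'age': 36}  # made-up data
fields = ('id', 'name')

# Generator expression passed to the tuple constructor.
via_genexpr = tuple(record[f] for f in fields)

# operator.itemgetter with several keys returns a callable that produces a tuple.
via_itemgetter = itemgetter(*fields)(record)

# Tuples are hashable (when their items are), so they can be dict keys; lists cannot.
lookup = {via_genexpr: 'first record'}
try:
    {[7, 'Ada']: 'first record'}
    list_key_ok = True
except TypeError:
    list_key_ok = False

print(via_genexpr, via_itemgetter, list_key_ok)
```

One caveat: itemgetter called with a single key returns the bare value rather than a one-element tuple, so the generator form is more uniform when the number of fields varies.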
G
106

Since Python 3.5, you can also use splat * unpacking syntax to unpack a generator expression:

*(x for x in range(10)),
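For comparison, here is a short sketch of the same expression assigned to a name, next to the parenthesized spelling and the constructor form. All three should give the same tuple on Python 3.5 or later; the variable names are illustrative.

```python
# The trailing comma makes this a tuple display; * unpacks the generator into it.
unpacked = *(x for x in range(10)),

# The same thing with explicit parentheses around the tuple display.
unpacked_parens = (*(x for x in range(10)),)

# The constructor form, for comparison.
constructed = tuple(x for x in range(10))

print(unpacked == unpacked_parens == constructed)
```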
Geriatrician answered 24/11, 2017 at 15:34 Comment(8)
This is great (and it works), but I can't find anywhere it's documented! Do you have a link?Hypotenuse
Note: As an implementation detail, this is basically the same as doing tuple(list(x for x in range(10))) (the code paths are identical, with both of them building a list, with the only difference being that the final step is to create a tuple from the list and throw away the list when a tuple output is required). This means that you don't actually avoid a pair of temporaries.Mesonephros
To expand on the comment of @ShadowRanger, here's a question where they show that the splat+tuple literal syntax is actually quite a bit slower than passing a generator expression to the tuple constructor.Decencies
I'm trying this in Python 3.7.3 and *(x for x in range(10)) doesn't work. I get SyntaxError: can't use starred expression here. However tuple(x for x in range(10)) works.Kobayashi
@RyanH. you need to put a comma at the end.Geriatrician
Just to add to the comments: this is just tuple unpacking. Since the tuple contains a generator expression, the <genexpr> gets expanded/unpacked into it. Because a tuple can be written without parentheses, with just a comma, it works.Lavonnelaw
Example: *range(10), Note the comma at the end, which indicates that it is a tuple.Lavonnelaw
Note: tuple(x for x in range(10)) works because the inner expression is a generator expression, which first returns an iterator. That iterator is the single argument to tuple, so the constructor iterates over the <genexpr> items to build the final tuple. With tuple(*range(10)), however, the range is unpacked and the tuple constructor receives 10 arguments, which it rejects with: tuple expected at most 1 arguments, got 10. So the correct syntax for the constructor is tuple(<genexpr>), which is also why tuple(range(10)) works.Lavonnelaw
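The failure modes mentioned in these comments can be checked with a small sketch. It uses compile() so that the syntax error can be caught at run time; the variable names are illustrative.

```python
# Without the trailing comma there is no tuple display, so the star is a syntax error.
try:
    compile("*(x for x in range(10))", "<example>", "exec")
    bare_star = "compiled"
except SyntaxError:
    bare_star = "SyntaxError"

# tuple() accepts at most one (iterable) argument, so unpacking into the call fails.
try:
    tuple(*range(10))
    star_call = "ok"
except TypeError:
    star_call = "TypeError"

# Passing the iterable itself works.
from_range = tuple(range(10))

print(bare_star, star_call, from_range)
```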
L
79

As another poster, macm, mentioned in his answer, the fastest way to create a tuple from a comprehension-style expression is to wrap a list comprehension in the constructor: tuple([list comprehension]).


Performance Comparison

  • List comprehension:

      $ python3 -m timeit "a = [i for i in range(1000)]"
      10000 loops, best of 3: 27.4 usec per loop
    
  • Tuple from list comprehension:

      $ python3 -m timeit "a = tuple([i for i in range(1000)])"
      10000 loops, best of 3: 30.2 usec per loop
    
  • Tuple from generator:

      $ python3 -m timeit "a = tuple(i for i in range(1000))"
      10000 loops, best of 3: 50.4 usec per loop
    
  • Tuple from unpacking:

      $ python3 -m timeit "a = *(i for i in range(1000)),"
      10000 loops, best of 3: 52.7 usec per loop
    

My version of python:

$ python3 --version
Python 3.6.3

So, in these measurements, building the tuple from a list comprehension is the fastest option; prefer it when performance matters.
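If you want to repeat the comparison on your own interpreter, a minimal harness using the timeit module could look like the sketch below. The labels, the repeat counts, and the choice to report the minimum of several runs are my assumptions rather than part of the measurements above, and the numbers will differ between machines and Python versions.

```python
import timeit

N = 1000  # same input size as the measurements above

candidates = {
    "tuple(listcomp)": "tuple([i for i in range(N)])",
    "tuple(genexpr)": "tuple(i for i in range(N))",
    "unpacking": "(*(i for i in range(N)),)",
}

results = {}
for label, stmt in candidates.items():
    # repeat() returns one total per run; the minimum is the least noisy figure.
    best = min(timeit.repeat(stmt, globals={"N": N}, number=1000, repeat=5))
    results[label] = best / 1000  # seconds per single evaluation
    print(f"{label:16s} {results[label] * 1e6:8.2f} usec per call")
```

It is worth rerunning this with larger N as well, since the ranking is not guaranteed to stay the same for every size or interpreter release.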

Lexis answered 2/2, 2018 at 23:26 Comment(3)
Note: tuple of listcomp requires a peak memory usage based on the combined size of the final tuple and list. tuple of a genexpr, while slower, does mean you only pay for the final tuple, no temporary list (the genexpr itself occupying roughly fixed memory). Usually not meaningful, but it can be important when the sizes involved are huge.Mesonephros
Very informative. Tuple from a generator would not be the best choice in this case. I think tuple([i for i in range(1000)]) is the best in terms of readability and speed. Though ofc, not sure of the timings on smaller / bigger / different datasetsLiriodendron
When I compared tuple from list comprehension against tuple from generator on bigger data (roughly range(1_000_000)), the tuple from generator took less time. The difference is not large, but with bigger data you end up saving both memory and time.Sheen
E
37

A comprehension works by looping over items and adding them to a container, and a tuple cannot receive such assignments.

Once a tuple is created, it cannot be appended to, extended, or assigned to. The only way to modify a tuple is if one of its objects can itself be modified (that is, it is a mutable container), because the tuple only holds a reference to that object.

Also, a tuple has its own constructor, tuple(), which accepts any iterable. That means that to create a tuple, you could do:

tuple(i for i in (1,2,3))
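A quick sketch of the immutability rules described above (the values are arbitrary): the tuple itself rejects appends and item assignment, a mutable element inside it can still change, and "extending" really builds a new tuple.

```python
t = (1, [2, 3], 'x')

# No append method and no item assignment.
has_append = hasattr(t, 'append')
try:
    t[0] = 99
    assigned = True
except TypeError:
    assigned = False

# The tuple only holds references, so a mutable element can still change in place.
t[1].append(4)

# "Extending" a tuple builds a new object and leaves the old one alone.
longer = t + (5,)

print(has_append, assigned, t, longer)
```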
Exorbitance answered 5/6, 2013 at 12:47 Comment(5)
In some ways I agree (about it not being necessary because a list will do), but in other ways I disagree (about the reasoning being because it's immutable). In some ways, it makes more sense to have a comprehension for immutable objects. who does lst = [x for x in ...]; x.append()?Legist
@Legist I am not sure how that relates to what I said?Exorbitance
@Legist If a tuple is immutable, that means the underlying implementation cannot "generate" a tuple ("generation" implying building one piece at a time). Immutable means you can't build the one with 4 pieces by altering the one with 3 pieces. Instead, you implement tuple "generation" by building a list, something designed for generation, then build the tuple as a last step, and discard the list. The language reflects this reality. Think of tuples as C structs.Epilimnion
although it would be reasonable for the syntactic sugar of comprehensions to work for tuples, since you cannot use the tuple until the comprehension is returned. Effectively it does not act like mutable, rather a tuple comprehension could behave much like string appending.Eeg
What you do when (1,2,3) isn't easy enough.Monger
L
16

My best guess is that they ran out of brackets and didn't think it would be useful enough to warrant adding an "ugly" syntax ...

Legist answered 5/6, 2013 at 12:45 Comment(12)
Angle brackets unused.Eeg
@Eeg -- Not completely. They're used for comparison operators. It could probably still be done without ambiguity, but maybe not worth the effort ...Legist
No, that is very different. The language grammar is very specific about those as tokens. They have a very clear semantic and lexical scope that would be unambiguous (or nearly enough to make it work with minor changes) if also applied to new tokens like exist for the other brackets. Parens are already used to delineate scope of lots of different things, so it should be very very doable. Were it only that somebody decided to. Honestly, sets also need some literal love and comprehensions should get sigils.Eeg
@Eeg Worth noting that {*()}, though ugly, works as an empty set literal!Superelevation
Absolutely HIDEOUS. I like it, but only because it is sick obscure stuff. Nobody should ever use that. Awesome.Eeg
What version of python is that supposed to work in ?Eeg
@M.I.Wright, I'd call that the Cyclops (sideways). Does it have a name?Moselle
@QuantumMechanic Nope, no common name -- likely because it's not often used (and shouldn't be at all used!). From a purely-aesthetic standpoint, though, I admit I'm somewhat partial now to {*''}Superelevation
Ugh. From an aesthetic standpoint, I think I'm partial to set() :)Legist
@QuantumMechanic: I came up with {*()} almost immediately after PEP 448 came out, and I've been calling it the one-eyed monkey operator. I doubt it's the only name people have come up with.Mesonephros
@ShadowRanger: It turns out that these all evaluate to the empty set: {*''}, {*""}, {*()}, {*[]}, {*{}}. So inadvertently, TIMTOWTDI. I guess I like {*[]} for its appearance as a posh Letterbox in Kent.Moselle
@QuantumMechanic: Yeah, that's the point; the unpacking generalizations made the empty "set literal" possible. Note that {*[]} is strictly inferior to the other options; the empty string and empty tuple, being immutable, are singletons, so no temporary is needed to construct the empty set. By contrast, the empty list is not a singleton, so you actually have to construct it, use it to build the set, then destroy it, losing whatever trivial performance advantage the one-eyed monkey operator provides.Mesonephros
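For anyone who wants to check the empty-set spellings from this thread, a short sketch (Python 3.5 or later, since it relies on the unpacking generalizations) might be:

```python
# Each of these unpacks an empty iterable into a set display.
candidates = [{*()}, {*''}, {*[]}, {*{}}]

all_empty_sets = all(type(s) is set and len(s) == 0 for s in candidates)

# Plain {} is still an empty dict, which is why a spelling like set() is needed at all.
empty_braces_type = type({}).__name__

print(all_empty_sets, empty_braces_type)
```

As the comments say, set() remains the readable choice; this only confirms that the other forms work.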
S
14

Tuples cannot efficiently be appended like a list.

So a tuple comprehension would need to use a list internally and then convert to a tuple.

That would be the same as what you do now : tuple( [ comprehension ] )

Sawmill answered 15/7, 2015 at 21:33 Comment(0)
Y
3

Parentheses do not create a tuple; that is, one = (two) is not a tuple. The only ways around this are one = (two,) or one = tuple(two). So a solution is:

tuple(i for i in myothertupleorlistordict) 
Yeti answered 3/8, 2017 at 10:23 Comment(1)
one = (two,) and one = tuple(two) do not evaluate to the same value. The argument to tuple must be an iterable. one = (two,) is equivalent to one = tuple((two,)) and one = tuple([two]), whereas one = tuple(two) is equivalent to one = tuple(i for i in two).Lipscomb
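The difference between wrapping a value and converting an iterable is easy to see with a list as the input. This is only a sketch; the name two follows the answer, and the other names are illustrative.

```python
two = [1, 2]

wrapped = (two,)        # one-element tuple containing the list itself
converted = tuple(two)  # tuple made from the list's items

# Equivalent spellings of each form.
same_as_wrapped = tuple([two])
same_as_converted = tuple(i for i in two)

# A non-iterable argument is rejected by the constructor.
try:
    tuple(5)
    int_ok = True
except TypeError:
    int_ok = False

print(wrapped, converted, int_ok)
```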
L
1

I believe it's simply for the sake of clarity: we do not want to clutter the language with too many different symbols. Also, a tuple comprehension is never necessary; a list can be used instead with negligible speed differences, unlike a dict comprehension as opposed to a list comprehension.

Liriodendron answered 5/6, 2013 at 12:47 Comment(2)
"Also a tuple comprehension is never necessary, a list can just be used instead with negligible speed differences" Calling C++ libraries with a list instead of a tuple may return an error. However it's not that difficult to convert the list into a tuple by tuple(list)Boulevardier
@Boulevardier That appears to be the best option you can choose from here https://mcmap.net/q/73636/-why-is-there-no-tuple-comprehension-in-python based on timingLiriodendron
N
0

On my Python (3.5), using a generator with deque from collections is slightly quicker than using a list comprehension:

>>> import timeit
>>> from collections import deque
>>> timeit.timeit(lambda: tuple([i for i in range(10000000)]),number=10)
9.294099200000005
>>> timeit.timeit(lambda: tuple(deque((i for i in range(10000000)))),number=10)
9.007653800000014
Nitrosamine answered 24/5, 2021 at 19:42 Comment(4)
I did not see the speed advantage. Did you try to repeat the timing multiple times? The results can vary considerably on repeated runs.Outguard
I just checked on Python 3.5 and I could reproduce it. But this might be different for other Python versions. It seems somewhat plausible because deque does not need the index-related overhead a list has.Nitrosamine
Just for information: in Python 3.10.4 I am getting values around 6 for the list variant, 8 for the deque variant. The variation between individual runs is bigger than the 0.3 seconds difference between your results. I am running it inside WSL2 and the virtualization can possibly cause the large variation.Outguard
I rechecked now with Python 3.9.2 on the same computer as before and I got 5.848282200000003 for the first case and 6.6902867000000015 for the second. I guess the list implementation has improved more than the deque one. This means my statement is only valid for older Python versions.Nitrosamine
G
0

Because you cannot append items to a tuple. This is how a simple list comprehension can be converted into more basic Python code.

_list = [1,2,3,4,5]
clist = [ i*i for i in _list ]
print(clist)

clist1 = []
for i in _list:
    clist1.append(i*i)
print(clist1)

Now, using a tuple comprehension for the above example would mean appending items to a tuple, which is not allowed. You can, however, convert this list to a tuple once it is ready by using tuple(clist1).
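Continuing that example, the sketch below shows the workable routes side by side: accumulate in a list and freeze it at the end, pass a generator expression to tuple(), or "grow" a tuple by concatenation, which allocates a new tuple on every pass. The names squares, ctuple, ctuple_short, and grown are made up for the illustration.

```python
_list = [1, 2, 3, 4, 5]

# Accumulate in a list, which supports append...
squares = []
for i in _list:
    squares.append(i * i)

# ...then freeze the result once it is complete.
ctuple = tuple(squares)

# The one-line spelling with a generator expression gives the same tuple.
ctuple_short = tuple(i * i for i in _list)

# Growing a tuple directly is possible, but each step builds a brand-new tuple.
grown = ()
for i in _list:
    grown = grown + (i * i,)

print(ctuple, ctuple_short, grown)
```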

Giddens answered 20/11, 2021 at 18:6 Comment(0)
L
-1

Well there is tuple comprehension in python3 now. You can follow below code snippet.

(k*k for k in range(1,n+1)) 

it will return a generator object comprehension.

Landscape answered 10/9, 2022 at 16:37 Comment(1)
That is not correct. This does not result in a tuple, but in a generator. If you try this type( (k*k for k in range(1,11)) ), you will see that it returns <class 'generator'>Sjoberg
M
-5

We can generate tuples inside a list comprehension. The following one groups consecutive numbers into pairs and gives a list of tuples covering the numbers 0-9.

>>> k = list(range(100))
>>> print(k)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
>>> r = [tuple(k[i:i+2]) for i in range(0, 10, 2)]
>>> print(r)
[(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]
Munos answered 3/6, 2016 at 0:16 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.