It is my understanding that the range()
function, which is actually an object type in Python 3, generates its contents on the fly, similar to a generator.
This being the case, I would have expected the following line to take an inordinate amount of time because, in order to determine whether 1 quadrillion is in the range, a quadrillion values would have to be generated:
1_000_000_000_000_000 in range(1_000_000_000_000_001)
Furthermore: it seems that no matter how many zeroes I add on, the calculation more or less takes the same amount of time (basically instantaneous).
I have also tried things like this, but the calculation is still almost instant:
# count by tens
1_000_000_000_000_000_000_000 in range(0,1_000_000_000_000_000_000_001,10)
If I try to implement my own range function, the result is not so nice!
def my_crappy_range(N):
i = 0
while i < N:
yield i
i += 1
return
What is the range()
object doing under the hood that makes it so fast?
Martijn Pieters's answer was chosen for its completeness, but also see abarnert's first answer for a good discussion of what it means for range
to be a full-fledged sequence in Python 3, and some information/warning regarding potential inconsistency for __contains__
function optimization across Python implementations. abarnert's other answer goes into some more detail and provides links for those interested in the history behind the optimization in Python 3 (and lack of optimization of xrange
in Python 2). Answers by poke and by wim provide the relevant C source code and explanations for those who are interested.
bool
orlong
type, with other object types it will go crazy. Try with:100000000000000.0 in range(1000000000000001)
– Densmore__contains__
method. – Curium2 in r
in Python 3 (but not Python 2), someone challenged me to find that in the docs, and it's not there, and when I asked (on python-ideas or the bug tracker? I forget…) whether it should be guaranteed, nobody seemed to have much interest in answering one way or the other until there was another Python 3 implementor to talk to. – Curiumxrange
the same as Python3range
? – Terrrange
(and presumablyxrange
) has never been specced anywhere, so the specific details - including how__contains__
,__item__
, etc, are calculated, is up to the different language implementors? So it's possiblexrange
was implemented differently in the past. I'm just guessing here. – Infeasiblerange.__contains__
optimization wasn't backported to 2.6xrange.__contains__
(the 2.x C API hadsq_contains
in thetp_as_sequence
struct, etc., so there's no reason it couldn't have been…), but for that, you'd probably have to track down the hg change where the optimization was added to the py3k branch, and see if there's a corresponding bug and/or thread… – Curiumxrange()
objects have no__contains__
method, so the item check has to loop through all the items. Plus there are few other changes inrange()
, like it supports slicing(which again returns arange
object) and now also hascount
andindex
methods to make it compatible withcollections.Sequence
ABC. – Densmorerange.__contains__
method implemented in pure Python – Punctuatein
keyword has (at least) two different meanings in Python – Hastin
can be a signal for starting iteration, or it can be a signal for testing membership. – Infeasible1_000_000_000_000_000 in list(range(1_000_000_000_000_001))
causes memory error, obviously. – Chemise