Internals for python tuples [duplicate]

2

11
>>> a=1
>>> b=1
>>> id(a)
140472563599848
>>> id(b)
140472563599848
>>> x=()
>>> y=()
>>> id(x)
4298207312
>>> id(y)
4298207312
>>> x1=(1)
>>> x2=(1)
>>> id(x1)
140472563599848
>>> id(x2)
140472563599848

Until this point I thought there would be only one copy of each immutable object, shared (pointed to) by all the variables.

But when I tried the steps below, I realized that I was wrong.

>>> x1=(1,5)
>>> y1=(1,5)
>>> id(x1)
4299267248
>>> id(y1)
4299267320

Can anyone please explain the internals?

Vtol answered 13/12, 2014 at 14:6 Comment(2)
Note, this is an implementation detail, not something that can be depended on. Building constants is one thing, while something dynamic like tuple(range(3)), evaluated several times, is not interned and you get separate objects. You only get interning when the compiler can figure out the value.Summons
x1=(1) does not make a tuple.Consume
13
>>> x1=(1)
>>> x2=(1)

is actually the same as

>>> x1=1
>>> x2=1

CPython caches small integers internally (currently -5 through 256), so they are not created in memory multiple times. That is why the ids of x1 and x2 are the same up to this point.
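A small sketch of the caching described above, assuming CPython. The integers are built with int() at run time so that the compiler cannot share them as constants; the variable names are made up for illustration.

```python
# Build the integers at run time so the compiler cannot share them as constants.
small_a = int("1")
small_b = int("1")
big_a = int("100000")
big_b = int("100000")

# Equal values always compare equal with ==.
print(small_a == small_b, big_a == big_b)

# Identity is an implementation detail: CPython reuses its cached small
# integers, while larger values are usually separate objects.
print(small_a is small_b)
print(big_a is big_b)
```

Because the identity results depend on the interpreter, compare values with == rather than is.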

A one-element tuple needs a comma at the end, like this

>>> x1=(1,)
>>> x2=(1,)

When you do this on separate lines in the shell, two new tuples are constructed, each with only one element. Even though the elements inside the tuples are the same, the tuples themselves are different objects. That is why they have different ids.
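To check the point about the trailing comma, here is a minimal sketch. The names are hypothetical, and the tuples in the second half are built at run time with tuple() so that no compile-time sharing can take place.

```python
not_a_tuple = (1)   # parentheses only group the expression
one_tuple = (1,)    # the trailing comma is what makes the tuple

print(type(not_a_tuple).__name__)
print(type(one_tuple).__name__)

# Tuples built at run time are separate objects even when they are equal.
first = tuple([1])
second = tuple([1])
print(first == second)   # compares the contents
print(first is second)   # compares object identity
```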

Let's take your last example and disassemble the code.

compiled_code = compile("x1 = (1, 5); y1 = (1, 5)", "string", "exec")

Now,

import dis
dis.dis(compiled_code)

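would produce something like this (output from the Python version in use at the time; newer versions lay out the constants differently)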

  1           0 LOAD_CONST               3 ((1, 5))
              3 STORE_NAME               0 (x1)
              6 LOAD_CONST               4 ((1, 5))
              9 STORE_NAME               1 (y1)
             12 LOAD_CONST               2 (None)
             15 RETURN_VALUE

It loads a constant value, referred to by index 3, which is (1, 5), and then stores it in x1. In the same way, it loads another constant value, at index 4, and stores it in y1. If we look at the list of constants in the code object,

print(compiled_code.co_consts)

will give

(1, 5, None, (1, 5), (1, 5))

The elements at positions 3 and 4 are the tuples we created in the actual code. So Python does not always create only one instance of every immutable object. It's an implementation detail that we don't have to worry much about anyway.
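Since the layout of co_consts varies between interpreter versions, the check can be repeated on whatever version is installed. The following is only a sketch; the "<example>" filename and the variable names are placeholders.

```python
source = "x1 = (1, 5); y1 = (1, 5)"
code = compile(source, "<example>", "exec")

# Collect the tuple constants stored in the code object.
tuple_consts = [c for c in code.co_consts if isinstance(c, tuple)]
print(len(tuple_consts), "tuple constant(s):", tuple_consts)

# Run the code and compare the two resulting objects.
namespace = {}
exec(code, namespace)
print(namespace["x1"] == namespace["y1"])  # equality holds regardless
print(namespace["x1"] is namespace["y1"])  # identity depends on the version
```

If the code object holds a single tuple constant, both names end up bound to the same object; if it holds two, they are distinct.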

Note: If you want only one instance of an immutable object, you can do it manually like this

x1 = (1, 5)
x2 = x1

Now, both x2 and x1 refer to the same tuple object.

Fredericfrederica answered 13/12, 2014 at 14:14 Comment(3)
@JonClements Included that in the answer now :-)Fredericfrederica
I'd like to mention that this behavior has actually changed (as of Python 3.9, although I'm not sure in which version exactly it changed). Now it seems that the Python compiler tries to reuse tuples for code that is "compiled together" (not sure how exactly that works), so if you enter x = (1, 5); y = (1, 5) (as one line) in the interactive shell, x is y will in fact be true. However, if you enter x = (1, 5) and y = (1, 5) as two lines in the interactive shell, then they will still have different ids (unlike small integers and strings).Sammiesammons
In my 3.10, compiled_code.co_consts is ((1, 5), None) - the tuple was interned. In fact, if you do x1=(1,), x2=(1,) in a function in the shell, or anywhere in a .py file, it'll be interned as a single tuple. I think that just highlights how this is just an implementation detail and can change whenever.Summons
2

As of Python 3.7, identical tuple literals built from constants are interned when they are compiled together.

t1 = (1, 2, 3)
t2 = (1, 2, 3)
print(t2 is t1)

The above code prints True on Python 3.7 and above when run as a script.

But in the Python shell it behaves differently: tuple literals entered on separate lines do not refer to a single object the way string literals do.

>>> t1 = (1, 2, 3)
>>> t2 = (1, 2, 3)
>>> t2 is t1
False
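The difference between a script and the shell can be approximated by compiling the two assignments together and then separately. This assumes the difference comes from how the source is split into compilation units. The helper name run_units is made up for this sketch.

```python
def run_units(*sources):
    """Compile each source string on its own and run them in one namespace."""
    namespace = {}
    for src in sources:
        exec(compile(src, "<example>", "exec"), namespace)
    return namespace

# One compilation unit, similar to a .py file.
together = run_units("t1 = (1, 2, 3)\nt2 = (1, 2, 3)")
# Two compilation units, similar to two separate shell lines.
separate = run_units("t1 = (1, 2, 3)", "t2 = (1, 2, 3)")

print("compiled together:", together["t2"] is together["t1"])
print("compiled separately:", separate["t2"] is separate["t1"])
```

Both results are implementation details and may differ between interpreter versions and builds, so code should not rely on either.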
Bouse answered 17/9, 2023 at 1:56 Comment(2)
It will depend on how it's compiled: do it in the shell/REPL and you'll get two different objects, because each line is compiled separately. Do it in a .py file or a function in the REPL, and they'll be the same.Summons
So, this specific behavior is probably due to the improvements to constant folding in 3.7, the replacement of the peephole bytecode optimizer with an optimizer that works on the AST instead.Graben
