How does the Python interpreter work with dynamic typing?

I read this question, but it didn't give me a clear answer: How does Python interpreter look for types?

How does the Python interpreter know the type of a variable? I'm not asking how to get the type; I'm asking what happens behind the scenes. In the example below, how does it associate the class int or str with my variable?

How does it know that this is an int:

>>> i = 123
>>> type(i) 
<class 'int'>

or that this is a string:

>>> i = "123"
>>> type(i)
<class 'str'>
European answered 14/7, 2016 at 6:21 Comment(4)
@GreenAsJade: The OP is using Python 3, where the representation for type objects uses 'class', not 'type'; this was done to reflect that C-defined types are just classes too. - Mavis
@MartijnPieters maybe needs a python3 tag then? - Allopath
@GreenAsJade: no, the answer is the same in Python 2 and 3. But the output provided did not need correcting. - Mavis
Ah OK. But the string example did :) - Allopath

how does it associate the class int or string to my variable

Python doesn't. Variables have no type. Only the object that a variable references has a type. Variables are simply names pointing to objects.

For example, the following also shows the type of an object, but no variable is involved:

>>> type(1)
<class 'int'>
>>> type('foobar')
<class 'str'>

When you use type(variable), the variable part of the expression simply evaluates to the object that the name references, and that object is passed to the type() function. When using 1 or 'foobar', the expression is a literal producing the object, which is then passed to the type() function.
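A small sketch of that distinction (the name value is hypothetical, used only for illustration): the same name can refer to objects of different types over time, and type() only ever sees the object it is handed.

```python
# `value` is a hypothetical name used only for illustration.
value = 123
type_before = type(value)   # type() receives the int object, not the name

value = "123"               # rebind the same name to a str object
type_after = type(value)

# A literal needs no name at all: the expression produces the object directly.
literal_type = type(123)

print(type_before, type_after, literal_type)
```

The name never carried a type; only the objects it referred to did.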

Python objects are simply data structures in the interpreter's memory; in CPython, C structs are used. Variables are merely references (pointers) to those structures. The basic type struct in CPython is called PyObject, and this struct has an ob_type slot that tells Python what type something is. Types are simply more C structures.

If you wanted to follow along in the CPython source code, you'd start at the bltinmodule.c source code (since type is a built-in name), which defines type as the PyType_Type structure. Calling a type (type is a type too) invokes its tp_new function, and PyType_Type defines that as the type_new function. This function handles calls with one argument as follows:

/* Special case: type(x) should return x->ob_type */
{
    const Py_ssize_t nargs = PyTuple_GET_SIZE(args);
    const Py_ssize_t nkwds = kwds == NULL ? 0 : PyDict_Size(kwds);

    if (PyType_CheckExact(metatype) && nargs == 1 && nkwds == 0) {
        PyObject *x = PyTuple_GET_ITEM(args, 0);
        Py_INCREF(Py_TYPE(x));
        return (PyObject *) Py_TYPE(x);
    }

Here x is the PyObject object you passed in; note, not a variable, but an object! So for your 1 integer object or 'foobar' string object, the Py_TYPE() macro result is returned. Py_TYPE is a macro that simply returns the ob_type value of any PyObject struct.
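The same link can be followed from Python without reading any C. This is only a sketch, and type_chain is a hypothetical helper name, not part of CPython: it repeatedly asks for the type of an object, which is the Python-level view of reading ob_type, until it reaches type itself.

```python
def type_chain(obj):
    """Follow type() links from obj until type(x) is x (only true for `type`)."""
    chain = []
    current = obj
    while True:
        current_type = type(current)
        if current_type is current:   # type(type) is type, so stop here
            break
        chain.append(current_type)
        current = current_type
    return chain

print(type_chain(1))         # the int object -> int -> type
print(type_chain("foobar"))  # the str object -> str -> type
```

Type objects are ordinary objects too, so they also have a type; the chain ends because type is its own type.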

So now you have the type object for either 1 or 'foobar'; how come you see <class 'int'> or <class 'str'> in your interpreter session? The Python interactive interpreter automatically uses the repr() function on any expression results. In the C structure for PyType_Type definitions the PyType_Type struct is incorporated so all the slots for that type are directly available; I'll omit here exactly how that works. For type objects, using repr() means the type_repr function is called which returns this:

rtn = PyUnicode_FromFormat("<class '%s'>", type->tp_name);

So in the end, type(1) gets the ->ob_type slot (which turns out to be the PyLong_Type struct in Python 3, long story), and that structure has a tp_name slot set to "int".
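To see the formatting step from the Python side, here is a rough sketch that rebuilds the repr() text from the name attributes a class exposes. Both describe and Widget are hypothetical names for illustration, and the module prefix handling reflects how current CPython versions print non-built-in classes; it is not a copy of type_repr.

```python
class Widget:   # hypothetical class, for illustration only
    pass

def describe(cls):
    """Rebuild the text repr() shows for a class from its name attributes."""
    if cls.__module__ == "builtins":
        # Built-in types are shown without a module prefix.
        return f"<class '{cls.__qualname__}'>"
    return f"<class '{cls.__module__}.{cls.__qualname__}'>"

print(repr(type(1)), describe(type(1)))
print(repr(Widget), describe(Widget))
```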

TL;DR: Python variables have no type; they are simply pointers to objects. Objects have types, and the Python interpreter will follow a series of indirect references to reach the type name to print if you are echoing the object in your interpreter.

Mavis answered 14/7, 2016 at 6:38 Comment(3)
Thank you for your answer Martijn. Just to clarify, at what point is type_repr called? After return (PyObject *) Py_TYPE(x);? - Cyprus
@user51462: when you use repr(object), the object is introspected to see if there is a hook to implement representation, and that leads to type_repr. The return (PyObject *) Py_TYPE(x); part is not involved, that's what is used when you call type(object). - Mavis
Ah I see, so it's not at all part of type(object). Thank you so much for your prompt reply Martijn, I've been stuck on this. - Cyprus

Python variables have no type; they are just references to objects. The size of a reference is the same regardless of what it is referring to. In the C implementation of Python a reference is a pointer, and that pointer does have a type: it is a pointer to a Python object, PyObject *. The pointer has the same type regardless of the class of the object. Objects, on the other hand, know which class they belong to.
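One way to observe the fixed reference size, assuming CPython (sys.getsizeof reports implementation-specific numbers): a one-element list stores a single reference, so its size does not depend on how big the referenced object is. The names below are hypothetical.

```python
import sys

small = 1
large = "x" * 10_000

# Each list holds exactly one reference; the referenced objects differ greatly in size.
holds_small = [small]
holds_large = [large]

print(sys.getsizeof(small), sys.getsizeof(large))              # object sizes differ
print(sys.getsizeof(holds_small), sys.getsizeof(holds_large))  # container sizes match
```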

It has been argued that Python has no variables, only names, although that's a step too far for most people.

Objects in the CPython implementation have an id (identifier), which is actually a virtual address. The detail and value of this address are not worth pursuing: it can (and probably will) change between versions and is not meant to be used for anything other than a unique number identifying the object. Nevertheless it can provide interesting pointers (pardon the pun) to what is happening:

>>> x = 42
>>> y = x
>>> id(x)
4297539264
>>> id(y)
4297539264

Note that the ids (addresses) of x and y are the same: they are referencing the same object, an int with the value 42. So what happens when we rebind x? Does y change as well?

>>> x = "hello"
>>> id(x)
4324832176
>>> id(y)
4297539264

Thankfully not. Now x is just referring to a new object of class str with the value "hello".

Now watch what happens when we rebind y:

>>> id(y)
4297539264
>>> y = 37
>>> id(y)
4297539104 

The id of y changed! This is because y is now referencing a different object. Assignment only rebinds the name, and ints are immutable anyway, so y = 37 did not change the original object (42); it made y refer to another object. The object with the value 42 has its reference count decremented and can now (in theory) be deleted. In practice it would probably remain in memory for efficiency reasons, but that's an implementation detail.

However, let's try this for a list:

>>> a = [1,2,3,4]
>>> b = a
>>> id(a)
4324804808
>>> id(b)
4324804808
>>> a[0] = 99
>>> b
[99, 2, 3, 4]

So changing the list a has changed b! This is because lists in Python (unlike, say, in R) are mutable, so they can change in place. The assignment b = a only copied the reference and thus saved memory (no data was actually copied). Dictionaries are another type with this behavior. See the copy module in the standard library if you need an independent copy.
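A short sketch of the difference between sharing a reference and copying, using copy.copy and copy.deepcopy from the standard library (the variable names are only illustrative):

```python
import copy

a = [1, [2, 3]]
alias = a                  # same list object, no data copied
shallow = copy.copy(a)     # new outer list, inner list still shared
deep = copy.deepcopy(a)    # fully independent copy

a[0] = 99                  # visible through alias only
a[1].append(4)             # visible through alias and shallow

print(alias)     # [99, [2, 3, 4]]
print(shallow)   # [1, [2, 3, 4]]
print(deep)      # [1, [2, 3]]
```

A shallow copy is usually enough for flat lists; nested mutable objects are where deepcopy matters.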

Salute answered 14/7, 2016 at 7:8 Comment(0)

The concept "type" of a variable is "implemented" by using objects of a specific class.

So in

a=float()

an object of type float, as defined by the class float, is returned by float(). Python knows what type it is because that's how objects work: each object knows what type it is. a now refers to a float object with value 0.0.

With built-ins it's the same; they just have literal shorthands for creating them.

i=123

gives the same result as

i=int(123)

int(123) returns an object of class int, with value 123.

Similarly,

i="123"

gives the same result as

i=str("123")

str("123") returns an object of class str, with value "123".
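A quick check of that equivalence, as a sketch: the literal and the constructor call give equal objects of the same class, although the literal does not go through the name int at all. The function shadowed and its local variable are hypothetical, added only to show that.

```python
from_literal = 123
from_call = int(123)

def shadowed():
    int = None    # hypothetical local name that hides the built-in int
    return 123    # the literal still produces an int object

default_float = float()   # called with no argument, float() gives 0.0

print(from_literal == from_call, type(from_literal) is type(from_call))
print(type(shadowed()), default_float)
```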

Allopath answered 14/7, 2016 at 6:26 Comment(0)
