(Yet Another) List Aliasing Conundrum
Asked Answered
P

6

5

I thought I had the whole list alias thing figured out, but then I came across this:

    l = [1, 2, 3, 4]
    for i in l:
        i = 0
    print(l)

which results in:

    [1, 2, 3, 4]

So far so good.

However, when I tried this:

    l = [[1, 2], [3, 4], [5, 6]]
    for i in l:
        i[0] = 0

I get

    [[0, 2], [0, 4], [0, 5]]

Why is this?

Does this have to do with how deep aliasing goes?

Phyto answered 25/1, 2012 at 3:45 Comment(5)
there is no such thing as "aliasing" in pythonPeace
@user102008: If I define a function f, and then set some variable to equal f (i.e. x = f). I can then call f by using x(). That sure sounds like aliasing to me. If I understand correctly, using the assignment x = f makes x refers to f, it doesn't create an instance of f?Phyto
correct. f is created at def time. x = f just points another name at f.Belorussia
But that's not aliasing, that's just copying the reference. Rebinding f doesn't affect x after the fact.Roush
It's strange how "yet another" is usually "the same one as always", in disguise.Rhodarhodamine
B
6

i = 0 is very different from i[0] = 0.

Ignacio has explained the reasons, succinctly and correctly. So I'll just try to explain in a more simple words what's actually going on here.

In the first case, i is just a label pointing at some object (one of the members in your list). i = 0 changes the reference to some other object, so that i now references the integer 0. The list is unmodified, because you never asked to modify l[0] or any element of l, you only modified i.

In the second case, i is also just a name pointing at one of the members in your list. That part is no different. However, i[0] is now calling .__getitem__(0) on one of the list members. Similarly, i[0] = 'other' would be like doing i.__setitem__(0, 'other'). It is not simply pointing i at a different object, as a regular assignment statement would, actually it's mutating the object i.

An easy way to think of it is always that names in Python are just labels for objects. A scope or namespace is just like a dict mapping names to objects.

Belorussia answered 25/1, 2012 at 4:1 Comment(4)
So, if I understand this correctly, in this case i is treated differently based on whether it is indexed or not?Phyto
@Joel: i is never indexed. The object i is bound to is indexed.Roush
i is just a name in the namespace. i = 0 assigns that name i to a different object, and i[0] = 0 does not.Belorussia
@Joel: It's not that i is treated differently, but that i[0] = 0 is a different operation from i = 0. The first just makes i be a name for 0. The second makes the 0th slot in whatever i is currently a name for refer to 0. i[0] = 0 would have the same effect if i was bound to a number, except that numbers don't contain any "slots", so the operation throws an error.Botanize
R
9

The first rebinds the name. Rebinding a name changes only the local name. The second mutates the object. Mutating an object changes it everywhere it is referenced (since it's always the same object).

Roush answered 25/1, 2012 at 3:47 Comment(0)
B
6

i = 0 is very different from i[0] = 0.

Ignacio has explained the reasons, succinctly and correctly. So I'll just try to explain in a more simple words what's actually going on here.

In the first case, i is just a label pointing at some object (one of the members in your list). i = 0 changes the reference to some other object, so that i now references the integer 0. The list is unmodified, because you never asked to modify l[0] or any element of l, you only modified i.

In the second case, i is also just a name pointing at one of the members in your list. That part is no different. However, i[0] is now calling .__getitem__(0) on one of the list members. Similarly, i[0] = 'other' would be like doing i.__setitem__(0, 'other'). It is not simply pointing i at a different object, as a regular assignment statement would, actually it's mutating the object i.

An easy way to think of it is always that names in Python are just labels for objects. A scope or namespace is just like a dict mapping names to objects.

Belorussia answered 25/1, 2012 at 4:1 Comment(4)
So, if I understand this correctly, in this case i is treated differently based on whether it is indexed or not?Phyto
@Joel: i is never indexed. The object i is bound to is indexed.Roush
i is just a name in the namespace. i = 0 assigns that name i to a different object, and i[0] = 0 does not.Belorussia
@Joel: It's not that i is treated differently, but that i[0] = 0 is a different operation from i = 0. The first just makes i be a name for 0. The second makes the 0th slot in whatever i is currently a name for refer to 0. i[0] = 0 would have the same effect if i was bound to a number, except that numbers don't contain any "slots", so the operation throws an error.Botanize
R
2
for i in l:

This means "each time through the loop, i shall be a name for the next element of l".

i = 0

This means "i shall cease to be a name for whatever it's currently a name for, and begin to be a name for the integer object 0".

i[0] = 0

This means "The zeroth element of the thing that i names shall be replaced with 0". (You can't really say "i[0] shall cease to be a name for..." because it isn't a name.)

Rhodarhodamine answered 25/1, 2012 at 8:13 Comment(0)
B
1

Assignment is always just rebinding names in Python, which means you end up with aliases for the same object all over the place. Any time you give a name to any kind of object (you assign an existing object a new name, or you receive it passed into a function, or you pull it out of some container), anything you do to actually alter this object will affect it wherever it originally was (i.e. the caller who passed it into your function, or anybody else looking at the container you pulled it out of).

You can make your code explicitly take copies of things, if you need it to; list slicing with mylist[:] is probably the way you're most likely to be familiar with. Many built in operations do this; in particular it's usually a safe assumption with builtin functions/methods that if they return an object they haven't modified the originals (and this is a very good rule to follow most of the time; if your function or method has its effect by altering existing objects, it should return None and let the callers just look at the objects they gave you). In fact particularly for lists there are lots of pairs of methods/functions that do the same thing; usually there's a function that returns a new modified copy of the list and a method that returns nothing but alters the list. e.g. compare sorted_list = sorted(mylist) vs mylist.sort(), reversed_list = reversed(mylist) vs mylist.reverse(), etc.

But in the case when copying is going on, you do need to be careful about how deep the copy goes; in most cases it's only to the outer-most level, so mutating anything contained inside the copy will be visible from the original object.

New Python programmers need to get a handle on this as early as possible, as it pervades every single aspect of programming in Python.

Unfortunately, this issue is obscured by the most natural staring point for new programmers. You start by figuring out how to do basic manipulations of numbers and text strings, but those are immutable in Python. That doesn't mean they don't work "the same way" as lists, it just means there are no operations you can do that will cause them to change. So you don't need to care about whether they're being shared, because it won't make a difference even if they are.

Other answers have explained in more detail exactly what's going on in your example. But whenever you modify a list (or any other object), you need to think "where did this list come from?". Because most of the time, unless it's a new list that's just been created, other parts of your program will be able to see the changes you make. There is nearly always aliasing going on with list manipulations. Much of the time it doesn't matter, especially for lists of numbers or text. But you need to be aware of it all the time, so that you can decide whether or not it matters.

It does become second nature fairly easily though, so persevere and it'll get a lot easier. Having said that, even very experienced Python programmers still get caught by the occasional aliasing bug!

Botanize answered 25/1, 2012 at 4:28 Comment(2)
Is there a limit to how deep deepcopy goes?Phyto
deepcopy tries to go all the way; it is your antimatter-bomb in the war against aliasing. It can't work on absolutely every possible object through, and almost all of the time you can write your programs so that you don't have to deepcopy everything. But for any object you're likely to be able to construct as a beginner Python programmer, deepcopy will work fine, and will give you back an object with no relevant aliasing.Botanize
H
0

When you say i = 0 you assign a new value to the variable i.

When you say i[0] = 0 you modify the list in variable i, setting the first element to a new value. Since the list in i is an element of l, l ends up with modified elements.

Hermaphroditism answered 25/1, 2012 at 4:0 Comment(0)
A
0

Sometimes I find it easier to think in the less abstract world of C. In Python, think of every variable as being a pointer. When you do i = 0; i = 1, you're not doing this:

int * i = malloc(sizeof(int));
*i = 0;
*i = 1;

It's more like this:

int * i = malloc(sizeof(int));
*i = 0;
free(i);
i = malloc(sizeof(int));
*i = 1;

You're not changing the value, you're moving the pointer to a new value.

With a list though, the l pointer stays the same, but you're updating the value at the specified index. Thus l = [0]; l[0] = 1 looks like:

int * l[] = {0};
l[0] = 1;

(Note: I realize Python ints and lists are not really C ints and arrays, but for this purpose, they're similar.)

Protip: "alias" is not a Python term, so please avoid it. "Reference" or just "variable" are better.

Archduchess answered 26/1, 2012 at 17:11 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.