Variables are labels, not boxes. Take the following code for example:

a = [1, 2, 3]
b = a
a.append(4)
print(b)  # [1, 2, 3, 4]: a and b are two labels on the same list

If you imagine that variables are like boxes, you cannot make sense of assignment in Python. For an assignment, you must always read the right-hand side first: that’s where the object is created or retrieved. After that, the variable on the left is bound to the object, like a label stuck to it. Just forget about the boxes.
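To see that the right-hand side really is evaluated before the name is bound, here is a minimal sketch. The Gizmo class and the names widget and gadget are hypothetical, introduced only for illustration.

```python
class Gizmo:
    """Hypothetical class that records every instance that gets created."""
    created = []

    def __init__(self):
        Gizmo.created.append(id(self))


widget = Gizmo()  # the Gizmo is created first, then the label widget is attached

try:
    gadget = Gizmo() * 10  # a second Gizmo is created, then * raises TypeError
except TypeError:
    pass

try:
    gadget
    gadget_was_bound = True
except NameError:
    gadget_was_bound = False

print(len(Gizmo.created))  # 2: both objects were created
print(gadget_was_bound)    # False: the label was never attached
```

The second object came into existence even though the assignment never completed, which is hard to explain with boxes but natural with labels.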

Since variables are mere labels, nothing prevents an object from having several labels assigned to it. When that happens, you have aliasing.

Identity, equality and aliases

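charles = {'name': 'Charles', 'born': 1932}
levis = charles                   # levis is an alias for charles
print(levis is charles)           # True
print(id(levis), id(charles))     # the same id, printed twice
levis['balance'] = 950
print(charles)                    # {'name': 'Charles', 'born': 1932, 'balance': 950}

alex = {'name': 'Charles', 'born': 1932, 'balance': 950}
print(alex == charles)            # True: equal contents
print(alex is not charles)        # True: distinct objects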

charles and levis refer to the same object, while alex is bound to a separate object with equal contents. We say that levis and charles are aliases.

An object's id is an integer that is guaranteed to be unique among objects existing at the same time, and it never changes during the life of the object. In practice, we rarely call the id() function. Identity checks are most often done with the is operator rather than by comparing ids.

The == operator compares the values of objects and appears more frequently than is in Python code.

The is operator is faster than ==, because it cannot be overloaded, so Python does not have to find and invoke special methods to evaluate it.
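The difference between the two operators can be sketched with a class that overloads ==. The AlwaysEqual class below is hypothetical and deliberately silly: its __eq__ claims equality with everything, yet is still reports the truth about identity.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with any object."""

    def __eq__(self, other):
        return True


first = AlwaysEqual()
second = AlwaysEqual()

print(first == second)  # True: == dispatches to __eq__
print(first is second)  # False: is compares identities only
print(first == None)    # True: __eq__ is invoked here as well
print(first is None)    # False: is cannot be fooled
```

This is also one reason why comparisons with None are usually written with is rather than ==.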

Are tuples immutable?

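Tuples, like most Python collections (lists, dicts, sets, etc.), hold references to objects. If the referenced items are mutable, they may change even if the tuple itself does not. In other words, the immutability of a tuple applies to the references it holds, not to the objects they refer to.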

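t1 = (1, 2, [3, 4])
t2 = (1, 2, [3, 4])
print(t1 == t2)        # True

print(id(t1[-1]))      # id of the list held by t1
t1[-1].append(5)       # mutate the list in place
print(t1)              # (1, 2, [3, 4, 5])
print(id(t1[-1]))      # same id as before: the tuple still holds the same list

print(t1 == t2)        # False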

The distinction between equality and identity has further implications when you need to copy an object. A copy is an equal object with a different id. But if an object contains other objects, it becomes more complicated.

Copies are shallow by default

The easiest way to copy a list is to use the built-in constructor for the type itself.

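l1 = [3, [4, 5], [6, 7, 8]]
l2 = list(l1)      # l2 = l1[:] also makes a shallow copy
print(l2)          # [3, [4, 5], [6, 7, 8]]
print(l1 == l2)    # True: equal contents
print(l1 is l2)    # False: distinct list objects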

However, using the constructor or [:] produces a shallow copy: the outermost container is duplicated, but the copy is filled with references to the same items held by the original container.
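The sharing can be checked directly with the is operator. This is a small sketch; the variable names are chosen here for illustration.

```python
original = [3, [4, 5], [6, 7, 8]]
by_constructor = list(original)
by_slice = original[:]

# The outer lists are distinct objects...
print(by_constructor is original)        # False
print(by_slice is original)              # False

# ...but the items inside them are the very same objects.
print(by_constructor[1] is original[1])  # True
print(by_slice[2] is original[2])        # True
```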

If there are mutable items, this can lead to unpleasant surprises.

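l1 = [3, [4, 5], [6, 7, 8]]
l2 = list(l1)           # shallow copy: the inner lists are shared
l1.append(9)            # affects only l1's outer list
l1[1].remove(5)         # mutates an inner list shared with l2
print('l1 ->', l1)      # l1 -> [3, [4], [6, 7, 8], 9]
print('l2 ->', l2)      # l2 -> [3, [4], [6, 7, 8]]

l2[1] += [22, 33]       # += on a list works in place, so l1 sees it too
l2[2] += [44, 55]
print('l1 ->', l1)      # l1 -> [3, [4, 22, 33], [6, 7, 8, 44, 55], 9]
print('l2 ->', l2)      # l2 -> [3, [4, 22, 33], [6, 7, 8, 44, 55]]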

Deep copies of arbitrary objects

Sometimes we need to make deep copies, i.e. duplicates that do not share references to embedded objects.

The copy module provides the deepcopy and copy functions, which return deep and shallow copies of arbitrary objects, respectively.

import copy


class Bus:
    def __init__(self, passengers=None):
        if passengers is None:
            self.passengers = []
        else:
            self.passengers = list(passengers)

    def pick(self, name):
        self.passengers.append(name)

    def drop(self, name):
        self.passengers.remove(name)


bus1 = Bus(['Alice', 'Bill', 'Claire', 'David'])
bus2 = copy.copy(bus1)        # shallow copy: shares the passengers list
bus3 = copy.deepcopy(bus1)    # deep copy: gets its own passengers list
print(id(bus1), id(bus2), id(bus3))    # three distinct ids

bus1.drop('Bill')
print(bus2.passengers)    # ['Alice', 'Claire', 'David']: Bill is gone here too
print(bus3.passengers)    # ['Alice', 'Bill', 'Claire', 'David']
# The first two ids are equal; the third is different.
print(id(bus1.passengers), id(bus2.passengers), id(bus3.passengers))

If you want to control the behavior of both copy and deepcopy, implement the __copy__() and __deepcopy__() special methods.
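As a rough sketch of what that can look like, the hypothetical Route class below gives its shallow copies their own outer list of stops, and makes deep copies duplicate the stops while still sharing one large city_map object. The class and its attributes are assumptions made for illustration only.

```python
import copy


class Route:
    """Hypothetical class: stops belong to a route, city_map is shared."""

    def __init__(self, stops, city_map):
        self.stops = stops
        self.city_map = city_map

    def __copy__(self):
        # Called by copy.copy(): new Route with its own outer list of stops.
        return Route(list(self.stops), self.city_map)

    def __deepcopy__(self, memo):
        # Called by copy.deepcopy(): register the new object in memo first,
        # then deep-copy the stops but keep sharing the city_map.
        new = Route.__new__(Route)
        memo[id(self)] = new
        new.stops = copy.deepcopy(self.stops, memo)
        new.city_map = self.city_map
        return new


route = Route([['A', 'B'], ['C']], {'name': 'big shared map'})
shallow = copy.copy(route)
deep = copy.deepcopy(route)

print(shallow.stops is route.stops)        # False: outer list was duplicated
print(shallow.stops[0] is route.stops[0])  # True: inner lists are shared
print(deep.stops[0] is route.stops[0])     # False: inner lists were duplicated
print(deep.city_map is route.city_map)     # True: shared on purpose
```

The memo dictionary is how deepcopy keeps track of objects it has already copied, so passing it along to nested deepcopy calls matters when objects refer to each other.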