Mutability and Immutability in Python

Understanding Mutability and Immutability in Python
Introduction
When learning new things, it helps if you can connect what you are learning to things that you already know. You appreciate what you are learning on a deeper level, and the process is more satisfying than remembering arbitrary propositions.
In programming, many things may initially seem arbitrary. It is only when you understand on a deeper level why those seemingly arbitrary rules and standards exist that they start to make sense.
One concept that I think offers real ‘bang for your buck’ in understanding how Python works is mutability vs immutability. In this article, I’ll discuss this seemingly simple topic, and explain how understanding it can make other aspects of the language more intuitive.
First, let’s define what mutability actually is (which is not as simple as it might seem).
What is Mutability?
Maybe you have heard of these concepts before, and think that mutability simply means the ability of an object to change its value. While this is technically correct, there is some nuance.
When defining what mutability/immutability is, where better to look than the official Python documentation itself, which states:
The value of some objects can change. Objects whose value can change are said to be mutable; objects whose value is unchangeable once they are created are called immutable. (The value of an immutable container object that contains a reference to a mutable object can change when the latter’s value is changed; however the container is still considered immutable, because the collection of objects it contains cannot be changed. So, immutability is not strictly the same as having an unchangeable value, it is more subtle.) An object’s mutability is determined by its type; for instance, numbers, strings and tuples are immutable, while dictionaries and lists are mutable.
As expected, the core issue is whether objects can change their value.
Before moving on to exploring the latter part of the definition from Python’s documentation and its implications, let’s take a brief detour to outline some properties of objects in Python.
The Three Properties of all Python Objects
Every object in Python has these three properties:
- Identity
- Type
- Value
The identity of an object is what distinguishes it from every other object. That is to say, object a is the same object as object b if and only if they have the same identity, which is what the is operator checks. You can get the identity of any object in Python using the built-in id() function; the result is unique among objects that exist at the same time. (In CPython, this returns the memory address of the object.)
The type of an object defines what kind of object it is, which determines its behavior and the methods it supports. You can check the type of any object in Python using the built-in type() function.
Lastly, while the ‘value’ of an object seems obvious, it is actually a subtle topic (which I’ll discuss more later), but for now we can think of it in the everyday sense. For example, after the statement a = 5, the name a refers to an object of type int with a value of 5.
Now, given this, the definition of mutability makes a lot of sense: for an object with a given identity, mutability is the ability of that object’s value to change while its identity stays the same. Moreover, mutability is a property of the type: some types are mutable (for instance, lists and dictionaries), while others are not (strings, tuples).
Makes sense.
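To make the three properties concrete, here is a minimal sketch. The variable names are only illustrative; it uses nothing beyond the built-ins id(), type() and the is and == operators mentioned above.

```python
# Two separate list objects with equal values, plus a second name for the first.
a = [1, 2, 3]
b = [1, 2, 3]
alias = a

same_value = a == b       # True: the values are equal
same_object = a is b      # False: two distinct objects
alias_is_a = alias is a   # True: one object, two names

# Mutating a list changes its value but not its identity.
id_before = id(a)
a.append(4)
id_after = id(a)
print(type(a).__name__, id_before == id_after, a)  # list True [1, 2, 3, 4]

# A string is immutable: item assignment is rejected.
s = "abc"
try:
    s[0] = "x"
except TypeError as exc:
    error = exc
print(type(error).__name__)  # TypeError
```

Note that alias sees the appended element as well, because it is just another name for the same object.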
However, there is another point made, specifically:
The value of an immutable container object that contains a reference to a mutable object can change when the latter’s value is changed; however the container is still considered immutable, because the collection of objects it contains cannot be changed. So, immutability is not strictly the same as having an unchangeable value, it is more subtle.
To illustrate what this means, let’s look at tuples in Python, which, as the excerpt states, are immutable.
Say we have a tuple
T = (1, 2, 3)
Since tuples are immutable, as we expect, we cannot change the value of this tuple. However, with a mutable type such as a list:
lst = [1, 2, 3]
We are free to change its value through the .append() method for example. If we check the id of the list before and after, we can see that it is still the same list. However, what if we had a tuple containing a list:
T = (1, 2, [1, 2])
Because tuples are immutable, we would expect that we shouldn’t be able to change its value. However, if we do:
T[2].append(3)
We get
T = (1, 2, [1, 2, 3])
And its printed value has changed! Indeed, we can compare id(T) before and after the append to check that it really is the same tuple. So it seems that even though tuples are immutable, we have changed the value of one.
This is what the point in the documentation is referencing. Since a tuple is a container object that holds references to other objects, its value is not determined by what is printed on the screen, but rather by the identities of the objects it references. When we appended to the list within the tuple, we did not change the identity of the list itself, so the tuple still references exactly the same objects and we have not actually changed the ‘value’ of the tuple.
You can see this better if you print out the ids of all of the objects within a tuple - this representation is the ‘real’ value of the tuple, and since these never change (even though the referenced objects may change), the tuple is immutable.
Therefore, when we say that ‘mutable objects can change their value while immutable objects cannot’, we need to keep in mind that ‘value’ in the case of container objects refers to the identities of the referenced items, and not to the literal values printed to the screen.
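The tuple experiment can be sketched as follows. It reuses the tuple from the example above under the lowercase name t; the other variable names are assumptions made for illustration.

```python
t = (1, 2, [1, 2])

# Record the identity of the tuple and of every object it references.
tuple_id_before = id(t)
element_ids_before = [id(item) for item in t]

t[2].append(3)  # mutates the inner list, not the tuple

tuple_id_after = id(t)
element_ids_after = [id(item) for item in t]

print(t)                                        # (1, 2, [1, 2, 3])
print(tuple_id_before == tuple_id_after)        # True
print(element_ids_before == element_ids_after)  # True

# Making the tuple reference a different object is what immutability forbids.
try:
    t[2] = [9, 9]
except TypeError as exc:
    rebind_error = exc
print(type(rebind_error).__name__)  # TypeError
```

The printed form of the tuple changes, but the references it holds do not, which is the sense in which the tuple stays immutable.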
Now that we understand mutability on a sufficient level, let’s move on to some consequences that arise from this feature.
A Short Interlude on 'Value'
When we say that an object cannot change its value, what do we mean by value? Well, Python has an operator designed for checking value equality, which is ‘==’. So, if two objects, say a and b, have the same value, then a == b should return True, and otherwise it should return False. This might seem like a reasonable definition of value, but there is a complication: the __eq__ dunder method that implements the ‘==’ operator can be overridden by any class, so what counts as ‘equal’, and therefore as ‘the same value’, is decided by the type rather than fixed by the language.
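As a small illustration of why ‘==’ is a shaky foundation for defining value, consider the sketch below. AlwaysEqual and NeverEqual are hypothetical classes written only for this example.

```python
class AlwaysEqual:
    """Hypothetical class that claims to equal everything."""

    def __eq__(self, other):
        return True


class NeverEqual:
    """Hypothetical class that claims to equal nothing, not even itself."""

    def __eq__(self, other):
        return False


x = AlwaysEqual()
y = NeverEqual()

print(x == 5, x == "five")  # True True
print(y == y)               # False, although it is the same object
print(y is y)               # True
```

Identity (is) cannot be overridden in this way, which is one reason the tuple discussion above leans on identities rather than on equality.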
Better Educated Guesses
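Understanding mutable vs immutable types can help you better predict whether a piece of functionality is provided as a method that modifies the object in place or as a function that returns a new object. For example, lists are mutable, so they can offer an in-place .sort() method:
l = [3, 1, 2]
l.sort()
print(l)  # [1, 2, 3]
Strings, being immutable, cannot have an in-place .sort() method, which is why you use the built-in sorted() function instead. It returns a new list of characters, which you can join back into a string with ''.join().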
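A short sketch of the contrast, using only built-ins (the variable names are illustrative):

```python
numbers = [3, 1, 2]
result = numbers.sort()    # sorts in place and returns None
print(numbers, result)     # [1, 2, 3] None

word = "cab"
letters = sorted(word)             # new list: ['a', 'b', 'c']
sorted_word = "".join(letters)     # new string: 'abc'
print(word, sorted_word)           # cab abc

print(hasattr(word, "sort"))       # False: str has no in-place sort
```

The original string is untouched; sorted() always builds a new list, whatever the type of its input.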
Hashing
When you hash strings and some other hashable objects in Python, you may get different results across different runs of the interpreter, because hash randomization is enabled for security reasons.
However, for hashing to function correctly, the hash value of an object must remain consistent within a single run. In other words, if our program requires ‘a’ to be hashed, and then later also requires ‘a’ to be hashed, the output in both cases should be the same.
As you can imagine, mutable objects cause issues here, since between the first and second hashing instances, we might mutate the object, causing its hashing value to change. This provides some understanding as to why you get an error when trying to add a list, for example, to a set (resulting in TypeError: unhashable type: 'list').
Therefore, if you ever need to use a list as a dictionary key or add it to a set, you can convert it to a tuple first, since in the majority of cases (see below for an exception), tuples are hashable.
However, this doesn’t mean that all immutable objects are hashable. As we previously saw, tuples are immutable, yet they can hold references to mutable objects such as lists. A tuple’s hash is computed from the hashes of its elements, so a tuple that contains a list cannot be hashed, and trying to do so raises the same TypeError: unhashable type: 'list'.
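The hashability rules above can be checked with a small sketch. The helper is_hashable is a hypothetical name introduced here; it simply wraps the built-in hash() in a try/except.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds (illustrative helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable([1, 2, 3]))       # False: lists are mutable
print(is_hashable((1, 2, 3)))       # True: tuple of hashable items
print(is_hashable((1, 2, [1, 2])))  # False: the tuple contains a list

# Converting a list to a tuple makes it usable as a set member or dict key.
coords = [4, 5]
seen = {tuple(coords)}
print((4, 5) in seen)               # True

# Within one run, hashing the same string twice gives the same result.
print(hash("a") == hash("a"))       # True
```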
Interning
The idea of interning in Python can be observed by looking at this little experiment:
a = [1, 2, 3]
b = [1, 2, 3]
print(id(a))  # Shows 139927486566976
print(id(b))  # Shows 139927486565184 - Different
a = (1, 2, 3)
b = (1, 2, 3)
print(id(a))  # Shows 140126395242112
print(id(b))  # Shows 140126395242112 - Identical!
The idea here is that when we create two lists, Python will create a new instance in each case. This makes sense as list ‘a’ can mutate, in which case it will be different from list b, and so we should of course have two copies of the list [1,2,3].
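However, in the case of the two tuples, we can see that these are actually the same object in memory, to which the names ‘a’ and ‘b’ both refer.
The reason CPython can do this is that this tuple can never change its value (unlike the previously considered lists), and so it’s safe to reuse the same object. Note that this is an optimization and an implementation detail rather than a language guarantee: you will typically see it when both literals are compiled together, as in a script, but not necessarily when they are typed as separate statements in an interactive session.
This behavior isn’t unique to tuples. CPython also interns many strings, which is safe since strings are immutable.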
s = "hello"
b = "hello"
print(id(s))  # Shows 139637612231104
print(id(b))  # Shows 139637612231104 - Identical!
Can you guess what would be printed with this snippet?
a = (1, 2, [1])
b = (1, 2, [1])
print(id(a))
print(id(b))
For reasons previously stated, we would expect the ids to be different in this case.
The process of interning allows Python to be more efficient by reusing immutable objects. CPython, for example, keeps the integers from -5 to 256 preallocated and reuses them, as such integers are so frequently used that its developers chose to cache them ‘out of the box’, so to speak, to optimize memory usage and performance. Because all of this is implementation-specific, compare values with == and reserve is for cases where you really mean identity.
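Since automatic interning depends on the implementation, the sketch below relies only on sys.intern(), the standard-library function for interning strings explicitly. The strings are built at runtime so that the compiler cannot merge them beforehand; whether a is b holds before interning is left unasserted on purpose.

```python
import sys

# Two equal strings built at runtime.
a = "".join(["he", "llo"])
b = "".join(["hel", "lo"])
print(a == b)   # True: equal values

# sys.intern returns the single canonical object for a given string value.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)  # True: both names now refer to the interned object

# For integers, equality is guaranteed; identity is an implementation detail.
n = 1000
m = int("1000")
print(n == m)    # True
```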
Care with Mutable Types
One should take care with ‘duplicating’ mutable types, as sometimes the copies seem distinct, when in reality they reference the same object. Then, after modifying one ‘copy’, all instances seem to change. For example:
lst = [[]] * 5
print(lst)  # [[], [], [], [], []]
lst[0].append(1)
print(lst)  # [[1], [1], [1], [1], [1]]
The * operator does not copy the inner list; it repeats the reference to it, so all five elements refer to the same list object (see below). Updating that list through one element is therefore visible through all of them, which can cause issues.
print([id(x) for x in lst]) # [140670638508864, 140670638508864, 140670638508864, 140670638508864, 140670638508864]
A way to get around that would be:
lst = [[] for _ in range(5)]
Understanding references and mutability in Python can help one avoid such traps.
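The difference between repeating a reference and creating independent objects can be sketched as follows. The names are illustrative; copy.deepcopy comes from the standard library and is shown as one common way to get a fully independent copy of nested data.

```python
import copy

shared = [[]] * 3                      # three references to one list
independent = [[] for _ in range(3)]   # three separate lists

shared[0].append(1)
independent[0].append(1)

print(shared)       # [[1], [1], [1]]
print(independent)  # [[1], [], []]
print(len({id(x) for x in shared}))       # 1 distinct inner object
print(len({id(x) for x in independent}))  # 3 distinct inner objects

# A shallow copy duplicates only the outer list; a deep copy duplicates everything.
grid = [[0, 0], [0, 0]]
shallow = list(grid)
deep = copy.deepcopy(grid)
grid[0][0] = 9
print(shallow[0][0], deep[0][0])  # 9 0
```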
Conclusion
Overall, as I mentioned in the introduction, understanding mutability in Python is highly beneficial, and is more or less essential for developing a good grasp of the language’s inner workings. Hopefully some of the examples covered above give a good idea of the ways in which the consequences of mutable vs immutable types can be appreciated.