Foot-guns of Python Development
What is a 'foot-gun'?
A foot-gun in software development is a feature within a system where the odds are that you'll end up shooting yourself in the foot with it.
Objects as default arguments
Python allows you to not worry about the implications of the code you're writing. This is one of its greatest features, and also one of its biggest weaknesses as a language. One nicety of python is default arguments in functions/methods. While these make code reuse effortless, they can lead to unexpected side effects. Let's start with an example:
from typing import List
class URLGroup:
    def __init__(self, urls: List[str] = []):
        self.urls: List[str] = urls
    def add(self, url: str) -> None:
        self.urls.append(url)
personal = URLGroup()
personal.add("www.lukeharwood.dev")
bookmarks = URLGroup()
bookmarks.add("www.google.com")
In this simple example we have an object whose __init__ function takes in a list of urls, and when a list is not provided it defaults to an empty list.
However, if we print out both groups after running this snippet, both urls are listed in both groups, even though we only added "www.lukeharwood.dev" to our personal object.
This is because the default argument, the empty list, is created once, at the definition time of the __init__ function, so when we call the .add function, we're adding the element to the same list across multiple objects.
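One way to see the sharing directly is to compare object identities and to look at the defaults stored on the function itself. The sketch below repeats the URLGroup class from above so that it runs on its own:

```python
from typing import List


class URLGroup:
    def __init__(self, urls: List[str] = []):
        self.urls: List[str] = urls

    def add(self, url: str) -> None:
        self.urls.append(url)


personal = URLGroup()
personal.add("www.lukeharwood.dev")

bookmarks = URLGroup()
bookmarks.add("www.google.com")

# Both instances hold a reference to the very same list object.
print(personal.urls is bookmarks.urls)  # True
print(personal.urls)  # ['www.lukeharwood.dev', 'www.google.com']

# The default list lives on the function object and keeps growing.
print(URLGroup.__init__.__defaults__)
```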
To fix this, when we need an object as a default argument, we should default to None and create the default list at runtime:
from typing import List
class URLGroup:
    def __init__(self, urls: List[str] | None = None):
        self.urls: List[str] = urls if urls is not None else []
    def add(self, url: str) -> None:
        self.urls.append(url)
personal = URLGroup()
personal.add("www.lukeharwood.dev")
bookmarks = URLGroup()
bookmarks.add("www.google.com")
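If the class is mostly a data holder, the standard library's dataclasses module offers another route to a fresh list per instance. This is an alternative sketch, not the approach used above; a dataclass rejects a bare list default with a ValueError and asks for a default_factory instead:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class URLGroup:
    # default_factory is called once per instance, so every
    # URLGroup gets its own list instead of a shared one
    urls: List[str] = field(default_factory=list)

    def add(self, url: str) -> None:
        self.urls.append(url)


personal = URLGroup()
personal.add("www.lukeharwood.dev")

bookmarks = URLGroup()
bookmarks.add("www.google.com")

print(personal.urls)   # ['www.lukeharwood.dev']
print(bookmarks.urls)  # ['www.google.com']
```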
Truthy / Falsy Values
In python, booleans are not the only thing that can evaluate to True/False:
# integers
print(f"0 = {bool(0)}")
print(f"5 = {bool(5)}\n")
# strings
print(f"'' = {bool('')}")
print(f"'x' = {bool('x')}\n")
# other objects
print(f"None = {bool(None)}")
print(f"URLGroup() = {bool(URLGroup())}")
print(f"[] = {bool([])}")
print(f"[1, 2, 3] = {bool([1, 2, 3])}")
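output
0 = False
5 = True

'' = False
'x' = True

None = False
URLGroup() = True
[] = False
[1, 2, 3] = True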
This is actually really helpful, since a single check tells us that something is both not None and not empty:
from typing import List
def some_function(numbers: List[int] | None):
    # a single truthiness check covers both None and the empty list
    if numbers:
        print(f"You have {len(numbers)} numbers...")
    else:
        print("Nothing here.")
x = [1, 2, 3]
some_function(x)
x = []
some_function(x)
x = None
some_function(x)
However, this can also backfire if we're not careful:
def get_text(number: int | None) -> str:
    """
    If the number isn't None, return 'x = [number]',
    otherwise, return 'x is None!'
    """
    if number:
        return f"x = {number}"
    else:
        return f"x is None!"
numbers = [None, 10, 0]
for number in numbers:
    print(get_text(number))
In this case, 0 also evaluates to False, so get_text(0) wrongly returns 'x is None!' (this same issue appears with empty lists). To fix this, just make a habit of explicitly checking for None:
def get_text(number: int | None) -> str:
    """
    If the number isn't None, return 'x = [number]',
    otherwise, return 'x is None!'
    """
    if number is not None:
        return f"x = {number}"
    else:
        return f"x is None!"
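Running the corrected function over the same three values shows that 0 is now reported as a number. The sketch below repeats the function (using Optional from typing so it also runs on older interpreters) so that it is self-contained:

```python
from typing import Optional


def get_text(number: Optional[int]) -> str:
    # only None takes the "missing" branch; 0 is a real value
    if number is not None:
        return f"x = {number}"
    return "x is None!"


results = [get_text(number) for number in [None, 10, 0]]
for line in results:
    print(line)
# x is None!
# x = 10
# x = 0
```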
RegEx
While not unique to python, regular expressions are extremely easy to use in languages like python, javascript, and some others. This means that when solving a text processing task, they are sometimes the first tool pulled out of the bag. The fact that python gives you a regex implementation without needing a third party dependency is awesome, but it definitely shouldn't be the first place you look.
Far too often developers use a pattern they don't fully understand, and too many times a regex has broken a production system. Keep it simple, and try to avoid this footgun. This could be a whole article in and of itself, but we'll leave it at that.
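As one small illustration of a pattern that looks right but isn't, consider greedy matching. The input string and the goal (pulling out each tag) are made up for this sketch:

```python
import re

text = "<a><b>"

# ".*" is greedy: it grabs as much as possible, so the match runs
# from the first "<" all the way to the last ">"
greedy = re.findall(r"<.*>", text)
print(greedy)  # ['<a><b>']

# ".*?" is lazy: it stops at the first ">" it can
lazy = re.findall(r"<.*?>", text)
print(lazy)  # ['<a>', '<b>']
```

Both patterns run without error, which is exactly why this kind of mistake tends to surface late.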
Crazy RegEx
They get complicated quickly. Take a look at the "correct answer" on this Stack Overflow topic for matching valid emails:
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
Shadowing variables
While this topic isn't unique to python, it is extremely easy to shadow a variable by accident in python if you aren't careful.
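A minimal sketch of the situation, with a module-level host variable and an init_host function that tries to change it:

```python
host = "127.0.0.1"


def init_host():
    # this assignment creates a new local variable named host;
    # the module-level host is left untouched
    host = "localhost"


init_host()
print(host)  # 127.0.0.1
```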
In this case we actually just created a new variable named host that shadows the global variable instead of editing the old one.
If we wanted to edit the global variable, we would have to use the global keyword.
host = "127.0.0.1"
print(host)
def init_host():
    global host
    host = "localhost"
init_host()
print(host)
Always use caution when modifying global variables, or better yet, avoid modifying them altogether if at all possible. If you aren't trying to use a global variable, and instead want to create a new variable, ensure that you're using unique variable names to separate the differently scoped variables.
Extra note
If we move the print inside the function we get a pretty confusing error:
host = "127.0.0.1"
def init_host():
    # we moved the print inside this function
    print(host)
    host = "localhost" # if we comment this line out, it works though... why?
init_host()
print(host)
Running this raises an UnboundLocalError. This is because the assignment makes host a new local variable for the whole body of the function (not the global host), and we're trying to use that variable before it has been given a value...
While this might seem simple in a small snippet, in a large codebase these can be extremely difficult to debug.
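The error can be caught and inspected to confirm what is going on. In this sketch, caught is a helper name introduced only for illustration; co_varnames lists the names the compiler decided are local to the function:

```python
host = "127.0.0.1"
caught = None


def init_host():
    print(host)  # host is local here because of the assignment below
    host = "localhost"


try:
    init_host()
except UnboundLocalError as error:
    caught = error
    print(type(error).__name__)  # UnboundLocalError

# host was classified as a local variable when the function was compiled
print(init_host.__code__.co_varnames)  # ('host',)

# the global was never touched
print(host)  # 127.0.0.1
```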
Redeclaring variables with different types
In some other programming languages, you are required to use specific syntax for defining a variable vs. re-assigning it. While at times it can look like cumbersome syntax, it also provides clarity as to what the type of the variable is intended to be.
Take Golang for instance:
This doesn't work
Go would also prevent you from changing the type of a variable within the same scope, as that would give you another error:
In python, definition and re-assignment are one and the same, and so the following code works just fine:
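A minimal sketch of what this looks like in python; the variable name count and its values are placeholders chosen for illustration:

```python
count = 10
print(type(count).__name__)  # int

# no special syntax and no complaint: the name is simply rebound
# to a value of a different type
count = "ten"
print(type(count).__name__)  # str
```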
This makes it incredibly easy to reuse the same name for a variable, while thinking you are simply defining a new variable. Instead, you are simply assigning a new value to an existing variable.
In this case, you can use the same name with a different type and python will let you. While this can be intended (and indeed is helpful at times), you can also shoot yourself in the foot if you aren't careful.
Tuple syntax without parentheses
One bug that I've seen multiple times in real-world code is an accidental trailing comma after a variable assignment. You might think this would result in an error, but python doesn't force you to use parentheses for tuple declarations. This can be hard to find, especially if the assignment isn't straightforward or is the result of a long multi-line expression.
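A minimal sketch of the mistake, with a placeholder string value:

```python
x = "hello",  # note the stray comma at the end of the line

print(type(x))  # <class 'tuple'>
print(x)        # ('hello',)
```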
In this snippet, we thought we were initializing a str, but due to an accidental comma at the end, x is a <class 'tuple'>.
This is usually an accident, so telling you to "not do it" isn't very helpful.
Instead, just know this going forward: if you're seeing an error/bug that could be explained by a tuple existing instead of the expected type, check the end of the line.
There's more
These are just a few python foot-guns that I've run into while developing in python, but there are many more, so beware.
P.S.
All code snippets were tested with python 3.11.