I recently started using Python in my day job for the first time. I’d had some exposure to it prior to this but not enough to form a strong opinion about it as a language. I initially picked it up for some CI/CD pipeline scripts, building a .NET Core microservices stack in Azure. Those scripts eventually evolved into a core application that is fundamental to the entire product.
Coming from a primarily C# background and picking up Python for application development can lead to some frustrating learning experiences and require some adjustments in mindset to get the most out of the language. It also raised some questions, in my mind, about what type of codebases Python is suitable for. In this post I’ll explore some of the tips, traps and opinions I’ve encountered working with the language, as an experienced developer coming from a strongly typed language background.
Note: This article will be referring to Python version 3.5 and up.
Types in Python
I will state up front that I recognise my opinion on types in Python is most likely heavily influenced by my background in strongly typed languages. Python is a dynamically typed language and while it has features to support stronger (strongish?) typing, they mostly seem bolted on and clunky to use in practice.
Type hints
Type hints are Python’s concession to stronger typing. Variables and method signatures can be “typed” by adding a type hint. The below example demonstrates the syntax; greet is a method that accepts a string parameter and returns a string, the variable greet_string is a string.
def greet(greetee: str) -> str:
    greet_string: str
    greet_string = 'Hello ' + greetee
    return greet_string
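Except, I can pass any type to this method and the interpreter will not reject the call. Type hints are just that: hints. They are meta information for developers and IDEs to better understand code, but they are ignored by the interpreter. If the body happens to work with the value it receives (an f-string, for example, will happily convert any object to its string representation), the code runs without complaint; if it doesn’t, the error surfaces somewhere inside the method rather than at the call site. In the example above, passing an integer fails with a TypeError raised by the string concatenation, not by the hint. This was a big gotcha for me and led to some baffling errors until I realised. And yes, I know, RTFM, but I did not see this aspect of type hints emphasised (this is actually what motivated me to write this article).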
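A minimal sketch of this behaviour follows. The label function is a hypothetical addition for illustration; greet mirrors the example above. It suggests that whether a badly typed argument fails depends entirely on what the body does with it, never on the hint.

```python
def label(value: str) -> str:
    # The f-string calls str() on whatever it receives, so the hint is never checked.
    return f"Hello {value}"


def greet(greetee: str) -> str:
    # String concatenation needs a real str on both sides.
    return 'Hello ' + greetee


# An int sails through a function hinted as taking a str.
print(label(42))

# The hints are stored as plain metadata on the function object.
print(label.__annotations__)

try:
    greet(42)
except TypeError as error:
    # Raised by the + operator inside the body, not by the type hint.
    print(f"TypeError: {error}")
```

A static checker such as mypy, or the IDE itself, is what actually flags both of these calls before the code runs.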
Type hints are vital for an IDE to be able to give meaningful code annotation (e.g. highlighting errors) but do not give any of the runtime benefits of strong typing. This becomes particularly important when refactoring Python code, which I’ll go into in more detail later.
Type hints really feel like a worst-of-both-worlds feature to me; without them you lose a lot of information and functionality in IDEs but for the overhead of using them everywhere you don’t get the security of strong typing.
Slots Declarations
Another Python feature that can be confusing at first is dynamic object attributes and the use of the __slots__ declaration for classes. Object attributes (the Python equivalent of member properties) are stored in a per-instance dictionary by default, and as such new ones can be added dynamically at any point. Consider the following code:
class Person():
    def __init__(self):
        self.first_name = ""
        self.last_name = ""

    def full_name(self):
        return f"{self.first_name} {self.last_name}"

person1 = Person()
person1.first_name = "Kurt"
person1.last_name = "Russell"
person2 = Person()
person2.first_name = "Goldie"
person2.surname = "Hawn"
print(f"person1 full name is {person1.full_name()}")
print(f"person2 full name is {person2.full_name()}")
The output from this code is:
person1 full name is Kurt Russell
person2 full name is Goldie
The last name of the person2 object is not displayed as expected, because surname is mistakenly added as a new dynamic attribute instead of being assigned to the attribute last_name that was declared in the constructor.
This is a trivial example but demonstrates the kind of issues that dynamic properties can introduce, that developers coming from strongly typed languages may find confusing at first. Mistakes like this in larger codebases can sneak in unnoticed and be difficult to track down; viewing the person2 variable in the debugger looks correct until you dig into the class declaration itself.
A class with a __slots__ declaration will raise an AttributeError at runtime if code tries to assign an attribute that is not listed in the declaration. It also reduces the memory footprint of instances of that class; this Stack Overflow answer has an incredibly comprehensive discussion on that.
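As a small sketch of how this could look for the Person class above (trimmed to the two attributes, with the typo from the earlier example repeated on purpose):

```python
class Person:
    # Only these attribute names may be assigned on instances.
    __slots__ = ("first_name", "last_name")

    def __init__(self):
        self.first_name = ""
        self.last_name = ""


person = Person()
person.first_name = "Goldie"

try:
    # Same typo as before, but now it fails loudly instead of silently.
    person.surname = "Hawn"
except AttributeError as error:
    print(f"AttributeError: {error}")
```

The mistake now surfaces on the line that makes it, rather than later when the full name comes out wrong.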
Dynamic Typing
Dynamic typing can make for some elegant code when used in a disciplined way. I particularly like the flexibility of duck typing for utility methods. Consider the below method:
class Name():
    # Minimal stand-in for a richer name type; any class would do here.
    def __init__(self, value=""):
        self.value = value

class Horse():
    def __init__(self):
        self.name = ""

class Boat():
    def __init__(self):
        self.name = Name()

def unique_names(named_objects: list) -> list:
    names = []
    for named_object in named_objects:
        if named_object.name not in names:
            names.append(named_object.name)
    return names
Any collection of objects can be passed to the unique_names method as long as each object in it has a name attribute, regardless of the type of that attribute (both string and Name types are used in the above example).
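If you want the IDE to understand this duck-typed contract, one option is typing.Protocol. This is only a sketch under some assumptions: Protocol needs Python 3.8 or later (newer than the 3.5 baseline of this article), the Named protocol is a hypothetical name introduced here, and the classes are simplified to take the name in the constructor.

```python
from typing import Iterable, List, Protocol  # Protocol requires Python 3.8+


class Named(Protocol):
    """Anything exposing a readable name attribute; no inheritance required."""

    @property
    def name(self) -> object: ...


class Horse:
    def __init__(self, name: str):
        self.name = name


class Boat:
    def __init__(self, name: str):
        self.name = name


def unique_names(named_objects: Iterable[Named]) -> List[object]:
    names: List[object] = []
    for named_object in named_objects:
        if named_object.name not in names:
            names.append(named_object.name)
    return names


print(unique_names([Horse("Comet"), Boat("Comet"), Boat("Dasher")]))
```

Horse and Boat never mention Named, so the flexibility of duck typing is kept; the protocol is, like every other hint, only checked by static tools and not by the interpreter.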
A strongly typed language can implement similar functionality with interfaces and inheritance, but with considerable development overhead. This is what I like Python for: rapid development of codebases with lower complexity that are relatively intuitive and easy to understand. Code you want to bang out fast with low or linear dependencies.
Something that I initially resisted but came to embrace is some of the “pythonic” syntax that provides terse, elegant code. I’m a big fan of the ability to make a boolean check against any type of object. None (Python’s null equivalent), empty lists and empty strings all evaluate to False in a boolean context (a comprehensive list of uses here). It makes for readable and intuitive code and behaves the way you’d expect:
class Tribble():
    # Minimal placeholder class so the example runs on its own.
    def __init__(self):
        self.name = ""

tribbles = []
tribble1 = Tribble()
tribble1.name = "Foo"
# If tribble1 has a name assigned, add it to the list.
if tribble1.name:
    tribbles.append(tribble1)
# If the list tribbles is not empty...
if tribbles:
    print("You've got trouble!")
Modules, Packages and Imports
The basics of Python modules, packages and imports are straightforward and simple enough to pick up. But in practice there are some unintuitive nuances that can trip you up and cause some pain once a codebase starts to expand.
Import Paths
A basic import statement like import my_module searches for my_module in the following locations:
- The directory containing the input script, or the current directory if the interpreter is being run interactively.
- Directories set in the PYTHONPATH environment variable.
- Directories configured when Python is installed.
This makes the imported module available to the importing code, under a namespace of the module’s name (e.g. my_module.foo()).
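The search locations listed above end up in sys.path, so a quick way to see what a given environment will actually search is to print it. The snippet below is a small sketch; the standard library json module is used only as a convenient module to look up.

```python
import importlib.util
import sys

# sys.path holds the search locations in the order the interpreter tries them.
for location in sys.path:
    print(location)

# find_spec reports where a top-level module would be loaded from, without importing it.
spec = importlib.util.find_spec("json")
print(spec.origin)
```

Running this from a script and from an interactive session is a cheap way to see how the first entry differs between the two.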
Import statements can be absolute or relative. It is tempting to use relative imports as they are often less verbose and easy to determine as you’re writing the code; I strongly recommend against this. Relative imports are more difficult to read and, most importantly, a complete nightmare to refactor. Python’s PEP 8 style guide also recommends absolute imports.
A Python package is created simply by including an __init__.py file in a folder; all .py files in that folder will be included in that package even if the __init__.py file is completely empty. But don’t leave it empty; using the __init__.py file to initialise your packages is the key to cleanly managing extended dependencies.
Importing a package does not inherently import all of its modules. Having import statements in the __init__.py file for all of the default imports saves you from having to explicitly import every module from a package, in every file where you use that package. Just include that package’s import statements in __init__.py the same way you would in the calling modules. This saves repeatedly writing the same import statements and gives you a single point of change if you refactor module paths.
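To make that concrete, here is a self-contained sketch that builds a throwaway package in a temporary directory and imports it. The names demo_shapes, circle and area are hypothetical and exist only for this illustration; in a real project the two files would simply live in your source tree.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk so the example runs on its own.
root = Path(tempfile.mkdtemp())
package_dir = root / "demo_shapes"
package_dir.mkdir()

# An ordinary module inside the package.
(package_dir / "circle.py").write_text(
    "import math\n\n"
    "def area(radius):\n"
    "    return math.pi * radius ** 2\n"
)

# __init__.py pulls in the modules the package should expose by default,
# using an absolute import just as a calling module would.
(package_dir / "__init__.py").write_text("from demo_shapes import circle\n")

# Make the temporary directory importable for this demonstration only.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

demo_shapes = importlib.import_module("demo_shapes")

# The caller imported only the package, yet the circle module is available.
circle_area = demo_shapes.circle.area(2)
print(circle_area)
```

Without the import line in __init__.py, the last call would fail with an AttributeError until the caller imported demo_shapes.circle explicitly.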
I have seen a lot of discussion and recommendations around using Python’s sys.path functionality to dynamically set module search paths (as an alternative to the PYTHONPATH environment variable). To my mind this seems like a nightmare for maintenance, but again leads us back to the question of what kind of codebases Python is best suited for. If having all of your environments with a correctly configured PYTHONPATH is more overhead than maintaining hardcoded dependencies littered throughout your code, perhaps it’s fine (no, it isn’t :P).
IDEs
Most C# developers will already be using Visual Studio and probably have at least some exposure to VS Code. I initially started working with Python in Visual Studio as I read some opinions that suggested it had better features for Python support than Code. I saw no evidence of this. Working with Python in Visual Studio was painful; it lacked error highlighting and was inconsistent at tracking dependencies (e.g. sometimes couldn’t F12 into a method in a different module). I recognise some of this may have been due to my inexperience with the language (I was initially doing some convoluted module imports) or configuration options. But after switching to VS Code everything just worked, and it was fast.
VS Code will prompt you to install the Python plugin the first time you open a .py file. Once that is installed you get a fast, lightweight IDE that handles all the expected highlighting, navigation and debugging. It does, however, require another package for even basic refactoring such as renaming a variable. It suggests installing Rope for this the first time you right click a variable and select “Rename”. But even with this installed I found the refactoring in Code very inconsistent, to the point of being unusable for anything more extensive than renaming within a narrow scope.
PyCharm is the main alternative to VS Code, and the two are comparable for much of their functionality. It has some advanced analytics options in the Pro (paid for) version that Code doesn’t have, if that’s of value to you. I found PyCharm’s UI generally cleaner and more productive to work with, and I prefer its readability and code navigation. But the real differentiator was PyCharm’s refactoring features. It offers more advanced options such as convert to package/module, extractions, pull member up/down, etc. I also found it to be more reliable for structural changes such as renaming a module across the entire project, which Code often only partially completed (although even PyCharm wasn’t 100% reliable like a refactor in Visual Studio C#). A caveat to this is that globally consistent use of type hints probably greatly improves the reliability of refactoring, in any IDE.
For many the choice will come down to whether or not they are already using VS Code for other workflows, which was precisely my situation for some time. But after trying PyCharm I found that it offered enough benefit to make the switch.
Code Style and Standards
Something worth considering when picking up Python is the coding style and standards you want to apply. Will you stick to what you are using in other languages, or convert to the “pythonic” style? PEP 8 is the industry standard for Python and it is considerably different from most language norms.
“Snake case” is the recommended naming syntax: variable, method and module names are all lower-case with underscore separators, while class names use Pascal case.
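For a C# developer the difference is easiest to see side by side with familiar habits. The class and names below are hypothetical, chosen only to show the conventions:

```python
class PipelineStep:  # class names: Pascal case
    def __init__(self, step_name):
        # Attributes and variables: snake case.
        self.step_name = step_name

    def display_name(self):  # methods and functions: snake case
        return self.step_name.replace("_", " ").title()


build_step = PipelineStep("restore_packages")
print(build_step.display_name())
```

The C# equivalents would be StepName and DisplayName(), so expect some muscle memory to fight against at first.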
Should You Use Python?
My most extensive experience with Python was with a codebase that started out as some scripts to generate boilerplate C# code. Over time these scripts expanded into an extensive application that was at the core of an enterprise .NET stack. At a point in time the team recognised that the code generation scripts needed to be converted into a more robust and maintainable application so we undertook a major refactoring and consolidation process. In hindsight, we regret not taking the opportunity to convert the entire code generation application to C#.
Python is great for rapid development of scripts or self-contained applications. It has powerful and elegant syntax for algorithms, file processing and pattern matching, and an extensive ecosystem of high quality libraries for just about any task you might need. But the overhead and inconsistency of implementing “strongish” typing and the unreliability of complex refactoring make, in my experience, development and maintenance of a complex application more onerous than in other languages.