Skip to main content

Oh Shit, Python

info

You can access to this page from ohshitpython.com, ohshit.foo/python or ohshit.bar/python.

TLDR

  • Beware of GIL.
  • Don't give default values to mutable variables, either in function parameters or classes.
  • In contrast to using in C# or include in C++, import executes modules and allows you to have interdependent modules. Make sure you understand how modules and packages work before using them.
  • Python offers a variety of syntactic sugars, such as the ability to view class attributes through the __dict__ attribute and the use of @property decorators for getter and setter methods. However, these features, if misused, can make it easier to bypass certain limitations and safeguards, potentially leading to less predictable code.

Objects

Everything is an Object

  • Objects can be mutable or immutable
  • Immutable types are the following:
    • int
    • float
    • string
    • boolean
    • tuple
    • complex
    • bytes
    • frozen set
    • NoneType: the type of None
  • You can use dir() to get a list of available attributes of an object.

References and Reference Counting

  • Variables are just references to objects.

  • Immutable objects are created once and can be referenced multiple times.

  • Python counts references to each object, and destroys it if there's no reference to it. You can use sys.getrefcount(object) to check reference count.

    import sys

    sys.getrefcount(1)
    sys.getrefcount(42)
    sys.getrefcount(None)
    sys.getrefcount("hello")
    sys.getrefcount("sys")
    sys.getrefcount("import")

    There are other ways to count references as well.

Global Interpreter Lock (GIL)

To make reference counting thread safe and prevent race conditions, there need to be a global lock when these counts need to be updated. This approach has performance penalty on multithreaded applications. You can read more about GIL in here.

The existence of GIL means that Python scripts cannot utilize multiple cores even when they are multithreaded. Python will eventually remove GIL. See the following links for more information:

tip

For I/O bound tasks, you can use asyncio to utilize multiple cores. For CPU bound tasks, you have to use multiprocessing to utilize multiple cores but multiprocessing is not as straightforward as multithreading and comes with its own problems.

Strings

  • Strings are sequence types.

    • You can do if "ello" in "hello": print("yes")
    • You can iterate over them: for c in "hello": print(c)
    • You can slice them with indexes "hello"[1:]
  • Multiline strings are used for multiline comments. @gvanrossum's tweet

  • When used as docstrings, multiline strings are parsed and become accessible through the __doc__ property of the object.

    def test():
    """This is a test function"""
    return 1

    print(test.__doc__) # prints "This is a test function"

Dictionaries

  • Iterating over dictionaries returns keys by default but not in any particular order.

Truthiness

  • Every value in Python is either evaluated as True or False
  • The following values are evaluated as False. Everything else is evaluated as True:
    • None
    • False
    • 0
    • 0.0
    • 0j
    • Decimal(0)
    • Fraction(0, 1)
    • "", [], {}, (), b'', set(), range(0), and other empty instances of subclasses of collections.abc
    • Objects for which __bool__ or __len__ method returns False

Type Checking

  • As demonstrated in the examples from collections.abc, isinstance() is more concerned with whether an object implements certain APIs than if it is a direct subclass of a specific type.

    class E:
    def __iter__(self): ...
    def __next__(next): ...

    isinstance(E(), Iterable) # True

Functions

  • Functions can modify variables outside of their scope by using global or nonlocal keyword. This is not recommended.

  • Functions that don't return anything return None by default.

  • Mutable default arguments are evaluated once when the function is defined. This means that if you use a mutable default argument and modify it, the modified value will be used in the next function call.

    def test(a=[]):
    a.append(1)
    print(a)

    test() # prints [1]
    test() # prints [1, 1]
    test() # prints [1, 1, 1]

    For this reason, you should use immutable (types like None in this case) as default arguments. You can read more about this in here.

Modules and Packages

  • import loads and runs the module once, then caches it for subsequent imports.
    • Because of this runtime execution behavior, you can have interdependent modules with some caveats. See here for more information and examples.
  • Each module has a __name__ property. If the module is executed directly, __name__ is set to "__main__". If the module is imported, __name__ is set to the module's name.
  • __init__.py files make a directory a package. This file executed when the package is imported.
    • To reference another module in the same package, you can use relative imports. See here for more information.
    • You can use python -m <package>.<module> to execute a module inside a package. This is useful when you want to test a module inside a package without creating a separate script.

Classes

  • Classes in Python are dictionaries with syntactic sugar. You can access class attributes with __dict__ property.

    • This means that private attributes are not really private. You can access them with instance._Class__private_attribute syntax.
    • To optimize memory usage, you can utilize __slots__ to specify a set of valid attribute names.
  • __init__ is not a constructor.

  • super isn't like base on C family languages.. It doesn't call the parent but the next class in the method resolution order.

  • If get_<property_name> and set_<property_name> methods are defined, they'd be used when you work with instance.<property_name>. See here for more information.

  • Variables that are defined under the class definition are class properties. They are shared between instances. Mutable class properties are disasters waiting to happen.

    class Test:
    a = [] # This is a dangerous class property
  • Methods that are defined under the class definition with @classmethod decorator are class methods. The first argument of a class method is the class itself.

    class Test:
    a = 1
    @classmethod
    def test(cls):
    print(f"test {cls.a}") # note the cls argument
  • Class properties and methods don't need an instance to be accessed. You can access them with Class.<property_name> or Class.<method_name>().

  • Methods that are defined under the class definition with @staticmethod decorator are static methods. They don't have access to the class or the instance. They are just like regular functions that are defined inside the class. You can read more about the differences between class methods and static methods in this StackOverflow answer.

  • Dataclasses define classes that are used to store data. They are like structs in C.

note

Check out Super considered super! talk by Raymond Hettinger for more information about classes.

More Resources