__slots__ memory optimization in Python

Illustration from realpython.com

The other day, while working on fpdf2, I used @dataclass, a nice decorator that was added to the standard library in Python 3.7, to quickly define a class that mostly stores data.

Then two questions came to my mind: is the __slots__ memory optimization still effective with recent versions of Python? And is it even compatible with @dataclass?

This very short article is basically an opportunity to answer those questions with some minimal code, mostly as a reminder to myself:

#!/usr/bin/env python
import os, sys
from collections import namedtuple
from dataclasses import dataclass
from typing import NamedTuple

def get_process_rss():  # Similar to: psutil.Process().memory_info().rss / 1024 / 1024
    try:
        with open(f"/proc/{os.getpid()}/statm", encoding="utf8") as statm:
            rss_as_mib = int(statm.readline().split()[1]) * os.sysconf("SC_PAGE_SIZE") / 1024 / 1024
        return f"{rss_as_mib:.1f} MiB"
    except FileNotFoundError:  # /proc files only exist under Linux
        return "<unavailable>"

class A:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

class B:
    __slots__ = ('x', 'y', 'z')
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

@dataclass
class C:
    x: int
    y: int
    z: int

@dataclass(slots=True)  # available since Python 3.10; has the same effect as defining __slots__ manually
class D:
    x: int
    y: int
    z: int

E = namedtuple('E', ('x', 'y', 'z'))

class F(NamedTuple):
    x: int
    y: int
    z: int

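# Pick the class to benchmark from the first command-line argument (a to f)
Class = globals()[sys.argv[1].upper()]
instances = []  # keep references so the objects stay alive while RSS is read
for _ in range(100000):
    instances.append(Class(0, 1, 2))
print(get_process_rss())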

Results on my machine with Python 3.8:

$ ./slots_test.py a
26.6 MiB
$ ./slots_test.py b
17.2 MiB
$ ./slots_test.py c
26.7 MiB
$ ./slots_test.py d
17.2 MiB
$ ./slots_test.py e
19.1 MiB
$ ./slots_test.py f
19.2 MiB

Results on my machine with Python 3.10 in debug mode:

$ ./slots_test.py a
28.3 MiB
$ ./slots_test.py b
22.0 MiB
$ ./slots_test.py c
28.3 MiB
$ ./slots_test.py d
22.0 MiB
$ ./slots_test.py e
24.8 MiB
$ ./slots_test.py f
24.2 MiB
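
To put those figures in perspective, here is a small sketch that computes the relative difference between the plain class (a) and the slotted class (b) for both runs. The numbers are copied from the outputs above; the names runs, relative_saving and savings are only illustrative. Keep in mind that RSS also includes the interpreter's own baseline memory, so this is the saving on the whole process, not per object.

```python
# RSS figures in MiB copied from the two runs above: (plain class A, slotted class B)
runs = {
    "Python 3.8": (26.6, 17.2),
    "Python 3.10 (debug)": (28.3, 22.0),
}

def relative_saving(plain_mib, slotted_mib):
    """Return how much lower the slotted RSS is, as a percentage of the plain RSS."""
    return (plain_mib - slotted_mib) / plain_mib * 100

# Rounded to one decimal place for display
savings = {label: round(relative_saving(*pair), 1) for label, pair in runs.items()}

for label, percentage in savings.items():
    print(f"{label}: {percentage}% lower RSS with __slots__")
```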

We can conclude that:

  • __slots__ is still an effective memory optimization with recent versions of Python: in the runs above, it lowered the process RSS by roughly 20% to 35%
  • __slots__ can effectively be combined with @dataclass
  • namedtuple / NamedTuple is less memory-efficient than __slots__ (and the two cannot be combined)

Well aware of the limitations of __slots__, I will definitely use it more widely in fpdf2 😊
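
As a reminder of what those limitations look like in practice, here is a short sketch of three commonly cited ones: no new attributes, no __dict__, and no weak references unless "__weakref__" is listed in the slots. The Point class and the observed dictionary are hypothetical, for illustration only, and this is not an exhaustive list.

```python
import weakref

class Point:
    __slots__ = ("x", "y", "z")

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

point = Point(0, 1, 2)
observed = {}

# 1. Attributes that are not declared in __slots__ cannot be added
try:
    point.label = "origin"
except AttributeError:
    observed["new_attribute"] = "AttributeError"

# 2. There is no per-instance __dict__ to inspect or update
observed["has_dict"] = hasattr(point, "__dict__")

# 3. Weak references need an explicit "__weakref__" slot, which Point lacks
try:
    weakref.ref(point)
except TypeError:
    observed["weakref"] = "TypeError"

print(observed)
```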

Since December, I have been working on tracking the memory allocations that occur while the fpdf2 unit test suite runs, and this is no easy task: see issue #641. I still haven't found what I'm looking for. The main difficulties are the opacity of the Python memory allocator and tracking the memory allocated through malloc calls by the libraries that fpdf2 depends on. Still, it has been very insightful so far! 😁
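
For the Python-level side of this kind of investigation, the standard library's tracemalloc module is one possible starting point. The sketch below is not taken from issue #641 or from fpdf2: Plain, Slotted and traced_bytes are hypothetical names, and the instance count is arbitrary. Note that tracemalloc only sees allocations made through Python's own allocators, so direct malloc calls made by C libraries stay invisible to it.

```python
import tracemalloc

class Plain:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

class Slotted:
    __slots__ = ("x", "y", "z")

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

def traced_bytes(cls, count=20_000):
    """Return the bytes tracemalloc attributes to building `count` instances of cls."""
    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        instances = [cls(0, 1, 2) for _ in range(count)]  # kept alive until measured
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before

plain_bytes = traced_bytes(Plain)
slotted_bytes = traced_bytes(Slotted)
print(f"Plain:   {plain_bytes / 1024:.0f} KiB")
print(f"Slotted: {slotted_bytes / 1024:.0f} KiB")
```

Unlike the /proc-based helper, this also works outside Linux, at the cost of slowing down the traced code.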