4234 shaares
5 results
tagged
CPython
This describes tools and techniques that can identify memory leaks in Long running Python programs:
- Is it a Leak?
- Sources of Leaks
- A Bit About (C)Python Memory Management
- Reference Counts
- Garbage Collection
- The Big Picture
- CPython’s Object Allocator (pymalloc)
Here is a visualisation of memory allocators from top to bottom (from the Python source Objects/obmalloc.c):
_____ ______ ______ ________
[ int ] [ dict ] [ list ] ... [ string ] Python core |
+3 | <----- Object-specific memory -----> | <-- Non-object memory --> |
_______________________________ | |
[ Python's object allocator ] | |
+2 | ####### Object memory ####### | <------ Internal buffers ------> |
______________________________________________________________ |
[ Python's raw memory allocator (PyMem_ API) ] |
+1 | <----- Python memory (under PyMem manager's control) ------> | |
__________________________________________________________________
[ Underlying general-purpose allocator (ex: C library malloc) ]
0 | <------ Virtual memory allocated for the python process -------> |
=========================================================================
_______________________________________________________________________
[ OS-specific Virtual Memory Manager (VMM) ]
-1 | <--- Kernel dynamic storage allocation & management (page-based) ---> |
__________________________________ __________________________________
[ ] [ ]
-2 | <-- Physical memory: ROM/RAM --> | | <-- Secondary storage (swap) --> |
- Input injection
- Parsing XML
- Assert statements
- Timing attacks
- A polluted site-packages or import path
- Temporary files
- Using yaml.load
- Pickles
- Using the system Python runtime and not patching it
- Not patching your dependencies
Plusieurs solutions:
- L’API C de CPython. Y’en a qu’ont essayé, ils ont eu des problèmes…
En utilisant cette méthode, vous aurez accès à tout l’interpréteur Python et vous pourrez tout faire… mais à quel prix ?
Je ne vous recommande pas cette approche, je me suis cassé les dents pendant 4 mois dessus avec un succès mitigé.- Cython est une bonne solution mais plus orienté sur l’optimisation de code.
- CFFI: le saint Graal, alléluia!
Il y a trois moyens d’utiliser CFFI, comprenez bien cela car c’est la partie tricky:
- Le mode ABI/Inline
- Le mode API/Out-of-line
- Le mode ABI/Out-of-line
TL;DR:
- List Resizing : the backing array is grown by approximately 12% (in Java ArrayList grows by 50% when expanded2 and in Ruby, Array grows by 100%). The Python implementation optimizes for memory usage over speed. Another reason to preallocate Python lists when possible.
- Inserting at the beginning of a list takes linear time. Sometimes, better use Deques which trade constant time insert and remove from both ends in exchange for constant time indexing.
By Russel Cohen
I often hear people who are happy because PyPy makes their code 2 times faster or so. Here is a short personal story which shows PyPy can go well beyond that.