fpdf2 is a simple & fast PDF creation library for Python that I have been maintaining since mid-2020.
In this article, I'm going to present some of the new features that landed since my last post on the subject.
Hence, this will cover versions 2.5.0, 2.5.1 & 2.5.2 of fpdf2.
I will also perform a quick comparison with the borb library.
https://github.com/pyfpdf/fpdf2/ Doc: https://pyfpdf.github.io/fpdf2/
Support for SVG (Scalable Vector Graphics)
Thanks to @torque, who worked on this over several months and produced very high quality code, fpdf2 now supports embedding SVG files!
He actually started by implementing a very large part of the PDF drawing API, which makes it possible to compose arbitrary sequences of paths, lines and curves: fpdf2 Drawing API documentation.
As a direct consequence, it became possible to build a direct SVG-to-PDF converter, which he did!
SVG files can now be directly added to a PDF file using the image() method: fpdf2 SVG documentation.
For security reasons, with the addition of this feature we added a new dependency to fpdf2: defusedxml, used to check that embedding an SVG file does not trigger a denial of service, for example a Billion laughs attack.
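To give an idea of the kind of protection defusedxml offers, here is a small standalone sketch. It is not fpdf2's internal code: the is_safe_svg() helper and both sample documents are hypothetical, written only to show defusedxml rejecting entity declarations of the sort used by a Billion laughs payload.

```python
from defusedxml import EntitiesForbidden
from defusedxml.ElementTree import fromstring

# Harmless SVG document (illustrative)
HARMLESS = '<svg xmlns="http://www.w3.org/2000/svg"><rect width="1" height="1"/></svg>'

# Scaled-down "Billion laughs" style payload: nested entity definitions
BOMB = (
    "<!DOCTYPE lolz [\n"
    '  <!ENTITY lol "lol">\n'
    '  <!ENTITY lol2 "&lol;&lol;&lol;&lol;">\n'
    '  <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;">\n'
    "]>\n"
    '<svg xmlns="http://www.w3.org/2000/svg"><text>&lol3;</text></svg>'
)


def is_safe_svg(svg_source):
    """Hypothetical helper: False if the XML declares entities, True otherwise."""
    try:
        fromstring(svg_source)  # entity declarations are forbidden by default
    except EntitiesForbidden:
        return False
    return True


print(is_safe_svg(HARMLESS), is_safe_svg(BOMB))
```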
While the SVG converter has some limitations, it is able, for example, to perfectly render the famous SVG example Ghostscript_Tiger.svg to PDF: Ghostscript_Tiger.pdf.
Other features
Two other useful features were added by Georg Mischler:
- support for soft-hyphen (\u00ad) breaks in write(), cell() & multi_cell(), cf. documentation on line breaks
- new parameters new_x and new_y were introduced for the cell() and multi_cell() methods, in order to make the cursor position after cell rendering a lot more intuitive & user-friendly, cf. related documentation
Georg Mischler also took some time to revise the whole structure of the documentation, making it a lot more user-friendly. Thanks! 🙏
I also contributed a few extra functionalities:
- a new add_highlight() method to insert highlight annotations: documentation
- support for new PDF properties: .text_mode (documentation) & .blend_mode (documentation)
- new round_clip() & elliptic_clip() image clipping methods: documentation (not released yet, planned for v2.5.3)
Usage examples with other libs
A few additions were made to the documentation to provide usage examples of fpdf2 with other libraries:
- with Django, Flask, streamlit, AWS lambdas: cf. documentation
- with Matplotlib to embed charts & equations: cf. documentation
- with SQLAlchemy to store PDFs in a database: cf. documentation
- with pdfrw to modify existing PDFs: cf. documentation
Would you like other examples to be provided?
If so, drop a comment at the bottom of the page, or open a discussion / issue,
explaining which library you would like to combine with fpdf2 😊
Deprecation notice
First, DeprecationWarning messages are not displayed by Python by default.
Hence, every time you use a newer version of fpdf2, we strongly encourage you to execute your scripts with the -Wd option (cf. documentation) in order to get warned about deprecated features used in your code.
This can also be enabled programmatically with warnings.simplefilter('default', DeprecationWarning).
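Here is a small, standard-library-only sketch of the programmatic route. The legacy_api() function is a hypothetical stand-in for any deprecated call, and catch_warnings(record=True) is used only so that the effect of the filter can be inspected within the example.

```python
import warnings


def legacy_api():
    """Hypothetical stand-in for a deprecated library call."""
    warnings.warn("legacy_api() is deprecated", DeprecationWarning, stacklevel=2)


with warnings.catch_warnings(record=True) as caught:
    # Same filter as the one mentioned above
    warnings.simplefilter("default", DeprecationWarning)
    legacy_api()

messages = [str(w.message) for w in caught]
print(messages)
```

In a real script, the simplefilter() call would typically sit near the top of the entry point, before the library is used.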
Now, here are the notable recent API changes in fpdf2:
- the font caching mechanism, which used the pickle module, has been removed, for security reasons and because it provided little performance gain (cf. issue #345). That means that the font_cache_dir optional parameter of the fpdf.FPDF constructor and the uni optional argument of add_font() are deprecated: uni=True can now be removed from all calls to add_font().
- the ln parameter of cell() and multi_cell() is now deprecated: use new_x and new_y instead.
borb
In November of 2020, Joris Schellekens released another excellent pure-Python library dedicated to reading & writing PDFs: borb. He even wrote a very detailed e-book about it, publicly available here: borb-examples.
In many ways, borb excels in areas where fpdf2 has gaps:
- it has a very clean and well-structured code API, with well-defined PDF primitive data-types and type hints (checked with mypy),
- it offers several options for page layout,
- it can parse PDF files and even extract tables,
- it even allows you to insert forms or Javascript code.
If you ever want to combine borb and fpdf2, we provide some guidance on doing so: documentation.
borb vs fpdf2
I have 2 intents in drawing this comparison:
- help Python coders choose the library that best fits their needs
- figure out whether fpdf2 is indeed the faster of the 2 libraries, as I suspect 😁
First, there are a couple of features that only fpdf2 offers: SVG support (borb rasterizes .svg files into pixelated images) and some useful methods to generate a table of contents.
On the other hand, borb offers many other features not provided by fpdf2...
So from the point of view of functionality, borb is much more complete.
Second, I think fpdf2's CI/CD pipeline is a bit more powerful (YAML source / GitHub Actions pipeline execution):
we run hundreds of unit tests based on PDF reference files, with 3 validators checking the PDF files generated,
and we test all this with the 4 latest versions of Python 3. We also use Pylint & bandit.
borb's current CI/CD pipeline does none of this, while the number of unit tests of the two libraries is comparable (346 for borb, 390 for fpdf2).
Added to the fact that fpdf2 has been in use for a longer time (since 2006), I'd say that fpdf2 is slightly more robust than borb.
Third and finally, let's benchmark! 💥💨
I wrote the following script to compare the performance of borb & fpdf2 on a specific usage scenario: benchmark_borb_vs_fpdf2.py.
Here is its execution result on my computer:
Speed benchmark: how much time each lib takes to generate a 10 thousands pages PDF with ~180 distinct images?
(disclaimer: the author of this benchmark is fpdf2 current maintainer)
Versions tested: borb version 2.0.24 VS fpdf2 v2.5.2
Benchmarking fpdf2...
Memory usage peak (resource.ru_maxrss): 47MB
Generated PDF file size: 2.68MB
Duration: 1.87s
Benchmarking borb...
Memory usage peak (resource.ru_maxrss): 646MB
Generated PDF file size: 0.36MB
Duration: 71.35s
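For readers curious about how figures like these can be collected, here is a simplified sketch using only the standard library. It is not the actual benchmark script: measure() and fake_build() are hypothetical names, the workload is a placeholder, and the resource module is only available on Unix-like systems.

```python
import resource
import sys
import time


def measure(build_pdf):
    """Run build_pdf() once; report duration, output size and peak memory."""
    start = time.perf_counter()
    pdf_bytes = build_pdf()
    duration = time.perf_counter() - start
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS
    peak_mb = peak / (1024 * 1024 if sys.platform == "darwin" else 1024)
    return {
        "duration_s": duration,
        "size_mb": len(pdf_bytes) / 1e6,
        "peak_rss_mb": peak_mb,
    }


def fake_build():
    # Placeholder workload, standing in for a real PDF generation routine
    return b"%PDF-" + bytes(1_000_000)


report = measure(fake_build)
print(
    f"Memory usage peak: {report['peak_rss_mb']:.0f}MB, "
    f"size: {report['size_mb']:.2f}MB, duration: {report['duration_s']:.2f}s"
)
```

Note that ru_maxrss is the peak for the whole lifetime of the process, so a fair comparison needs each library to be measured in its own process.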
Here are the resulting PDF files:
Now, what can we conclude in this usage scenario?
- fpdf2 is faster than borb (by a factor of ~40)
- fpdf2 uses less memory than borb (by a factor of ~15)
- borb produces smaller PDFs than fpdf2 (by a factor of ~7), but at a cost: if you check the files produced, the images in the PDF made with borb contain visible artifacts due to compression
Finally, while crafting this benchmark script, I triggered several crashes of borb.
They are documented as comments in the script (it did not like some PNG files, and opening many images caused an OSError: [Errno 24] Too many open files)
and I think this supports my previous analysis regarding fpdf2's robustness compared to borb.
I hope that this comparison will be useful to pythonistas around here.
I'd like to stress that I really admire & respect Joris Schellekens' work on borb,
and that this analysis may be biased by the fact that I'm fpdf2's current maintainer.
I encourage every reader to make their own list of criteria that matter, depending on their own project.
That's it for today regarding fpdf2!
You can also check the detailed CHANGELOG for an exhaustive list of all changes: bug fixes, other minor improvements and deprecation notices.
I'd like to give a shout-out to all fpdf2 contributors, and especially @torque & Georg Mischler,
for continually improving this library through code contributions, bug reports, improved documentation, translations, etc.
I'd like to end this article with an announcement: after making Undying Dusk last year,
I am now working on a new PDF game! It will be made using fpdf2 again, but this time I'm working with French illustrator Elliot Jolivet aka Tenseï
who will provide all the game visuals. More about this soon!
Now, I wish you all a lot of fun building PDFs with fpdf2 & borb!