fpdf2.5.2 : SVG support and borb

fpdf2 is a simple & fast PDF creation library for Python that I have been maintaining since mid-2020.

In this article, I'm going to present some of the new features that have landed since my last post on the subject. This will cover versions 2.5.0, 2.5.1 & 2.5.2 of fpdf2. I will also make a quick comparison with the borb library.

Source code: https://github.com/pyfpdf/fpdf2/ · Doc: https://pyfpdf.github.io/fpdf2/

Support for SVG (Scalable Vector Graphics)

Thanks to @torque who worked on this over several months and produced very high quality code, fpdf2 now supports embedding SVG files!

He actually started by implementing a very large part of the PDF drawing API, which makes it possible to compose arbitrary sequences of paths, lines and curves: fpdf2 Drawing API documentation.

As a direct consequence, it became possible to build a direct SVG-to-PDF converter, which he did!

SVG files can now be directly added to a PDF file using the image() method: fpdf2 SVG documentation.

For security reasons, this feature comes with a new dependency for fpdf2: defusedxml, used to check that embedding an SVG file does not trigger a denial of service, for example a billion laughs attack.

While the SVG converter has some limitations, it is able, for example, to perfectly render the famous SVG example Ghostscript_Tiger.svg to PDF: Ghostscript_Tiger.pdf.

Ghostscript Tiger PDF preview

Other features

Two other useful features were added by Georg Mischler:

  • support for soft-hyphen (\u00ad) line breaks in write(), cell() & multi_cell(), cf. documentation on line breaks
  • new parameters new_x and new_y were introduced for cell() and multi_cell() methods, in order to make cursor position after cell-rendering a lot more intuitive & user-friendly, cf. related documentation

Georg Mischler also took some time to revise the whole structure of the documentation, making it a lot more user-friendly. Thanks! 🙏

I also contributed a few extra functionalities:

  • a new add_highlight() method to insert highlight annotations: documentation
  • support for new PDF properties: .text_mode (documentation) & .blend_mode (documentation)
  • new round_clip() & elliptic_clip() image clipping methods: documentation (not released yet, planned for v2.5.3)

Usage examples with other libs

A few additions were made to the documentation to provide usage examples of fpdf2 with other libraries:

Would you like other examples to be provided? If so, drop a comment at the bottom of the page, or open a discussion / issue, explaining which library you would like to combine fpdf2 with 😊

Deprecation notice

First, DeprecationWarning messages are not displayed by Python by default.

Hence, every time you use a newer version of fpdf2, we strongly encourage you to execute your scripts with the -Wd option (cf. documentation) in order to get warned about deprecated features used in your code. This can also be enabled programmatically with warnings.simplefilter('default', DeprecationWarning).
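
The programmatic option can be tried out with the standard library alone. In this sketch, old_api() is a hypothetical stand-in for any deprecated call, and warnings.catch_warnings(record=True) is used only so that the warning can be inspected instead of being printed to stderr.

```python
import warnings


def old_api():
    """Hypothetical stand-in for a deprecated library call."""
    warnings.warn(
        "old_api() is deprecated, use new_api() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return "result"


# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    # Same filter as suggested above: report DeprecationWarning messages.
    warnings.simplefilter("default", DeprecationWarning)
    value = old_api()

messages = [
    str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)
]
print(messages)
```

In a real script, the simplefilter() call alone, placed before the library is used, is enough to make the deprecation messages visible.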

Now, here are the notable recent API changes in fpdf2:

  • the font caching mechanism, which used the pickle module, has been removed, for security reasons and because it provided little performance gain (cf. issue #345). That means that the font_cache_dir optional parameter of the fpdf.FPDF constructor and the uni optional argument of add_font() are deprecated: uni=True can now be removed from all calls to add_font().

  • the parameter ln to cell() and multi_cell() is now deprecated: use new_x and new_y instead.

borb

In November of 2020, Joris Schellekens released another excellent pure-Python library dedicated to reading & writing PDFs: borb. He even wrote a very detailed e-book about it, publicly available here: borb-examples.

In many ways, borb excels in areas where fpdf2 has gaps: it has a very clean and well-structured code API, with well-defined PDF primitive data-types and type hints (checked with mypy); it offers several options for page layout; it can parse PDF files and even extract tables; and it even allows you to insert forms or JavaScript code.

If you ever want to combine borb and fpdf2, we provide some guidance on doing so: documentation.

borb vs fpdf2

I have 2 intents in drawing this comparison:

  • help Python coders choose the library that best fits their needs
  • figure out whether fpdf2 is indeed the faster of the 2 libraries, as I suspect 😁

First, there are a couple of features that only fpdf2 offers: SVG support (borb rasterizes .svg files to pixelated images) and some useful methods to generate a table of contents. On the other hand, borb offers many other features not provided by fpdf2... So from a functionality standpoint, borb is much more complete.

Second, I think fpdf2's CI/CD pipeline is a bit more powerful (YAML source / GitHub Actions pipeline execution): we run hundreds of unit tests based on PDF reference files, with 3 validators checking the generated PDF files, and we test all this with the 4 latest versions of Python 3. We also use Pylint & bandit. borb's CI/CD pipeline currently does none of this, although the number of unit tests in the two libraries is comparable (346 for borb, 390 for fpdf2). Combined with the fact that fpdf2 has been in use for a longer time (since 2006), I'd say that fpdf2 is slightly more robust than borb.

Third and finally, let's benchmark! 💥💨

I wrote the following script to compare borb & fpdf2 performances on a specific usage scenario: benchmark_borb_vs_fpdf2.py.

Here is its execution result on my computer:

Speed benchmark: how much time each lib takes to generate a 10 thousands pages PDF with ~180 distinct images?
  (disclaimer: the author of this benchmark is fpdf2 current maintainer)
Versions tested: borb version 2.0.24 VS fpdf2 v2.5.2

Benchmarking fpdf2...
Memory usage peak (resource.ru_maxrss): 47MB
Generated PDF file size: 2.68MB
Duration: 1.87s

Benchmarking borb...
Memory usage peak (resource.ru_maxrss): 646MB
Generated PDF file size: 0.36MB
Duration: 71.35s
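
For readers who want to take similar measurements themselves, here is a minimal, standard-library-only sketch of timing a task and reading the peak memory usage through resource.ru_maxrss. It is not the benchmark script itself: measure() and dummy_task() are hypothetical names, and the resource module is only available on Unix-like systems.

```python
import resource
import sys
import time


def measure(task):
    """Run task() once and return (duration in seconds, peak RSS in MB)."""
    start = time.perf_counter()
    task()
    duration = time.perf_counter() - start
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return duration, peak / divisor


def dummy_task():
    # Hypothetical placeholder workload standing in for PDF generation.
    return sum(i * i for i in range(100_000))


duration, peak_mb = measure(dummy_task)
print(f"Memory usage peak (resource.ru_maxrss): {peak_mb:.0f}MB")
print(f"Duration: {duration:.2f}s")
```

Note that ru_maxrss is a high-water mark for the whole process lifetime, so two libraries measured one after the other in the same process would share the higher peak; running each measurement in its own process avoids this.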

Here are the resulting PDF files:

Now, what can we conclude in this usage scenario?

  • fpdf2 is faster than borb (by a factor of ~38)
  • fpdf2 uses less memory than borb (by a factor of ~14)
  • borb produces smaller PDFs than fpdf2 (by a factor of ~7), but at a cost: if you check the files produced, the images in the PDF made with borb contain visible artifacts due to compression
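
These factors can be recomputed from the figures reported in the benchmark output. The short calculation below only uses those numbers; the dictionary and variable names are arbitrary.

```python
# Figures copied from the benchmark output above.
fpdf2 = {"duration_s": 1.87, "memory_mb": 47, "size_mb": 2.68}
borb = {"duration_s": 71.35, "memory_mb": 646, "size_mb": 0.36}

speed_ratio = borb["duration_s"] / fpdf2["duration_s"]
memory_ratio = borb["memory_mb"] / fpdf2["memory_mb"]
size_ratio = fpdf2["size_mb"] / borb["size_mb"]

print(f"borb takes {speed_ratio:.1f}x longer")          # 38.2
print(f"borb uses {memory_ratio:.1f}x more memory")     # 13.7
print(f"fpdf2 files are {size_ratio:.1f}x larger")      # 7.4
```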

Finally, while crafting this benchmark script, I triggered several crashes of borb. They are documented as comments in the script (it did not like some PNG files, and opening many images caused an OSError: [Errno 24] Too many open files), and I think this supports my previous analysis regarding fpdf2's robustness compared to borb.

I hope that this comparison will be useful to pythonistas around here. I'd like to stress that I really admire & respect Joris Schellekens' work on borb, and that this analysis may be biased by the fact that I'm fpdf2's current maintainer. I encourage every reader to make their own list of criteria that matter, depending on their own project.


That's it for today regarding fpdf2! You can also check the detailed CHANGELOG for an exhaustive list of all changes: bug fixes, other minor improvements and deprecation notices.

I'd like to give a shout-out to all fpdf2 contributors, and especially @torque & Georg Mischler, for continually improving this library through code contributions, bug reports, improved documentation, translations, etc.

Thank you!

I'd like to end this article with an announcement: after making Undying Dusk last year, I am now working on a new PDF game! It will be made using fpdf2 again, but this time I'm working with French illustrator Elliot Jolivet aka Tenseï who will provide all the game visuals. More about this soon!

Now, I wish you all a lot of fun building PDFs with fpdf2 & borb!