pikepdf

pikepdf is a Python library for reading and writing PDF files.

pikepdf is based on QPDF, a powerful PDF manipulation and repair library.

Python + QPDF = "py" + "qpdf" = "pyqpdf", which looks like a dyslexia test. Say it out loud, and it sounds like "pikepdf".

Python 3.5, 3.6 and 3.7 are fully supported.

To install:

pip install pikepdf

Key features:

  • Editing, manipulation and transformation of existing PDFs
  • Based on the mature, proven QPDF C++ library
  • Works with encrypted PDFs
  • Supports all PDF compression filters
  • Can create "fast web view" (linearized) PDFs
  • Creates standards-compliant PDFs that pass validation in other tools
  • Automatically repairs damaged PDFs, just like QPDF
  • Implements more of the PDF specification than existing Python PDF tools
  • IPython notebook and Jupyter integration

# Elegant, Pythonic API
import pikepdf

pdf = pikepdf.open('input.pdf')
num_pages = len(pdf.pages)
del pdf.pages[-1]
pdf.save('output.pdf')

pikepdf is documented and actively maintained. Commercial support is available.

Feature comparison

This library is similar to PyPDF2 and pdfrw: it provides low-level access to PDF features and allows editing and content transformation of existing PDFs. Some knowledge of the PDF specification may be helpful. It cannot render a PDF to an image.

Python 2.7 and earlier versions of Python 3 are not currently supported but support is probably not difficult to achieve. Pull requests are welcome.

| Feature | pikepdf | PyPDF2 | pdfrw |
|---|---|---|---|
| Editing, manipulation and transformation of existing PDFs | | | |
| Based on an existing, mature PDF library | QPDF | | |
| Implementation speed | C++ | Python | Python |
| PDF versions supported | 1.1 to 1.7 | 1.3? | 1.7 |
| Python versions supported | 3.5-3.7 | 2.6-3.6 | 2.6-3.6 |
| Supports password protected (encrypted) PDFs | ✔ (except public key) | Only obsolete RC4 | |
| Save and load PDF compressed object streams (PDF 1.5) | | | |
| Creates linearized ("fast web view") PDFs | | | |
| Actively maintained | commits | pypdf2-commits | pdfrw-commits |
| Test suite coverage | ~86% | very low | unknown |
| Creates PDFs that pass PDF validation tests | | | ? |
| Modifies PDF/A without breaking PDF/A compliance | | | ? |
| Automatically repairs PDFs with internal errors | | | |
| Documentation | | | |
| Integrates with Jupyter and IPython notebooks for rapid development | | | |

License

pikepdf is provided under the Mozilla Public License 2.0 (MPL), which can be found in the LICENSE file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license. We exclude Exhibit B, so pikepdf is compatible with secondary licenses. At your option, you may additionally distribute pikepdf under a secondary license.

Informally, MPL 2.0 is not a "viral" license. It may be combined with other work, including commercial software. However, you must disclose your modifications to pikepdf in source code form. In other words, fork this repository on GitHub or elsewhere and commit your contributions there, and you've satisfied the license.

The tests/resources/copyright file describes licensing terms for the test suite and the provenance of test resources.