py-pdf/pypdf

A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files

Language: Python
License: NOASSERTION

Issues

Shutdown of Apache Tika Corpora
#3035 opened 2 days ago by stefan6419846
8
Assertion fails when adding metadata after cloning document root
#3036 opened a day ago by juwu-odoo
4
Support for PANTONE colors
#3033 opened 2 days ago by stefan6419846
4
Improve handling of LZW decoder table overflow
#3032 opened 2 days ago by stefan6419846
0
Assert fails when getting the mediabox property for certain PDFs
#2991 opened a month ago by Paethon
13
Intermittent `IndexError` when accessing `PdfReader.pages` with `ThreadPoolExecutor`
#3024 opened 12 days ago by blairfrandeen
3
Question about documentation for xmp_metadata.dc_description and xmp_metadata.dc_subject
#3023 opened 15 days ago by dfkettle
7
TypeError when extracting text from PDF: Unsupported operand type(s) for '/' (IndirectObject and float)
#3020 opened 16 days ago by HEKUCHAN
1
pypdf.errors.PdfReadError: startxref not found
#3017 opened 19 days ago by neeraj9
1
Crash during page text extraction
#2975 opened 19 days ago by neeraj9
1
ValueError: Ascii85 encoded byte sequences must end with b'~>'
#2996 opened 22 days ago by neeraj9
7
TypeError: unhashable type: 'ArrayObject' when reading inline images
#2998 opened 22 days ago by neeraj9
1
Generated single page PDF is huge
#3011 opened 22 days ago by Vafilor
0
Refactor regular text extraction into dedicated module
#3010 opened 22 days ago by stefan6419846
0
binascii.Error: Non-hexadecimal digit found extracting CMap
#2997 opened a month ago by neeraj9
6
`PageObject.transfer_rotation_to_content()` hides some content since pypdf 4.3.0
#2927 opened 23 days ago by stefan6419846
5
AttributeError: 'DictionaryObject' object has no attribute 'get_data'
#2995 opened a month ago by neeraj9
3
PdfReadError: Image data is not rectangular
#2993 opened a month ago by Verdant31
2
Collapsing outlines not working: parameter is_open in add_outline_item has no effect
#2994 opened a month ago by dowo-2987
0
Capitalization in metadata
#2992 opened a month ago by dfkettle
4
Update namespace links in xmp.py
#2951 opened a month ago by j-t-1
6
Text visitor example in docs does not work
#2881 opened a month ago by lucasgadams
6
Typing issue when updating from 3.9.1, suggested documentation update
#2949 opened a month ago by thomas-forte
5
Exception on indirect object during text extraction
#2966 opened a month ago by nsw42
0
PdfWriter().append throwing 'NullObject' object is not subscriptable for a specific PDF file
#2958 opened 2 months ago by eth-wa
1
Transferred Annotations not Rendering Correctly
#2960 opened 2 months ago by eth-wa
5
Local variable 'v' referenced before assignment causing exception.
#2959 opened 2 months ago by ajrlewis
2
Unable to load some certain PDFs using the latest version.
#2948 opened 2 months ago by AliFaridCollide
1
`DocumentInformation.title` sometimes return `bytes` instead of `str`
#2929 opened 2 months ago by reformy
5
Undefined variable in text extraction with version 5.1.0
#2925 opened 2 months ago by thomasht86
1
`UnboundLocalError` error when extracting text
#2933 opened 2 months ago by vodkar
1
'extract_text' text matrix seems to be sometimes broken with v5.1.0
#2932 opened 2 months ago by remi-braun
1
Inverted colors when extracting CMYK image
#2931 opened 2 months ago by AnzhiZhang
1
#7 Using PdfReader causes a crash
#2886 opened 3 months ago by Avgor46
2
PdfWriter.write() in context manager closes stream when it should not
#2905 opened 2 months ago by alexaryn
5
ENH: Ensure PyPI marks URLs as "verified"
#2892 opened 3 months ago by MartinThoma
6
Regression when reading partially broken PDF files
#2926 opened 2 months ago by stefan6419846
0
DEV: Switch to latest pinned dependencies
#2914 opened 3 months ago by stefan6419846
5
Add an argument ``layout_mode_height_weight`` to control inference of vertical space when extracting text in layout mode
#2915 opened 3 months ago by hpierre001
1
DEV: Mirror freely licensed arXiv documents locally
#2904 opened 3 months ago by stefan6419846
3
Cloning errors when using context manager
#2912 opened 3 months ago by pubpub-zz
2
Images merged between pages
#2923 opened 3 months ago by pprados
1
How to remove watermark with pypdf2
#2916 opened 3 months ago by Estelle-gqy
0
Fails to convert date to date object if not in correct ISO format
#2908 opened 3 months ago by jojo2357
4
How to extract internal links using PyPDF
#2910 opened 3 months ago by swathiJayav
1
PdfReadError: Too many lookup values while extracting image
#2889 opened 3 months ago by michelcrypt4d4mus
1
Adobe Requires Annotation Flag for Printing Annotations
#2896 opened 3 months ago by MarleTangible
3
#6 Using PdfReader causes a crash
#2875 opened 3 months ago by Avgor46
1
`PdfReader` causes memory overflow for a particular PDF
#2876 opened 3 months ago by JaMe76
7
BUG: infinite loop on damaged pdf file
#2877 opened 3 months ago by pubpub-zz
3