desgeeko/pdfsyntax

readfile: TypeError: 'NoneType' object is not subscriptable

masc-it opened this issue · 1 comment

Hi! I am trying to load this PDF, but I get the following error. Any ideas?

I have tested other PDFs, and it happens every time.

Thanks for your work.

{
	"name": "TypeError",
	"message": "'NoneType' object is not subscriptable",
	"stack": "---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[3], line 1
----> 1 doc = readfile(\"/Users/mascit/Downloads/pdftests tests/dino copy.pdf\")

File ~/miniconda3/envs/pdftests/lib/python3.10/site-packages/pdfsyntax/api.py:69, in readfile(filename)
     67 \"\"\"Read file and initialize doc.\"\"\"
     68 with open(filename, 'rb') as file_obj:
---> 69     doc = load(file_obj, \"SINGLE\")
     70 return doc

File ~/miniconda3/envs/pdftests/lib/python3.10/site-packages/pdfsyntax/api.py:57, in load(file_obj, mode)
     55 \"\"\"Load from file.\"\"\"
     56 fdata = bdata_provider(file_obj, mode)
---> 57 return doc_constructor(fdata)

File ~/miniconda3/envs/pdftests/lib/python3.10/site-packages/pdfsyntax/api.py:50, in doc_constructor(fdata)
     48 cache = build_cache(fdata, index)
     49 doc_initial = Doc(index, cache, data)
---> 50 doc_new_rev = commit(doc_initial)
     51 return doc_new_rev

File ~/miniconda3/envs/pdftests/lib/python3.10/site-packages/pdfsyntax/docstruct.py:332, in commit(doc)
    330 def commit(doc: Doc) -> Doc:
    331     \"\"\"Add new index for incremental update.\"\"\"
--> 332     if len(changes(doc)) == 0:
    333         return doc
    334     nb_rev = len(doc.index)

File ~/miniconda3/envs/pdftests/lib/python3.10/site-packages/pdfsyntax/docstruct.py:175, in changes(doc, rev)
    173     previous = doc.index[rev-1]
    174 for i in range(1, len(current)):
--> 175     iref = get_iref(doc, i, rev)
    176     if i > len(previous)-1:
    177         res.append((iref, 'a'))

File ~/miniconda3/envs/pdftests/lib/python3.10/site-packages/pdfsyntax/docstruct.py:151, in get_iref(doc, o_num, rev)
    149 \"\"\"Build the relevant indirect reference for o_num in a doc revision.\"\"\"
    150 current = doc.index[rev]
--> 151 o_gen = current[o_num]['o_gen']
    152 return complex(o_gen, o_num)

TypeError: 'NoneType' object is not subscriptable"
}

Hi @masc-it!
Thank you for your feedback.
I have just pushed a fix. This was the first time I had encountered a PDF file with free entries (unused object numbers) in the first revision of an XREF table.
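
For readers who hit the same traceback, here is a minimal sketch of the failure mode. It is not the actual fix that was pushed. It assumes a simplified index in which each revision is a list indexed by object number and a free entry is stored as `None`, which is consistent with `current[o_num]['o_gen']` raising the `TypeError` above. The `get_iref` below is a hypothetical stand-in with a simplified signature, and `in_use_irefs` is invented for illustration.

```python
from typing import Optional

# Hypothetical, simplified stand-in for one revision of the index:
# list position = object number, value = entry dict, or None for a free entry.
Revision = list[Optional[dict]]


def get_iref(current: Revision, o_num: int) -> Optional[complex]:
    """Return the indirect reference for o_num, or None if the entry is free."""
    entry = current[o_num]
    if entry is None:
        # Free entry: there is no generation number to read.
        return None
    # Same encoding as in the traceback: complex(generation, object number).
    return complex(entry['o_gen'], o_num)


def in_use_irefs(current: Revision) -> list[complex]:
    """Collect references for every object number that is actually in use."""
    refs = []
    for o_num in range(1, len(current)):  # object 0 is skipped, as in changes()
        iref = get_iref(current, o_num)
        if iref is not None:
            refs.append(iref)
    return refs


# Object number 2 is a free entry in this made-up first revision.
revision = [None, {'o_gen': 0}, None, {'o_gen': 0}]
print(in_use_irefs(revision))  # [1j, 3j]
```

Without the `None` check, the lookup for object number 2 would fail with the same `'NoneType' object is not subscriptable` error reported in this issue.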