'PDFObjRef' object has no attribute '__getitem__'
andreicalistru commented
Hello,
I'm trying to parse some PDF files using pdfquery, and for a few of them (not all) I get the following error:
  File "my_path/my_script.py", line 244, in set_description
    pdf.load()
  File "/my_path/.virtualenvs/dev/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 373, in load
    self.tree = self.get_tree(*_flatten(page_numbers))
  File "/my_path/.virtualenvs/dev/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 475, in get_tree
    for n, page in pages:
  File "/my_path/.virtualenvs/dev/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 596, in <genexpr>
    return (self.get_layout(page) for page in self._cached_pages())
  File "/my_path/.virtualenvs/dev/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 591, in get_layout
    layout = self._add_annots(layout, page.annots)
  File "/my_path/.virtualenvs/dev/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 639, in _add_annots
    annot['URI'] = annot['A']['URI']
TypeError: 'PDFObjRef' object has no attribute '__getitem__'
Below are a few of the PDFs that raise the above error:
http://www.genomecanada.ca/medias/pdf/en/genomesciencescentrebc.pdf
http://www.genomecanada.ca/medias/pdf/fr/genomesciencescentrebc.pdf
http://www.genomecanada.ca/medias/pdf/en/universityvictoria.pdf
http://www.genomecanada.ca/medias/pdf/fr/universityvictoria.pdf
http://www.genomecanada.ca/medias/pdf/fr/centreforappliedgenomicsogi.pdf
Maybe someone will be able to find a fix for it?
Thanks!
jcushman commented
Thanks for the report. This is fixed in v.0.4.1.
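For anyone reading the traceback above and wondering what went wrong: the failing line subscripts annot['A'] directly, but in these PDFs that entry appears to be an indirect reference (a PDFObjRef) rather than a plain dict, so it has to be resolved first. The sketch below illustrates that idea only; it is not necessarily how v0.4.1 fixes it. StubRef, deref and annot_uri are hypothetical names introduced here so the example runs without pdfminer installed; with pdfminer itself, pdfminer.pdftypes.resolve1 does the dereferencing.

```python
class StubRef(object):
    """Hypothetical stand-in for pdfminer's PDFObjRef (illustration only)."""

    def __init__(self, target):
        self._target = target

    def resolve(self, default=None):
        # A real PDFObjRef looks the object up in the document;
        # this stub simply returns the object it wraps.
        return self._target


def deref(obj):
    """Follow indirect references until a concrete object is reached."""
    while hasattr(obj, "resolve"):
        obj = obj.resolve()
    return obj


def annot_uri(annot):
    """Return the annotation's link URI, or None if it has no URI action."""
    annot = deref(annot)
    if not isinstance(annot, dict):
        return None
    # The /A (action) entry may be stored inline or as an indirect reference.
    action = deref(annot.get("A"))
    if not isinstance(action, dict):
        return None
    return deref(action.get("URI"))


# Assumed sample data: one inline action, one stored behind references.
direct = {"A": {"URI": "http://example.com/a"}}
indirect = {"A": StubRef({"URI": StubRef("http://example.com/b")})}

print(annot_uri(direct))
print(annot_uri(indirect))
print(annot_uri({"Subtype": "Text"}))
```

Resolving at each level (the annotation, its action, and the URI value) keeps the lookup from failing whether a given PDF stores those entries inline or by reference, and annotations without a URI action are skipped instead of raising.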