metachris/pdfx

TypeError: '<' not supported between instances of 'tuple' and 'int'

Closed this issue · 6 comments

I get an error when passing a URL to PDFx.

Here is the traceback:

 File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfx/__init__.py", line 127, in __init__
    self.reader = PDFMinerBackend(self.stream)
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfx/backends.py", line 167, in __init__
    doc = PDFDocument(parser, password=password, caching=True)
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/pdfdocument.py", line 558, in __init__
    self.read_xref_from(parser, pos, self.xrefs)
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/pdfdocument.py", line 782, in read_xref_from
    xref.load(parser)
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/pdfdocument.py", line 235, in load
    (_, stream) = parser.nextobject()
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/psparser.py", line 582, in nextobject
    (pos, token) = self.nexttoken()
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/psparser.py", line 508, in nexttoken
    self.fillbuf()
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/psparser.py", line 232, in fillbuf
    if self.charpos < len(self.buf):
TypeError: '<' not supported between instances of 'tuple' and 'int'
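
The last frame compares `self.charpos` with `len(self.buf)`, so the message suggests that `charpos` held a tuple rather than an integer when `fillbuf` ran. Here is a minimal sketch of that failure mode, independent of pdfminer. The variable values are hypothetical stand-ins, not taken from the library:

```python
# Hypothetical stand-ins for the parser state; not pdfminer's real values.
charpos = (0, 0)      # a tuple where an integer position was expected
buf = b"%PDF-1.4"     # any bytes buffer will do

try:
    charpos < len(buf)  # same comparison shape as the failing line
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```

On Python 3.6 and later this should print the same `TypeError` text as the traceback above.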

A similar bug is reported in pdfminer/pdfminer.six#89.

When I apply the fix suggested in that issue, a new error appears:

TypeError: int() argument must be a string, a bytes-like object or a number, not 'PSKeyword'

  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfx/__init__.py", line 127, in __init__
    self.reader = PDFMinerBackend(self.stream)
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfx/backends.py", line 167, in __init__
    doc = PDFDocument(parser, password=password, caching=True)
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/pdfdocument.py", line 558, in __init__
    self.read_xref_from(parser, pos, self.xrefs)
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/pdfdocument.py", line 782, in read_xref_from
    xref.load(parser)
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/pdfdocument.py", line 235, in load
    (_, stream) = parser.nextobject()
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/psparser.py", line 624, in nextobject
    self.do_keyword(pos, token)
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/pdfparser.py", line 77, in do_keyword
    (objid, genno) = (int(objid), int(genno))
TypeError: int() argument must be a string, a bytes-like object or a number, not 'PSKeyword'

Along with this one:

 line 127, in __init__
    self.reader = PDFMinerBackend(self.stream)
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfx/backends.py", line 167, in __init__
    doc = PDFDocument(parser, password=password, caching=True)
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/pdfdocument.py", line 558, in __init__
    self.read_xref_from(parser, pos, self.xrefs)
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/pdfdocument.py", line 782, in read_xref_from
    xref.load(parser)
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/pdfdocument.py", line 235, in load
    (_, stream) = parser.nextobject()
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/psparser.py", line 582, in nextobject
    (pos, token) = self.nexttoken()
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/psparser.py", line 508, in nexttoken
    self.fillbuf()
  File "/home/siddhartha/anaconda3/envs/straikit/lib/python3.6/site-packages/pdfminer/psparser.py", line 238, in fillbuf
    raise PSEOF('Unexpected EOF')



File "/home/zy/miniconda3/envs/py36pc2t/lib/python3.6/site-packages/pdfminer/pdfinterp.py", line 248, in fillbuf
if self.charpos < len(self.buf):
TypeError: '<' not supported between instances of 'tuple' and 'int'

I also have this problem.

This is fixed in pdfminer.six by pdfminer/pdfminer.six#134.

Fixed in v1.4.1, thanks!
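
For anyone checking whether their environment already includes the fix, a rough sketch follows. It assumes that "v1.4.1" refers to a pdfx release and that version strings are plain dotted numbers. The helper names `version_tuple` and `has_fix` are made up for illustration:

```python
def version_tuple(version):
    """Turn a dotted version such as '1.4.1' into a tuple of ints.

    Assumes purely numeric components; suffixes like 'rc1' are not handled.
    """
    return tuple(int(part) for part in version.split(".")[:3])


def has_fix(installed, fixed_in="1.4.1"):
    """Return True if the installed version is at least the fixed version."""
    return version_tuple(installed) >= version_tuple(fixed_in)


print(has_fix("1.3.0"))
print(has_fix("1.4.1"))
```

The installed version string can be read from the output of `pip show pdfx`.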