lgautier/fastq-and-furious

TypeError: must be str, not bytes, Python 3.7.3

evanbiederstedt opened this issue · 1 comments

Hi there

I suspect this may be a Python 3.x error. I'm using Python 3.7.3, and I get the same result whether I install via pip or from the GitHub repo. Running the following code:


from fastqandfurious.fastqandfurious import entryfunc
from fastqandfurious import fastqandfurious

myFastq = "a/fastq/file.fq"

bufsize = 20000
with open(myFastq) as fh:
    it = fastqandfurious.readfastq_iter(fh, bufsize, entryfunc)
    for sequence in it:
        print(sequence)

Here is the error:


Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "/Users/evanbiederstedt/Library/Python/3.7/lib/python/site-packages/fastqandfurious/fastqandfurious.py", line 191, in readfastq_iter
    npos = _entrypos(blob, offset, posbuffer)
  File "/Users/evanbiederstedt/Library/Python/3.7/lib/python/site-packages/fastqandfurious/fastqandfurious.py", line 53, in _entrypos
    headerbeg_i = blob.find(b'@', offset)
TypeError: must be str, not bytes

The error is raised in _entrypos(), where blob.find() is called with a bytes pattern: https://github.com/lgautier/fastq-and-furious/blob/master/src/fastqandfurious.py#L65

def _entrypos(blob, offset, posbuffer):
    posbuffer[:] = ARRAY_INIT
    lblob = len(blob)
    # header
    headerbeg_i = blob.find(b'@', offset)
    posbuffer[0] = headerbeg_i
    ...

Perhaps this is a new issue? Let me know if I can provide more details or help debug.
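For what it's worth, the failing call seems to be reproducible without the library at all. This is only a minimal sketch: the FASTQ record and the variable names below are made up for illustration, and I'm assuming the blob passed to _entrypos() is simply whatever the file handle returns when read.

```python
# Stand-in for the buffer passed to _entrypos(); the record itself is made up.
text_blob = "@read1\nACGT\n+\nIIII\n"     # what a text-mode handle yields (str)
bytes_blob = text_blob.encode("ascii")    # what a binary-mode handle yields (bytes)

# str.find() rejects a bytes pattern, which matches the TypeError in the traceback.
try:
    text_blob.find(b"@", 0)
    text_mode_fails = False
except TypeError:
    text_mode_fails = True

# bytes.find() accepts the same pattern and locates the header marker.
header_pos = bytes_blob.find(b"@", 0)

print(text_mode_fails, header_pos)
```

If this holds, the type of the data read from the file handle decides whether the b'@' search works, rather than anything specific to Python 3.7.3.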

So, this is also a README error, which is sort of obvious in retrospect.

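The error is due to how the FASTQ file is opened: it must be opened in binary mode. _entrypos() searches the buffer with a bytes pattern (b'@'), but a file opened in text mode yields str data, and str.find() does not accept a bytes argument. The following should fix the issue, using with open(myFastq, 'rb') as fh: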

from fastqandfurious.fastqandfurious import entryfunc
from fastqandfurious import fastqandfurious

myFastq = "a/fastq/file.fq"

bufsize = 20000
# Open in binary mode ('rb') so the parser receives bytes, not str.
with open(myFastq, 'rb') as fh:
    it = fastqandfurious.readfastq_iter(fh, bufsize, entryfunc)
    for sequence in it:
        print(sequence)
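Until the README is updated, a small guard in calling code could catch this mistake with a clearer message. This is just a sketch: require_binary_handle() is a hypothetical helper, not part of fastqandfurious, and it assumes that any handle derived from io.TextIOBase yields str.

```python
import io


def require_binary_handle(fh):
    """Return fh unchanged, or raise if it is a text-mode handle.

    Hypothetical helper, not part of fastqandfurious.
    """
    # Handles from open(path) in text mode, and io.StringIO, derive from io.TextIOBase.
    if isinstance(fh, io.TextIOBase):
        raise TypeError("FASTQ file must be opened in binary mode, e.g. open(path, 'rb')")
    return fh


# In-memory stand-ins for the two ways of opening a file.
binary_handle = io.BytesIO(b"@read1\nACGT\n+\nIIII\n")
text_handle = io.StringIO("@read1\nACGT\n+\nIIII\n")

accepted = require_binary_handle(binary_handle) is binary_handle

try:
    require_binary_handle(text_handle)
    rejected = False
except TypeError:
    rejected = True
```

One would then pass require_binary_handle(fh) instead of fh to readfastq_iter(), so the mistake surfaces before any parsing starts.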

I'll revise in #6