Read comment
esteinig opened this issue · 4 comments
Hey, nice tool - quite useful to extend pyfaidx to Fastq. Is there any chance you could implement reading the comment on a read header?
Currently the only accessible attribute is read.name when iterating over pyfastx.Fastq.
Also, on that note, is there a function to write the complete read back to file, something like:
for read in fai:
    output.write(str(read))
This will write the sequence, but not the complete read.
Here is a simple Python function for now:
def build_read_string(read, fastq: bool = False, comment: str = None):
    """ Build read string from pyfastx read """
    if fastq:
        return f"@{read.name}{' '+comment if comment else ''}" \
               f"\n{read.seq}\n+\n{read.qual}"
    else:
        return f">{read.name}\n{read.seq}"
Good suggestion! In later versions, I will consider adding a ".raw" attribute to the read and sequence objects to get the raw string as it appears in the file. But I am not sure whether the read comment is important. In many fastq files, the comment line only contains a '+' character.
Thanks, that's great to hear! I was imprecise when I said comment, which was a reference to the pysam comment read attribute, containing the content after the read name. Sometimes it contains useful information, for example when generating Fastq files from nanopore basecalling:
@8dc817b4-9485-4b09-884f-c5b4fd741d75 runid=9e281aa698a86f2cde7f5c6db95cdfa8b3edd3ff read=58861 ch=178 start_time=2019-07-30T21:52:20Z
In this case it would be useful to be able to access the string after the @name, from the fields runid to start_time.
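To illustrate what I mean, here is a rough sketch in plain Python of splitting such a header line into the name and the comment, and then reading the key=value fields out of the comment. It does not use pyfastx at all; split_header and parse_comment_fields are hypothetical helper names I made up for this example, and it assumes the fields are whitespace-separated key=value tokens as in the nanopore header above.

```python
from typing import Dict, Tuple


def split_header(header: str) -> Tuple[str, str]:
    """Split a FASTQ header line into (name, comment).

    Hypothetical helper, not part of pyfastx.
    """
    # Drop the trailing newline and the leading '@', if present.
    text = header.rstrip("\r\n")
    if text.startswith("@"):
        text = text[1:]
    # The name ends at the first whitespace; everything after is the comment.
    parts = text.split(None, 1)
    name = parts[0] if parts else ""
    comment = parts[1] if len(parts) > 1 else ""
    return name, comment


def parse_comment_fields(comment: str) -> Dict[str, str]:
    """Collect key=value tokens from a comment; other tokens are skipped.

    Assumes whitespace-separated tokens (hypothetical helper).
    """
    fields = {}
    for token in comment.split():
        key, sep, value = token.partition("=")
        if sep and key:
            fields[key] = value
    return fields


header = (
    "@8dc817b4-9485-4b09-884f-c5b4fd741d75 "
    "runid=9e281aa698a86f2cde7f5c6db95cdfa8b3edd3ff "
    "read=58861 ch=178 start_time=2019-07-30T21:52:20Z"
)
name, comment = split_header(header)
fields = parse_comment_fields(comment)
print(name)
print(fields["read"], fields["ch"], fields["start_time"])
```

Comments that are not in key=value form (for example Illumina-style 1:N:0:ATCACG) would simply come back as an empty dict here, so the raw comment string is still the more general thing to expose.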