PacificBiosciences/pbbioconda

isoseq cluster2 crashes on "truncated" file generated by isoseq refine

Closed · 4 comments

Operating system
Which operating system and version are you using?
Red Hat Enterprise Linux 8.6 (Ootpa)

Conda environment

# Name                    Version                   Build  Channel
isoseq                    4.2.0                h9ee0642_0    bioconda
lima                      2.12.0               h9ee0642_1    bioconda
pbmm2                     1.16.0               h9ee0642_0    bioconda

Describe the bug
Hi, I am trying to process some Kinnex RNASeq data.
After isoseq refine, which finishes without error, isoseq cluster2 runs for a while and then stops, I think at the sorting step, complaining that the file generated by refine is corrupted. I tried running it from scratch a couple of times, but with no success. Could you suggest how to fix this, please? :)
Thank you in advance!

Error message and to reproduce
pacbioerror.txt

Your input is corrupt.

[E::bgzf_read_block] Invalid BGZF header at offset 7865347630
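
For anyone who wants to confirm the corruption independently of isoseq, here is a minimal sketch of a BGZF header scan in Python. It walks the file block by block, using the header layout from the SAM/BAM specification, and reports the first offset where a block header is invalid, plus whether the file ends with the standard 28-byte EOF marker. It is not how htslib does it and it does not verify the compressed payload or CRC; the function names `bgzf_block_size` and `scan_bgzf` are made up for this example.

```python
import struct

# Standard 28-byte empty BGZF block that marks end-of-file (SAM/BAM spec).
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff0600" "42430200"
    "1b000300" "00000000" "00000000"
)


def bgzf_block_size(header: bytes):
    """Return the total block size encoded in a BGZF header, or None if invalid."""
    if len(header) < 12:
        return None
    id1, id2, cm, flg, _mtime, _xfl, _os, xlen = struct.unpack("<BBBBIBBH", header[:12])
    # gzip magic, deflate method, and the FEXTRA flag must all be present.
    if (id1, id2, cm) != (31, 139, 8) or not flg & 4:
        return None
    extra = header[12:12 + xlen]
    if len(extra) < xlen:
        return None
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack("<BBH", extra[pos:pos + 4])
        if (si1, si2, slen) == (66, 67, 2):  # the 'BC' subfield holds BSIZE
            if pos + 6 > len(extra):
                return None
            (bsize,) = struct.unpack("<H", extra[pos + 4:pos + 6])
            return bsize + 1  # BSIZE is stored as total block size minus one
        pos += 4 + slen
    return None


def scan_bgzf(path):
    """Return (valid_blocks, first_bad_offset_or_None, ends_with_eof_marker)."""
    n_blocks, offset, last_block = 0, 0, b""
    with open(path, "rb") as fh:
        while True:
            fixed = fh.read(12)
            if not fixed:
                break  # reached the end of the file on a block boundary
            xlen = struct.unpack("<H", fixed[10:12])[0] if len(fixed) == 12 else 0
            header = fixed + fh.read(xlen)
            size = bgzf_block_size(header)
            # A block needs room for the header plus CRC32 and ISIZE (8 bytes).
            if size is None or size < len(header) + 8:
                return n_blocks, offset, False
            rest = fh.read(size - len(header))
            if len(header) + len(rest) != size:
                return n_blocks, offset, False  # file ends mid-block
            last_block = header + rest
            n_blocks += 1
            offset += size
    return n_blocks, None, last_block == BGZF_EOF
```

Calling `scan_bgzf` on the output of refine (the path depends on your run) from the node that runs cluster2 shows whether the reported offset is reproducible. If a second scan, or a scan from a different node, gives a different result, that points at the storage layer rather than at the file itself.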

Maybe there's an issue with your HPC? Is the data local or mounted via NFS?

Thanks for the reply!

Yes, the data is mounted via NFS. Do you perhaps have any pointers for checking whether it's an issue with the HPC?

Maybe the random access doesn't work properly. Is there any way to keep the intermediate files local to that node?
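
One way to try this, sketched in Python under the assumption that the node has a local scratch directory (often exposed as `$TMPDIR`, but that depends on the cluster): copy the input from NFS to local disk, check that the copy matches the source by checksum, and then run the tool on the local copy. The helper names `sha256sum` and `stage_local` are made up for this example, and a matching checksum does not rule out NFS problems, since client-side caching can hide them on a re-read.

```python
import hashlib
import shutil
from pathlib import Path


def sha256sum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_local(src, scratch_dir):
    """Copy src into scratch_dir and return the local path if checksums match."""
    src = Path(src)
    dest = Path(scratch_dir) / src.name
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dest)
    # Re-read both files; a mismatch means the copy or one of the reads went wrong.
    if sha256sum(src) != sha256sum(dest):
        raise OSError(f"checksum mismatch after copying {src} to {dest}")
    return dest
```

The returned path can then be passed to the next step in place of the NFS path, with the output written to the same scratch directory and copied back once the run has finished.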

I'll talk with our admin. Thank you for the help!