readDNAStringSet fails with Legacy Mac (CR) line breaks
FabianRoger opened this issue · 2 comments
I got a help request from someone running iOS 10.12. The person couldn't figure out why readDNAStringSet didn't load the correctly formatted fasta file. After some troubleshooting I found out that the file contained Legacy Mac (CR) line breaks which apparently aren't recognised as line breaks by readDNAStringSet.
The function runs without error or warning but the result is meaningless (1, 0-width sequence).
Is it possible to support these line breaks or raise an informative error?
thanks for the great package!
Fabian
I didn't know you could run R/Bioconductor on iOS. Note that this is not a platform that we support or intend to support. In case you meant macOS 10.12, please note that starting with R 4.0, R and R/Bioconductor packages are only supported on macOS 10.13 (High Sierra) and higher.
the file contained Legacy Mac (CR) line breaks which apparently aren't recognised as line breaks by readDNAStringSet.
Unfortunately, CR line terminators break the most basic Unix tools like cat, more, wc, etc. They also break calls to the standard C library like fgets(), or to the zlib C library like gzgets(), both of which are used internally by readDNAStringSet(). So supporting these terminators would complicate readDNAStringSet()'s underlying C code significantly and would very likely introduce a slowdown.
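To illustrate the fgets() behaviour described above, here is a minimal standalone sketch. It is not taken from the Biostrings source; the helper name count_fgets_chunks is hypothetical. It writes a string to a temporary file and counts how many chunks fgets() hands back. Because fgets() only stops at LF, a CR-only file comes back as one long line, which would be consistent with the single 0-width sequence reported above (the whole file being taken as one header line).

```c
#include <stdio.h>

/* Hypothetical helper, for illustration only (not Biostrings code).
 * Writes `data` to a temporary file, reads it back with fgets(), and
 * returns the number of chunks fgets() produced, or -1 on I/O failure.
 * Assumes every line in `data` is shorter than the 256-byte buffer. */
int count_fgets_chunks(const char *data)
{
    FILE *fp = tmpfile();           /* opened in binary update mode */
    if (fp == NULL)
        return -1;

    if (fputs(data, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    rewind(fp);

    char buf[256];
    int n = 0;
    /* fgets() stops after '\n' or at end of file; '\r' is ordinary data. */
    while (fgets(buf, sizeof buf, fp) != NULL)
        n++;

    fclose(fp);
    return n;
}
```

Under these assumptions, count_fgets_chunks(">seq1\nACGT\n") yields two chunks (header and sequence), while count_fgets_chunks(">seq1\rACGT\r") yields a single chunk containing the whole file.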
I meant macOS 10.12. I don't know how frequent the problem is; I just realized that it wasn't an easy-to-troubleshoot error (because no error was raised). Is there any option for checking for unsupported line breaks and raising a warning? But I also understand if it's too much trouble for a possibly infrequent problem.
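As a rough idea of what such a check could look like at the C level, the sketch below scans a buffer for a CR that is not immediately followed by LF. This is only an assumption about one possible approach, not something readDNAStringSet() does; the function name has_bare_cr is hypothetical.

```c
#include <stddef.h>

/* Hypothetical check, for illustration only (not Biostrings code).
 * Returns 1 if `buf` contains a CR that is not immediately followed by
 * LF (a legacy Mac line break), 0 otherwise. CRLF pairs are accepted.
 * Assumes `buf` holds the complete data to inspect: a CR in the last
 * byte is reported as bare, so a caller reading in chunks would need
 * to handle a CRLF pair split across two chunks itself. */
int has_bare_cr(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r' && (i + 1 == len || buf[i + 1] != '\n'))
            return 1;
    }
    return 0;
}
```

A caller could run this on the first block of the file and emit a warning when it returns 1. Inspecting only the first block keeps the extra cost small, at the price of missing a bare CR that appears later in the file.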