can't read alevin output from salmon 0.14.0
alexvpickering opened this issue · 8 comments
Tried with the latest version of tximport (1.13.3), installed with devtools::install_github("mikelove/tximport"):
tximport::tximport(file.path(alevin_dir, 'quants_mat.gz'), type = 'alevin')
Error in mat[, j] <- readBin(con, double(), endian = "little", n = num.genes) :
number of items to replace is not a multiple of replacement length
Probably related to this:
The binary output format of alevin, quants_mat.gz, has been changed into a sparse single precision format. In practice we saw the file size reduced to as big as half the size of the original file.
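To illustrate why a format change like this could produce the error above, here is a minimal base-R sketch. It does not reproduce the actual alevin layout: the gene count, the values, and the idea that only non-zero entries are stored as 4-byte floats are toy assumptions for illustration. It only shows that reading a shorter single-precision payload as dense 8-byte doubles yields fewer values than `num.genes`, so the column assignment fails.

```r
# Toy assumption: 5 genes, 4 of them non-zero in this cell (not real alevin data).
num.genes <- 5
nonzero <- c(3.5, 1.25, 2, 0.5)

# Store only the non-zero values as 4-byte (single precision) floats: 16 bytes.
payload <- writeBin(nonzero, raw(), size = 4, endian = "little")

# Read the bytes back the way a dense double-precision parser would.
# 16 bytes hold only two 8-byte doubles, so fewer than num.genes values return.
vals <- readBin(payload, double(), endian = "little", n = num.genes)

# Filling a column of length 5 with 2 values is an error for matrices.
mat <- matrix(0, nrow = num.genes, ncol = 1)
res <- tryCatch({
  mat[, 1] <- vals
  "ok"
}, error = function(e) conditionMessage(e))

# Reading with size = 4 recovers the stored values.
floats <- readBin(payload, numeric(), size = 4, endian = "little",
                  n = length(nonzero))
```

Here `res` holds the error message rather than "ok", while `floats` matches `nonzero`. A real parser would also need to know which genes the stored values belong to, which this sketch does not attempt.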
Hi @alexvpickering ,
Thanks for the issue.
We are still working on this; in the meantime, I will share a workable version soon.
Thanks for this report. We’ll take a look. @k3yavi and I have a meeting already for tomorrow to discuss the next iteration of the Alevin file format. And we can figure out how tximport needs to change to accommodate Salmon 0.14.
Hi @alexvpickering ,
Thanks again for raising the issue. As requested by one other user too, we have shared a basic (non-optimized) R parser for the alevin 0.14.0 output in this COMBINE-lab/salmon#380 thread. We are still working on optimizing the parser and integrating it with tximport, and will update you once it is stable.
Status update: I'm creating some test data for 0.14 so I can bring Avi's code into tximport.
I’ve got a little code but didn’t finish today and want to put in a unit test alongside the new code.
Thanks for notifying us quickly about this issue.
Should be fixed by 6f761a7
You can obtain the new code with install_github("mikelove/tximport"). If you can test that it works on your end, I'll push it to release as well.
I just pushed 1.12.2 to the release branch as well, so it gets circulated tomorrow.