thelovelab/tximport

can't read alevin output from salmon 0.14.0

alexvpickering opened this issue · 8 comments

Tried with the latest version of tximport (1.13.3), installed with devtools::install_github("mikelove/tximport"):

tximport::tximport(file.path(alevin_dir, 'quants_mat.gz'), type = 'alevin')

Error in mat[, j] <- readBin(con, double(), endian = "little", n = num.genes) : 
  number of items to replace is not a multiple of replacement length 

Probably related to this:

The binary output format of alevin, quants_mat.gz, has been changed into a sparse single precision format. In practice, we saw the file size reduced to as little as half the size of the original file.
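To illustrate why a format change like this could produce the error above, here is a minimal sketch in base R. It does not use alevin's actual file layout, which is not described in this thread: the temporary file, the five example values, and the single-column matrix are all made up for illustration. It only shows that reading single-precision bytes with a reader that expects `num.genes` doubles returns too few values, and that assigning those values to a matrix column then fails.

```r
# Toy reproduction under stated assumptions; NOT the real quants_mat.gz layout.
num.genes <- 5

# Write 5 values as 4-byte single-precision floats (20 bytes in total).
tmp <- tempfile()
con <- file(tmp, "wb")
writeBin(c(1.5, 0, 2.25, 0, 4), con, size = 4, endian = "little")
close(con)

# Read them back as a dense double-precision parser would:
# num.genes doubles would need 40 bytes, but only 20 are available.
con <- file(tmp, "rb")
vals <- readBin(con, double(), endian = "little", n = num.genes)
close(con)
unlink(tmp)

# readBin() returns fewer than num.genes values, so the column
# assignment cannot fill all num.genes rows and raises an error.
mat <- matrix(0, nrow = num.genes, ncol = 1)
res <- tryCatch({
  mat[, 1] <- vals
  "ok"
}, error = function(e) conditionMessage(e))
print(res)
```

A parser for the new format would also have to handle the sparse encoding, not just the narrower value width, so this sketch explains the symptom rather than the fix.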

Hi @alexvpickering ,

Thanks for the issue.
We are still working on this; in the meantime, I will share a workable version soon.

Thanks for this report. We’ll take a look. @k3yavi and I have a meeting already for tomorrow to discuss the next iteration of the Alevin file format. And we can figure out how tximport needs to change to accommodate Salmon 0.14.

Hi @alexvpickering ,

Thanks again for raising the issue. As another user also requested, we have shared a basic (non-optimized) R parser for the alevin 0.14.0 output in the COMBINE-lab/salmon#380 thread. We are still working on optimizing the parser and integrating it with tximport, and will update you once it is stable.

Status update: I'm creating some test data for 0.14 so I can bring Avi's code into tximport.

I’ve got a little code, but I didn’t finish today and want to put in a unit test alongside the new code.

Thanks for notifying us of this issue so quickly.

Should be fixed by 6f761a7

You can obtain the new code with install_github("mikelove/tximport"). If you can test that it works on your end, I'll push it to release as well.

I just pushed 1.12.2 to the release branch as well, so it should be circulated tomorrow.

Thanks @mikelove !