Add support for multi-fasta files (given as mixtures) in screen
ndaniel commented
As it stands, it is not feasible to run `mash screen` on 100K mixtures, because one needs to launch mash 100K times. `mash dist` supports multi-fasta files but does not have a winner-take-all strategy.
Therefore it would be great if `mash screen` supported multi-fasta files as input, such that
mash screen queries.msh 1.fa
mash screen queries.msh 2.fa
mash screen queries.msh 3.fa
mash screen queries.msh 4.fa
could be given as
mash screen --multi-fasta-mixtures queries.msh total.fa
where
cat 1.fa 2.fa 3.fa 4.fa > total.fa
and files 1.fa, 2.fa, 3.fa, and 4.fa each contain one sequence.
Basically, adding support for this feature would make `mash screen` behave like "`mash dist` with winner-take-all strategy". Alternatively, this could be implemented by adding support for a winner-take-all strategy to `mash dist` (i.e. `mash dist --winner-take-all`).
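
To make the requested semantics concrete, here is a minimal Python sketch of what the proposed option would be equivalent to: each record of `total.fa` is treated as its own mixture and screened separately. The helper names (`split_fasta`, `screen_command`, `screen_each_record`) are hypothetical, and the sketch assumes that `-w` is the existing winner-take-all switch of `mash screen`. It still launches mash once per record, so it only illustrates the desired behavior and does not remove the overhead described above.

```python
import subprocess
import tempfile
from pathlib import Path


def split_fasta(text):
    """Split multi-FASTA text into a list of single-record strings."""
    records, current = [], []
    for line in text.splitlines():
        # A new header closes the record collected so far.
        if line.startswith(">") and current:
            records.append("\n".join(current) + "\n")
            current = []
        if line.strip():  # skip blank lines
            current.append(line)
    if current:
        records.append("\n".join(current) + "\n")
    return records


def screen_command(sketch, fasta_path, winner_take_all=True):
    """Build the argument list for one `mash screen` call (assumes -w exists)."""
    cmd = ["mash", "screen"]
    if winner_take_all:
        cmd.append("-w")
    cmd += [str(sketch), str(fasta_path)]
    return cmd


def screen_each_record(sketch, multi_fasta, run=subprocess.run):
    """Screen every record of a multi-FASTA file as a separate mixture.

    Returns a list of (record_name, mash_stdout) tuples. `run` is
    injectable so the function can be exercised without mash installed.
    """
    results = []
    text = Path(multi_fasta).read_text()
    with tempfile.TemporaryDirectory() as tmp:
        for i, record in enumerate(split_fasta(text), start=1):
            path = Path(tmp) / f"{i}.fa"
            path.write_text(record)
            done = run(screen_command(sketch, path),
                       capture_output=True, text=True, check=True)
            name = record.splitlines()[0][1:]  # header without ">"
            results.append((name, done.stdout))
    return results
```

With mash on the `PATH`, `screen_each_record("queries.msh", "total.fa")` would return one block of screen output per sequence, which is the per-mixture result a built-in option could produce in a single run.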