gmarcais/Jellyfish

jellyfish dump might miss some kmers?

YuanwenGuo opened this issue · 1 comment

Hi Jellyfish group,

I am using Jellyfish to count k-mers for a complete genome (~14G). I happened to notice that some k-mers that should be in the genome do not appear in the final text file produced by dump. I checked some of these k-mers with the query command, and they are present in the .jf file. After the dump command, however, they cannot be found in the output file.

My commands are as follows:

###count
CORES=30 #number of cores to use for blast searches
KMERSIZE=71
hashsize=30G
threads=10

${jellyfishDir}/jellyfish count -o *.jf -m ${KMERSIZE} -t ${CORES} -s ${hashsize} -t ${threads} *.fasta

###dump
${jellyfishDir}/jellyfish dump -U 10 -ct *.jf > *.dump.txt

Could you please help me to figure out what is happening?

Thank you!

You are passing -U 10 to dump. The -U switch sets an upper count, so any k-mer with a count above 10 is left out of the output. Could the k-mers you are missing be ones that occur more than 10 times?
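
To illustrate the effect of the cutoff, here is a small sketch that does not need Jellyfish itself. It builds a made-up table in the k-mer<TAB>count layout that dump -c -t writes, then uses awk to emulate an upper bound of 10. The k-mers, counts, and file names are invented for the example and are much shorter than the 71-mers in the question.

```shell
# Hypothetical dump output (k-mer<TAB>count) with no count filter applied.
printf 'ACGT\t3\nCGTA\t10\nGTAC\t11\nTACG\t250\n' > all.dump.txt

# Emulate an upper-count cutoff of 10: keep only rows whose count is <= 10.
awk -F'\t' -v upper=10 '$2 <= upper' all.dump.txt > filtered.dump.txt

# Collect the rows the cutoff removed; these are the "missing" k-mers.
awk -F'\t' -v upper=10 '$2 > upper' all.dump.txt > dropped.dump.txt

# Show the dropped rows (GTAC and TACG in this toy table).
cat dropped.dump.txt
```

On the real data, running query on one of the missing k-mers should show whether its count is above 10, and dumping again without -U (or with a larger value) should bring those k-mers back into the text file.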