marbl/seqrequester

seqrequester simulate missing end bases of chromosomes


Hi Brian,

Seqrequester simulation is very useful and easy to use. However, irrespective of the sequencing coverage I specify, it always leaves the first few and last few bases of each chromosome without coverage. I wouldn't expect this to happen in real sequencing. Can you please suggest why this may be happening? Is it related to the underlying algorithm being used? Is there any work-around for this?

--Chirag

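For those that didn't eavesdrop on our conversation (which would be everyone else in the world): This is caused by requiring a read to be a specific length, then choosing a place to extract the read from. A full-length read can only start in [0 .. sequenceLength-readLength], so the first and last bases of a sequence fall inside very few of the possible reads and end up with little or no coverage.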

I added a '-truncate' option that allows reads to be truncated by the end of the input sequence. This will cover the ends of sequences with reads, but those reads will be shorter than the length distribution wants them to be. Internally, I'm still picking a length first, but then choosing a start position for the read from [-readLength .. sequenceLength], compared to [0 .. sequenceLength-readLength] used before.
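
To make the difference between the two start-position ranges concrete, here is a minimal Python sketch of the idea. It is not the seqrequester code (which is C++). The names `sample_read` and `coverage` are hypothetical, the read length is fixed rather than drawn from a length distribution, and I'm assuming uniform start positions with both interval endpoints inclusive.

```python
import random

def sample_read(seq_len, read_len, truncate, rng):
    """Pick one read and return its half-open interval (begin, end) on the sequence.

    Hypothetical helper, not seqrequester's implementation. Assumes
    read_len <= seq_len and uniformly distributed start positions.
    """
    if truncate:
        # Start anywhere in [-read_len .. seq_len]; the read may hang off either end.
        start = rng.randint(-read_len, seq_len)
    else:
        # Start in [0 .. seq_len - read_len]; the read always fits entirely.
        start = rng.randint(0, seq_len - read_len)
    end = start + read_len
    # Clip to the sequence. Clipped reads are shorter than read_len, and a read
    # that starts at -read_len or at seq_len clips to nothing at all.
    return max(start, 0), min(end, seq_len)

def coverage(seq_len, read_len, n_reads, truncate, seed=1):
    """Per-base depth after sampling n_reads reads with a fixed seed."""
    rng = random.Random(seed)
    depth = [0] * seq_len
    for _ in range(n_reads):
        begin, end = sample_read(seq_len, read_len, truncate, rng)
        for i in range(begin, end):
            depth[i] += 1
    return depth

if __name__ == "__main__":
    full = coverage(1000, 100, 2000, truncate=False)
    trunc = coverage(1000, 100, 2000, truncate=True)
    print("depth at first base:", full[0], "(full-length)", trunc[0], "(truncated)")
    print("depth at last base: ", full[-1], "(full-length)", trunc[-1], "(truncated)")
```

Without truncation, the first base is covered only by reads that start exactly at position 0, so its depth stays near zero no matter how much coverage is requested. With truncation, any start in [-readLength+1 .. 0] covers it, which brings the ends much closer to the interior depth at the cost of some short reads.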