prepare_transcripts range width limit error
Opened this issue · 2 comments
sbresnahan commented
Version: isosceles v0.2.1
Running prepare_transcripts
with bulk ONT reads aligned to chr1 of GRCh38 with minimap2:
BAM <- "chr1.bam"
GTF <- "gencode.v45.primary_assembly.annotation.gtf"
GENOME <- "GCA_000001405.15_GRCh38_no_alt_analysis_set.fna"
bam_files <- c(Sample = BAM)
bam_parsed <- bam_to_read_structures(bam_files = bam_files,ncpu=4)
transcript_data <- prepare_transcripts(
gtf_file = GTF,
genome_fasta_file = GENOME,
bam_parsed = bam_parsed,
min_bam_splice_read_count = 2,
min_bam_splice_fraction = 0.01
)
Results in this error:
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'i' in selecting a method for function 'extractROWS': each range must have a width that is < 2^31
sbresnahan commented
This issue is resolved upstream by filtering reads longer than 2^31 with samtools view -e 'length(seq)<2147483648' -O BAM -o out.bam in.bam
, however it would nice if isosceles could do this length filtering on import.
timbitz commented
Thanks for pointing this out! We'll look into this-- I agree it would be nice not to have to hand filter for this.