eyzhao/hrdetect-pipeline

TAI score assumes no "chr" strings in segments.tsv


hrdscore/run_test.R is missing the usual gsub("chr","",...) call found elsewhere in your script files. As a result, if the chromosome names in segments.tsv carry the "chr" prefix, the TAI score won't be computed, because no segment will overlap the subtelomere regions.

From your hrdtools package:

subtelomeres <- GRanges(ddply(get_subtelomere_regions(), 'chromosome', function(z) {
return(data.frame(start = c(0, z$end), end = c(z$start, .Machine$integer.max)))
}))

with the default function get_subtelomere_regions <- function(chr_label = F) {...}

results in:

GRanges object with 52 ranges and 0 metadata columns:
seqnames ranges strand

[1] 1 [ 0, 10000] *
[2] 1 [249223345, 2147483647] *
[3] 10 [ 0, 82827] *
[4] 10 [135508492, 2147483647] *
[5] 11 [ 0, 116986] *
... ... ... ...
[48] 9 [141144172, 2147483647] *
[49] X [ 0, 182990] *
[50] X [155250482, 2147483647] *
[51] Y [ 0, 132990] *
[52] Y [ 59353488, 2147483647] *

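These ranges use bare chromosome names ("1", "10", "X", ...), so they will not overlap segments whose chromosome names carry the "chr" prefix. There is also the X/23 and Y/24 issue: segments labelled 23 or 24 will not match the X and Y seqnames above either.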
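Something along these lines, applied to the segments before the overlap, would cover both cases. This is only a sketch in base R: normalize_chromosome is a hypothetical helper, not something in hrdtools, and the column name chr for segments.tsv is my assumption.

```r
# Hypothetical helper (not part of hrdtools): map chromosome labels to the
# bare style used by the subtelomere ranges ("1".."22", "X", "Y").
normalize_chromosome <- function(chr) {
  # Drop a leading "chr" prefix, e.g. "chr1" -> "1", "chrX" -> "X"
  chr <- sub("^chr", "", as.character(chr), ignore.case = TRUE)
  # Map numeric sex-chromosome labels to letters
  chr[chr == "23"] <- "X"
  chr[chr == "24"] <- "Y"
  chr
}

# Toy stand-in for segments.tsv; the column name "chr" is an assumption
segs <- data.frame(chr = c("chr1", "chrX", "23", "24", "10"),
                   stringsAsFactors = FALSE)
segs$chr <- normalize_chromosome(segs$chr)
print(segs$chr)
```

With labels normalized this way, the segment chromosome names should line up with the seqnames of the subtelomere GRanges shown above.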

Also, could you provide the source or a reference for your subtelomere Rdata file?