Coverage function to turn df into Rle?
Closed this issue · 4 comments
endrebak commented
The original IRanges library has a function called coverage to turn interval data into an Rle. Will you consider implementing a coverage function?
Thanks for the library, btw.
phaverty commented
Hi Endre,
I'm glad you are benefitting from the package. My GenomicVectors.jl package
has a coverage method that returns an RLEVector. I think there is room for
optimization. Please let me know how it works for you.
Pete
Peter M. Haverty, Ph.D.
Genentech, Inc.
phaverty@gene.com
endrebak commented
Thanks. I searched the wrong repo then.
My intention is to translate it into Cython - I'll be sure to give you credit. I have an implementation myself, but it uses two heaps.
endrebak commented
Julia looks nice! I need to learn it to be able to understand how you can write a coverage function as simply as this, because I am guessing you are not instantiating a vector of chromosome length:
    function coverage(gr::AbstractGenomicVector)
        out = RLEVector(0, last(chr_ends(gr)))
        for (s,e) in eachrange(gr)
            out[s:e] += 1
        end
        out
    end
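For the Python/Cython port mentioned above, here is a minimal sketch of how coverage can be built as runs without ever allocating a vector of chromosome length. It is not the GenomicVectors.jl method: instead of the indexed `out[s:e] += 1` updates it swaps in a sweep over start/end events. The names `coverage_rle` and `_push` are hypothetical, and intervals are assumed to be 1-based and inclusive.

```python
from collections import defaultdict


def _push(values, lengths, value, n):
    """Append a run of `n` copies of `value`, merging with the previous run if equal."""
    if values and values[-1] == value:
        lengths[-1] += n
    else:
        values.append(value)
        lengths.append(n)


def coverage_rle(intervals, total_length):
    """Return coverage over positions 1..total_length as (values, lengths).

    `intervals` is an iterable of (start, end) pairs, 1-based and inclusive.
    Memory use scales with the number of intervals, not with total_length.
    """
    delta = defaultdict(int)  # position -> net change in depth at that position
    for start, end in intervals:
        if not 1 <= start <= end <= total_length:
            raise ValueError(f"interval out of bounds: {(start, end)}")
        delta[start] += 1      # depth rises at the first covered base
        delta[end + 1] -= 1    # and falls just past the last covered base

    values, lengths = [], []
    depth = 0
    pos = 1  # first position not yet written to the output
    for boundary in sorted(delta):
        if boundary > pos:
            _push(values, lengths, depth, boundary - pos)
            pos = boundary
        depth += delta[boundary]
    if pos <= total_length:
        _push(values, lengths, depth, total_length - pos + 1)
    return values, lengths


values, lengths = coverage_rle([(2, 5), (4, 8)], total_length=10)
print(values, lengths)
```

The sort makes this O(n log n) in the number of intervals. Whether that beats a two-heap approach in practice would need measuring.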
phaverty commented
Hi Endre,
The subtypes of AbstractGenomicVector contain a GenomeInfo object, which
contains the chromosome lengths (actually, the cumsum of the lengths). This
allows AbstractGenomicVector to store the genome base positions as the
offset from the first base of chr1. That way we can do (almost) everything
on one scale without worrying about chromosomes. And since Julia has 64-bit
ints, we can have one RLE that spans the genome. (R only has 32-bit signed
integers, which are too small to hold an index into the last few
chromosomes.)
Pete
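To illustrate the single-scale idea described above, here is a rough Python sketch of converting between per-chromosome positions and genome-wide offsets using the cumulative sum of chromosome lengths. It is not the GenomeInfo implementation: the function names are hypothetical, chromosomes are identified by a 0-based index, positions are 1-based, and the chromosome lengths are made-up toy values.

```python
import bisect
from itertools import accumulate

INT32_MAX = 2**31 - 1  # largest value a 32-bit signed integer can hold


def chr_ends(chr_lengths):
    """Cumulative sum of chromosome lengths: genome position of each chromosome's last base."""
    return list(accumulate(chr_lengths))


def to_genome_pos(chr_index, pos, ends):
    """Map (0-based chromosome index, 1-based position) to a 1-based genome-wide position."""
    offset = ends[chr_index - 1] if chr_index > 0 else 0
    return offset + pos


def from_genome_pos(gpos, ends):
    """Inverse of to_genome_pos: return (chromosome index, position within chromosome)."""
    chr_index = bisect.bisect_left(ends, gpos)  # first chromosome whose end is >= gpos
    offset = ends[chr_index - 1] if chr_index > 0 else 0
    return chr_index, gpos - offset


# Toy lengths, for illustration only.
ends = chr_ends([100, 150, 50])
gpos = to_genome_pos(1, 1, ends)        # first base of the second chromosome
print(gpos, from_genome_pos(gpos, ends))

# With chromosomes of a billion bases each (again made up), positions in the
# third chromosome no longer fit in a 32-bit signed integer.
big_ends = chr_ends([1_000_000_000] * 3)
print(to_genome_pos(2, 500_000_000, big_ends) > INT32_MAX)
```

Python integers are arbitrary precision, so the overflow concern only shows up here as a comparison. In a Cython port, the genome-wide position would need a 64-bit integer type.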