BiocPy/GenomicRanges

`_combine_GenomicRanges` and `_fast_combine_GenomicRanges` return wrong results

balwierz opened this issue · 2 comments

I really wish I could use this package, but the more bugs I try to fix, the more of them I find.

Consider this code:

from genomicranges import GenomicRanges
from iranges import IRanges

a = GenomicRanges("A", IRanges([0], [10]))
b = GenomicRanges("B", IRanges([5], [15]))
a.union(b)

The result is

GenomicRanges with 1 range and 0 metadata columns
    seqnames    ranges          strand
         
[0]        A    0 - 20               *
------
seqinfo(1 sequences): A

The problem is that `union` calls `_combine_GenomicRanges`, which does not handle seqnames properly.
In fact, there is no code for joining two np.ndarrays of seqname codes whose meanings differ (different SeqInfo, i.e. a different int -> chromosome name mapping).
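
To make the missing step concrete, here is a minimal sketch of remapping integer codes into a shared table before concatenating. It is not the package's code: the function `combine_encoded_seqnames` and its arguments are hypothetical, and it uses plain lists rather than np.ndarrays to stay self-contained.

```python
def combine_encoded_seqnames(codes_a, levels_a, codes_b, levels_b):
    """Concatenate two integer-encoded seqname vectors that use different tables.

    Hypothetical helper for illustration; not part of GenomicRanges.
    """
    # Build one merged table of names, keeping first-seen order.
    merged_levels = []
    index = {}
    for name in list(levels_a) + list(levels_b):
        if name not in index:
            index[name] = len(merged_levels)
            merged_levels.append(name)

    # Translate each code through its own table into the merged table.
    remapped_a = [index[levels_a[c]] for c in codes_a]
    remapped_b = [index[levels_b[c]] for c in codes_b]
    return remapped_a + remapped_b, merged_levels


# Mirrors the report: code 0 means "A" in one object and "B" in the other.
codes, levels = combine_encoded_seqnames([0], ["A"], [0], ["B"])
# codes == [0, 1], levels == ["A", "B"]

# Plain concatenation of the raw codes would give [0, 0], i.e. both ranges on "A".
naive = [0] + [0]
```

With the codes remapped this way, the two ranges stay on separate sequences, so a later merge step would have no reason to fuse them.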

Since other set operations depend on union, all of them will give wrong results.
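
For reference, this is roughly the behaviour I would expect from a union, assuming ranges are only ever merged within the same sequence. The sketch below works on plain `(seqname, start, end)` tuples with half-open ends and ignores strand; `union_by_seqname` is a hypothetical name, not a function from the package.

```python
from collections import defaultdict


def union_by_seqname(ranges_a, ranges_b):
    """Union of half-open (seqname, start, end) ranges, merged per sequence.

    Hypothetical helper for illustration; strand is ignored.
    """
    # Group intervals by sequence name so different sequences never mix.
    by_seq = defaultdict(list)
    for name, start, end in list(ranges_a) + list(ranges_b):
        by_seq[name].append((start, end))

    result = []
    for name in sorted(by_seq):
        merged = []
        for start, end in sorted(by_seq[name]):
            if merged and start <= merged[-1][1]:
                # Overlapping or adjacent: extend the previous interval.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        result.extend((name, s, e) for s, e in merged)
    return result


# Same coordinates as the example above: 0-10 on "A" and 5-20 on "B".
different_seqs = union_by_seqname([("A", 0, 10)], [("B", 5, 20)])
# different_seqs == [("A", 0, 10), ("B", 5, 20)]

# Only when both ranges are on the same sequence should they fuse.
same_seq = union_by_seqname([("A", 0, 10)], [("A", 5, 20)])
# same_seq == [("A", 0, 20)]
```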

Thank you for stress testing the package. I understand, and hopefully we'll resolve most of the issues soon. We've also been rewriting iranges and genomicranges, moving some of the functionality to C++ for speed.

We try to fix issues quickly as they get reported and appreciate your help in posting these.

v0.4.33 fixes this. Thanks again!