Problems with AtomicList in mcols in R-devel
plger commented
Happens only with R-devel:
> library(GenomicRanges)
> library(S4Vectors)
> library(IRanges)
> gr <- GRanges("chr1", IRanges(1:5, width=10))
> fl <- FactorList(lapply(1:5, FUN=function(x) sample(LETTERS,x)))
> fl
FactorList of length 5
[[1]] W
[[2]] P Q
[[3]] B V Y
[[4]] V M N Y
[[5]] T E K B O
> gr$fl <- fl
> gr
GRanges object with 5 ranges and 1 metadata column:
seqnames ranges strand | fl
<Rle> <IRanges> <Rle> | <FactorList>
[1] chr1 1-10 * |
[2] chr1 2-11 * |
[3] chr1 3-12 * |
[4] chr1 4-13 * |
[5] chr1 5-14 * |
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
> gr$fl
FactorList of length 5
Error in RangeNSBS(x, start = start, end = end, width = width) :
the specified range is out-of-bounds
This works fine:
> DataFrame(fl=fl)
DataFrame with 5 rows and 1 column
fl
<FactorList>
1 X
2 I,E
3 T,F,R
4 I,V,J,...
5 Y,N,F,...
This also works:
> fl <- FactorList(lapply(1:5, FUN=function(x) sample(LETTERS,x)), compress=FALSE)
> mcols(gr) <- NULL
> gr$fl <- fl
> gr
GRanges object with 5 ranges and 1 metadata column:
seqnames ranges strand | fl
<Rle> <IRanges> <Rle> | <FactorList>
[1] chr1 1-10 * | M
[2] chr1 2-11 * | G,K
[3] chr1 3-12 * | B,Y,Z
[4] chr1 4-13 * | G,A,L,...
[5] chr1 5-14 * | C,O,X,...
This suggests that the problem is related to compression. However, further downstream the list seems to be compressed again automatically, and I then get errors like:
Error in validObject(result) :
invalid class "CompressedFactorList" object:
improper partitioning
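One way to probe the compression hypothesis is to look at the class of the list before and after it is stored in `mcols`. The following is only a minimal sketch, assuming that `FactorList()` compresses by default (as the `CompressedFactorList` error above suggests); the variable names `fl_compressed` and `fl_simple` are made up for illustration.

```r
library(GenomicRanges)  # also attaches IRanges and S4Vectors

x <- lapply(1:5, function(n) sample(LETTERS, n))
fl_compressed <- FactorList(x)                  # assumed default: compress=TRUE
fl_simple     <- FactorList(x, compress=FALSE)  # explicitly uncompressed

# Class of each list before assignment
class(fl_compressed)
class(fl_simple)
is(fl_compressed, "CompressedList")
is(fl_simple, "CompressedList")

# Class of the column once it has been stored as a metadata column
gr <- GRanges("chr1", IRanges(1:5, width=10))
gr$fl <- fl_simple
class(mcols(gr)$fl)
```

If the last call reports a compressed class at some later step of a pipeline, that would point to where the list gets re-compressed.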
> sessionInfo()
R Under development (unstable) (2021-04-08 r80148)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS
Matrix products: default
BLAS: /home/pigerm/applications/R-devel/lib/libRblas.so
LAPACK: /home/pigerm/applications/R-devel/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_CH.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=de_CH.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=de_CH.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_CH.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] GenomicRanges_1.43.4 GenomeInfoDb_1.27.10 IRanges_2.25.7
[4] S4Vectors_0.29.15 BiocGenerics_0.37.1
loaded via a namespace (and not attached):
[1] zlibbioc_1.37.0 compiler_4.1.0 tools_4.1.0
[4] XVector_0.31.1 GenomeInfoDbData_1.2.4 RCurl_1.98-1.3
[7] bitops_1.0-6
plger commented
Bug was reproduced (by csoneson) on the following setup:
R Under development (unstable) (2021-03-29 r80130)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods
[9] base
other attached packages:
[1] GenomicRanges_1.43.4 GenomeInfoDb_1.27.11 IRanges_2.25.7 S4Vectors_0.29.15
[5] BiocGenerics_0.37.1
loaded via a namespace (and not attached):
[1] zlibbioc_1.37.0 compiler_4.1.0 XVector_0.31.1
[4] tools_4.1.0 GenomeInfoDbData_1.2.4 RCurl_1.98-1.3
[7] yaml_2.2.1 bitops_1.0-6
In contrast, the bug does not occur on this setup:
R Under development (unstable) (2021-04-05 r80145)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS
Matrix products: default
BLAS: /home/stephany/r-devel/R-devel/lib/libRblas.so
LAPACK: /home/stephany/r-devel/R-devel/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_CH.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=de_CH.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=de_CH.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_CH.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] GenomicRanges_1.43.4 GenomeInfoDb_1.27.8 IRanges_2.25.6
[4] S4Vectors_0.29.12 BiocGenerics_0.37.1
loaded via a namespace (and not attached):
[1] zlibbioc_1.37.0 compiler_4.1.0 tools_4.1.0
[4] XVector_0.31.1 GenomeInfoDbData_1.2.4 RCurl_1.98-1.3
[7] bitops_1.0-6
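This suggests that the bug is not strictly GenomicRanges-related (the version is 1.43.4 on all three setups), but perhaps comes from S4Vectors (0.29.15 on the failing setups, 0.29.12 on the working one)?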
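To compare setups without reading through the full `sessionInfo()` output, the versions of the packages involved can be collected in one call. This is just a sketch; the choice of packages in `pkgs` is an assumption based on the session listings above.

```r
# Packages that appear in the sessionInfo() listings above
pkgs <- c("BiocGenerics", "S4Vectors", "IRanges", "GenomicRanges")

# Named character vector: package name -> installed version
versions <- sapply(pkgs, function(p) as.character(packageVersion(p)))
versions
```

Running this on a failing and on a working machine gives two short vectors that can be compared side by side.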
hpages commented
Thanks @plger for the report and sorry for the delay. We're going to take a look at this ASAP.
hpages commented
Fixed in BiocGenerics 0.37.5. See Bioconductor/IRanges#38 for the details.
H.
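For anyone wanting to confirm the fix locally, a rough check is to verify the installed BiocGenerics version and re-run the original reproduction. This is a sketch only: the 0.37.5 threshold is taken from the comment above, and the `res` variable and the `tryCatch` wrapper are illustrative additions.

```r
library(GenomicRanges)

# The fix was reported for BiocGenerics 0.37.5
if (packageVersion("BiocGenerics") < "0.37.5") {
  message("BiocGenerics is older than 0.37.5; the bug may still be present")
}

# Re-run the reproduction from the original report
gr <- GRanges("chr1", IRanges(1:5, width=10))
gr$fl <- FactorList(lapply(1:5, function(x) sample(LETTERS, x)))

# Displaying the column is what failed; capture the error message if it still does
res <- tryCatch({ show(gr$fl); "ok" },
                error=function(e) conditionMessage(e))
res
```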