Bioconductor/IRanges

IRanges constructor with FactorLists

Closed this issue · 3 comments

Hi @hpages,

I have a test failure in my package plyranges on the dev branch that I think is caused by changes to IRanges. I'm not sure how to proceed, so any ideas would be really helpful. Basically, I have some helpers for constructing Ranges from DFrames, and one test case includes a column that is a FactorList. I've made a reprex of the problem below using the current dev version of IRanges:

suppressPackageStartupMessages(library(IRanges))

start <- 1:3 
width <- 2:4

grps <- FactorList("a", c("b", "c"), "d")

grps 
#> FactorList of length 3
#> [[1]] a
#> [[2]] b c
#> [[3]] d

# iranges constructor adding mcols
ir <- IRanges(start, width, grps = grps)
ir
#> IRanges object with 3 ranges and 1 metadata column:
#>           start       end     width |         grps
#>       <integer> <integer> <integer> | <FactorList>
#>   [1]         1         2         2 |             
#>   [2]         2         3         2 |             
#>   [3]         3         4         2 |
mcols(ir)$grps
#> FactorList of length 3
#> Error in RangeNSBS(x, start = start, end = end, width = width): the specified range is out-of-bounds

# iranges constructor without mcols
ir <- IRanges(start, width)
mcols(ir)[["grps"]] <- grps
ir
#> IRanges object with 3 ranges and 1 metadata column:
#>           start       end     width |         grps
#>       <integer> <integer> <integer> | <FactorList>
#>   [1]         1         2         2 |             
#>   [2]         2         3         2 |             
#>   [3]         3         4         2 |
mcols(ir)$grps
#> FactorList of length 3
#> Error in RangeNSBS(x, start = start, end = end, width = width): the specified range is out-of-bounds

Created on 2021-05-03 by the reprex package (v2.0.0)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.5 (2021-03-31)
#>  os       macOS Big Sur 10.16         
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_AU.UTF-8                 
#>  ctype    en_AU.UTF-8                 
#>  tz       Australia/Melbourne         
#>  date     2021-05-03                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version date       lib source
#>  backports      1.2.1   2020-12-09 [1] CRAN (R 4.0.2)
#>  BiocGenerics * 0.37.4  2021-05-03 [1] Github (Bioconductor/BiocGenerics@cced297)
#>  cli            2.5.0   2021-04-26 [1] CRAN (R 4.0.2)
#>  crayon         1.4.1   2021-02-08 [1] CRAN (R 4.0.2)
#>  digest         0.6.27  2020-10-24 [1] CRAN (R 4.0.2)
#>  ellipsis       0.3.2   2021-04-29 [1] CRAN (R 4.0.2)
#>  evaluate       0.14    2019-05-28 [1] CRAN (R 4.0.1)
#>  fansi          0.4.2   2021-01-15 [1] CRAN (R 4.0.2)
#>  fs             1.5.0   2020-07-31 [1] CRAN (R 4.0.2)
#>  glue           1.4.2   2020-08-27 [1] CRAN (R 4.0.2)
#>  highr          0.9     2021-04-16 [1] CRAN (R 4.0.2)
#>  htmltools      0.5.1.1 2021-01-22 [1] CRAN (R 4.0.2)
#>  IRanges      * 2.25.10 2021-05-03 [1] Github (Bioconductor/IRanges@a5258ca)
#>  knitr          1.33    2021-04-24 [1] CRAN (R 4.0.2)
#>  lifecycle      1.0.0   2021-02-15 [1] CRAN (R 4.0.2)
#>  magrittr       2.0.1   2020-11-17 [1] CRAN (R 4.0.2)
#>  pillar         1.6.0   2021-04-13 [1] CRAN (R 4.0.2)
#>  pkgconfig      2.0.3   2019-09-22 [1] CRAN (R 4.0.2)
#>  purrr          0.3.4   2020-04-17 [1] CRAN (R 4.0.2)
#>  reprex         2.0.0   2021-04-02 [1] CRAN (R 4.0.2)
#>  rlang          0.4.11  2021-04-30 [1] CRAN (R 4.0.2)
#>  rmarkdown      2.7     2021-02-19 [1] CRAN (R 4.0.2)
#>  S4Vectors    * 0.29.18 2021-05-03 [1] Github (Bioconductor/S4Vectors@7593108)
#>  sessioninfo    1.1.1   2018-11-05 [1] CRAN (R 4.0.2)
#>  stringi        1.5.3   2020-09-09 [1] CRAN (R 4.0.2)
#>  stringr        1.4.0   2019-02-10 [1] CRAN (R 4.0.2)
#>  styler         1.4.1   2021-03-30 [1] CRAN (R 4.0.2)
#>  tibble         3.1.1   2021-04-18 [1] CRAN (R 4.0.2)
#>  utf8           1.2.1   2021-03-12 [1] CRAN (R 4.0.2)
#>  vctrs          0.3.8   2021-04-29 [1] CRAN (R 4.0.2)
#>  withr          2.4.2   2021-04-18 [1] CRAN (R 4.0.2)
#>  xfun           0.22    2021-03-11 [1] CRAN (R 4.0.2)
#>  yaml           2.2.1   2020-02-01 [1] CRAN (R 4.0.2)
#> 
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

It seems to have something to do with compression? With compress = FALSE, the same code works:

suppressPackageStartupMessages(library(IRanges))

start <- 1:3 
width <- 2:4

grps <- FactorList("a", c("b", "c"), "d", compress = FALSE)

grps 
#> FactorList of length 3
#> [[1]] a
#> [[2]] b c
#> [[3]] d

# iranges constructor adding mcols
ir <- IRanges(start, width, grps = grps)
ir
#> IRanges object with 3 ranges and 1 metadata column:
#>           start       end     width |         grps
#>       <integer> <integer> <integer> | <FactorList>
#>   [1]         1         2         2 |            a
#>   [2]         2         3         2 |          b,c
#>   [3]         3         4         2 |            d
mcols(ir)$grps
#> FactorList of length 3
#> [[1]] a
#> [[2]] b c
#> [[3]] d

# iranges constructor without mcols
ir <- IRanges(start, width)
mcols(ir)[["grps"]] <- grps
ir
#> IRanges object with 3 ranges and 1 metadata column:
#>           start       end     width |         grps
#>       <integer> <integer> <integer> | <FactorList>
#>   [1]         1         2         2 |            a
#>   [2]         2         3         2 |          b,c
#>   [3]         3         4         2 |            d
mcols(ir)$grps
#> FactorList of length 3
#> [[1]] a
#> [[2]] b c
#> [[3]] d

Created on 2021-05-03 by the reprex package (v2.0.0)
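
For readers comparing the two reprexes above, here is a minimal sketch of how the compressed and uncompressed variants differ. It assumes IRanges is installed; the variable names `compressed` and `uncompressed` are only illustrative. A compressed list keeps all of its elements in one concatenated vector (here a single factor) plus a partitioning, whereas the uncompressed variant holds one factor per element. That difference may be why only the default, compressed FactorList was affected, though that link is an interpretation rather than something stated in this thread.

```r
suppressPackageStartupMessages(library(IRanges))

# FactorList() compresses by default; compress = FALSE keeps one factor per element
compressed   <- FactorList("a", c("b", "c"), "d")
uncompressed <- FactorList("a", c("b", "c"), "d", compress = FALSE)

# Check which representation each object uses
is(compressed, "CompressedList")
is(uncompressed, "CompressedList")

# The compressed variant is backed by a single concatenated factor
flat <- unlist(compressed, use.names = FALSE)
is.factor(flat)
length(flat)
```

In a test suite, checking `is(x, "CompressedList")` like this is one way to make sure both representations are exercised.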

Thanks @sa-lee. The problem was a long-standing bug (more than 15 years old) in updateObject(), which dropped the values of a factor:

library(BiocGenerics)
updateObject(factor(c("a", "b", "c")))
# Object of class "factor"
# factor(0)
# Levels: a b c

Fixed in BiocGenerics 0.37.5.
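
A downstream package could guard against a regression with a check along these lines. This is only a sketch: it assumes that, from BiocGenerics 0.37.5 on, updateObject() returns a factor with its values intact (the thread reports the fix but does not show the new output), and the variable names `f` and `updated` are illustrative.

```r
library(BiocGenerics)

f <- factor(c("a", "b", "c"))

# The fix is reported for BiocGenerics 0.37.5, so only check from that version on
if (packageVersion("BiocGenerics") >= "0.37.5") {
  updated <- updateObject(f)
  stopifnot(
    is.factor(updated),
    length(updated) == length(f),                       # values are no longer dropped
    identical(as.character(updated), as.character(f)),
    identical(levels(updated), levels(f))
  )
}
```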

H.

Thanks @hpages 😄