
Error when running the vignette example

If I follow the example in the vignette, I encounter this error:

# Add IDs
targets$IDFILE <- list.files(paste0(tempdir(), "/GSE70970/Data"))

library(NACHO)

Attaching package: 'NACHO'

The following object is masked from 'package:BiocGenerics':

    normalize

GSE70970_sum <- summarise(
  data_directory = paste0(tempdir(), "/GSE70970/Data"), # Where the data is
  ssheet_csv = targets, # The samplesheet
  id_colname = "IDFILE", # Name of the column that contains the identifiers
  housekeeping_genes = NULL, # Custom list of housekeeping genes
  housekeeping_predict = TRUE, # Predict the housekeeping genes based on the data?
  normalisation_method = "GEO", # Geometric mean or GLM
  n_comp = 5 # Number indicating the number of principal components to compute.
)
[NACHO] Importing RCC files.
Error: Column cols must be length 1 (the number of rows), not 3


I can't replicate your error, and the vignette compiled successfully, as you can see on the website.

Below is a full reproducible example of the code you mentioned. As you can see, I don't get your error. Please check the session information at the end.

library(GEOquery)
#> Loading required package: Biobase
#> Loading required package: BiocGenerics
#> Loading required package: parallel
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:parallel':
#>     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
#>     clusterExport, clusterMap, parApply, parCapply, parLapply,
#>     parLapplyLB, parRapply, parSapply, parSapplyLB
#> The following objects are masked from 'package:stats':
#>     IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>     anyDuplicated, append, as.data.frame, basename, cbind,
#>     colnames, dirname, do.call, duplicated, eval, evalq, Filter,
#>     Find, get, grep, grepl, intersect, is.unsorted, lapply, Map,
#>     mapply, match, mget, order, paste, pmax, pmax.int, pmin,
#>     pmin.int, Position, rank, rbind, Reduce, rownames, sapply,
#>     setdiff, sort, table, tapply, union, unique, unsplit, which,
#>     which.max, which.min
#> Welcome to Bioconductor
#>     Vignettes contain introductory material; view with
#>     'browseVignettes()'. To cite Bioconductor, see
#>     'citation("Biobase")', and for packages 'citation("pkgname")'.
#> Setting options('download.file.method.GEOquery'='auto')
#> Setting options('GEOquery.inmemory.gpl'=FALSE)
# Download data
gse <- getGEO("GSE70970")
#> Found 1 file(s)
#> GSE70970_series_matrix.txt.gz
#> Parsed with column specification:
#> cols(
#>   .default = col_double(),
#>   ID_REF = col_character()
#> )
#> See spec(...) for full column specifications.
#> File stored at:
#> /tmp/RtmpKA2y6S/GPL20699.soft
# Get phenotypes
targets <- pData(phenoData(gse[[1]]))
getGEOSuppFiles(GEO = "GSE70970", baseDir = tempdir())
#>                                                                    size
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                       1986560
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz     672
#>                                                                 isdir mode
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                       FALSE  644
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz FALSE  644
#>                                                                               mtime
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                       2019-11-15 11:25:23
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz 2019-11-15 11:25:24
#>                                                                               ctime
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                       2019-11-15 11:25:23
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz 2019-11-15 11:25:24
#>                                                                               atime
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                       2019-11-15 11:25:21
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz 2019-11-15 11:25:23
#>                                                                  uid gid
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                       1738  50
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz 1738  50
#>                                                                    uname
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                       mcanouil
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz mcanouil
#>                                                                 grname
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                        staff
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz  staff
# Unzip data
untar(
  tarfile = paste0(tempdir(), "/GSE70970/GSE70970_RAW.tar"),
  exdir = paste0(tempdir(), "/GSE70970/Data")
)
# Add IDs
targets$IDFILE <- list.files(paste0(tempdir(), "/GSE70970/Data"))

library(NACHO)
#> Attaching package: 'NACHO'
#> The following object is masked from 'package:BiocGenerics':
#>     normalize
GSE70970_sum <- summarise(
  data_directory = paste0(tempdir(), "/GSE70970/Data"), # Where the data is
  ssheet_csv = targets, # The samplesheet
  id_colname = "IDFILE", # Name of the column that contains the identifiers
  housekeeping_genes = NULL, # Custom list of housekeeping genes
  housekeeping_predict = TRUE, # Predict the housekeeping genes based on the data?
  normalisation_method = "GEO", # Geometric mean or GLM
  n_comp = 5 # Number indicating the number of principal components to compute.
)
#> [NACHO] Importing RCC files.
#> [NACHO] Performing QC and formatting data.
#> [NACHO] Searching for the best housekeeping genes.
#> [NACHO] Computing normalisation factors using "GEO" method for housekeeping genes prediction.
#> [NACHO] The following predicted housekeeping genes will be used for normalisation:
#>   - hsa-miR-103
#>   - hsa-let-7e
#>   - hsa-miR-1260
#>   - hsa-miR-500+hsa-miR-501-5p
#>   - hsa-miR-1274b
#> [NACHO] Computing normalisation factors using "GEO" method.
#> [NACHO] Missing values have been replaced with zeros for PCA.
#> [NACHO] Normalising data using "GEO" method with housekeeping genes.
#> [NACHO] Returning a list.
#>   $ access              : character
#>   $ housekeeping_genes  : character
#>   $ housekeeping_predict: logical
#>   $ housekeeping_norm   : logical
#>   $ normalisation_method: character
#>   $ remove_outliers     : logical
#>   $ n_comp              : numeric
#>   $ data_directory      : character
#>   $ pc_sum              : data.frame
#>   $ nacho               : data.frame
#>   $ outliers_thresholds : list
#>   $ raw_counts          : data.frame
#>   $ normalised_counts   : data.frame
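
The session information itself is not reproduced in this thread. As a minimal sketch of how it could be collected for comparison between two machines, the following uses only base R. The helper `report_versions()` is hypothetical, and the package list is an assumption about which versions are worth comparing.

```r
# Hypothetical helper: return the installed version of each package,
# or NA when a package is not installed.
report_versions <- function(pkgs) {
  vapply(
    pkgs,
    function(p) {
      if (requireNamespace(p, quietly = TRUE)) {
        as.character(utils::packageVersion(p))
      } else {
        NA_character_
      }
    },
    character(1)
  )
}

# R version plus the versions of the packages involved (assumed list).
print(R.version.string)
print(report_versions(c("NACHO", "GEOquery", "dplyr", "tidyr")))

# Full details of the current session, including attached packages.
print(sessionInfo())
```

Posting this output from both the working and the failing session makes version differences easy to spot.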

I restarted R and tried again, and now it works. Sorry, I don't know what went wrong the first time.

Best regards

gse <- getGEO("GSE70970")
Found 1 file(s)
trying URL ''
Content type 'application/x-gzip' length 351607 bytes (343 KB)
downloaded 343 KB

Parsed with column specification:
cols(
  .default = col_double(),
  ID_REF = col_character()
)
See spec(...) for full column specifications.
File stored at:

targets <- pData(phenoData(gse[[1]]))
getGEOSuppFiles(GEO = "GSE70970", baseDir = tempdir())
trying URL ''
Content type 'application/x-tar' length 1986560 bytes (1.9 MB)
downloaded 1.9 MB

trying URL ''
Content type 'application/x-gzip' length 672 bytes

downloaded 672 bytes

                                                                   size isdir mode               mtime               ctime
/tmp/RtmpQb9ReH/GSE70970/GSE70970_RAW.tar                       1986560 FALSE  664 2019-11-15 11:31:34 2019-11-15 11:31:34
/tmp/RtmpQb9ReH/GSE70970/GSE70970_characteristics_readme.txt.gz     672 FALSE  664 2019-11-15 11:31:35 2019-11-15 11:31:35
                                                                              atime  uid  gid     uname    grname
/tmp/RtmpQb9ReH/GSE70970/GSE70970_RAW.tar                       2019-11-15 11:31:32 1000 1000 sebastian sebastian
/tmp/RtmpQb9ReH/GSE70970/GSE70970_characteristics_readme.txt.gz 2019-11-15 11:31:34 1000 1000 sebastian sebastian


untar(
  tarfile = paste0(tempdir(), "/GSE70970/GSE70970_RAW.tar"),
  exdir = paste0(tempdir(), "/GSE70970/Data")
)

targets$IDFILE <- list.files(paste0(tempdir(), "/GSE70970/Data"))

library(NACHO)

Attaching package: ‘NACHO’

The following object is masked from ‘package:BiocGenerics’:

    normalize

GSE70970_sum <- summarise(
  data_directory = paste0(tempdir(), "/GSE70970/Data"), # Where the data is
  ssheet_csv = targets, # The samplesheet
  id_colname = "IDFILE", # Name of the column that contains the identifiers
  housekeeping_genes = NULL, # Custom list of housekeeping genes
  housekeeping_predict = TRUE, # Predict the housekeeping genes based on the data?
  normalisation_method = "GEO", # Geometric mean or GLM
  n_comp = 5 # Number indicating the number of principal components to compute.
)
[NACHO] Importing RCC files.
|========================================================================================================|100% ~0 s remaining
[NACHO] Performing QC and formatting data.
[NACHO] Searching for the best housekeeping genes.
[NACHO] Computing normalisation factors using "GEO" method for housekeeping genes prediction.
[NACHO] The following predicted housekeeping genes will be used for normalisation:
  - hsa-miR-103
  - hsa-let-7e
  - hsa-miR-1260
  - hsa-miR-500+hsa-miR-501-5p
  - hsa-miR-1274b
[NACHO] Computing normalisation factors using "GEO" method.
[NACHO] Missing values have been replaced with zeros for PCA.
[NACHO] Normalising data using "GEO" method with housekeeping genes.
[NACHO] Returning a list.
  $ access              : character
  $ housekeeping_genes  : character
  $ housekeeping_predict: logical
  $ housekeeping_norm   : logical
  $ normalisation_method: character
  $ remove_outliers     : logical
  $ n_comp              : numeric
  $ data_directory      : character
  $ pc_sum              : data.frame
  $ nacho               : data.frame
  $ outliers_thresholds : list
  $ raw_counts          : data.frame
  $ normalised_counts   : data.frame

Enjoy NACHO ;)

Hi Mcanouil,

I restarted R and ran the code again from a fresh session. Still the same error!

GSE70970_sum <- summarize(
  data_directory = paste0(tempdir(), "/GSE70970/Data"), # Where the data is
  ssheet_csv = targets, # The samplesheet
  id_colname = "IDFILE", # Name of the column that contains the identifiers
  housekeeping_genes = NULL, # Custom list of housekeeping genes
  housekeeping_predict = TRUE, # Predict the housekeeping genes based on the data?
  normalisation_method = "GEO", # Geometric mean or GLM
  n_comp = 5 # Number indicating the number of principal components to compute.
)

The error is:

[NACHO] Importing RCC files.
Error: Column cols must be length 1 (the number of rows), not 3

Are there any other solutions?
Thanks for the quick response.
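
When an example works in one session and fails in another, one thing worth ruling out is that a function of the same name from another attached package is being called. This is only a possibility to check, not a diagnosis of the error above. A small base R sketch, where `which_package()` is a hypothetical helper:

```r
# Hypothetical helper: report which package provides the function that R
# would actually call for a given name in the current session.
which_package <- function(fun_name) {
  fun <- get(fun_name, mode = "function")
  environmentName(environment(fun))
}

# Example with a function from the stats package:
print(which_package("sd"))

# In the session from this thread, one would run after library(NACHO):
# which_package("summarise")
# which_package("summarize")
```

If the result is not "NACHO", calling the function with an explicit namespace, for example `NACHO::summarise(...)`, removes the ambiguity.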
