This repository contains code for the paper "beachmat: a Bioconductor C++ API for accessing high-throughput biological data from a variety of R matrix types" by Lun et al. (2018).
The provided code will check the performance of different matrix types for row/column access, using simulated and real data sets. To run the tests on your machine, please read the following instructions.
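To make the row/column distinction concrete, the following is a minimal C++ sketch of both access patterns on a column-major dense matrix, which is how an ordinary R matrix is laid out. It does not use the beachmat API and is not taken from this repository; `DenseMatrix`, `column_sum`, `row_sum`, `sum_by_columns`, `sum_by_rows` and `time_ms` are hypothetical names chosen for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical column-major dense matrix: element (r, c) is stored at
// index r + c * nrow, as in an ordinary R matrix. All values start at 1.
struct DenseMatrix {
    std::size_t nrow, ncol;
    std::vector<double> values;
    DenseMatrix(std::size_t nr, std::size_t nc)
        : nrow(nr), ncol(nc), values(nr * nc, 1.0) {}
};

// Column access: the nrow elements of a column are contiguous in memory.
double column_sum(const DenseMatrix& m, std::size_t c) {
    const double* start = m.values.data() + c * m.nrow;
    double total = 0;
    for (std::size_t r = 0; r < m.nrow; ++r) total += start[r];
    return total;
}

// Row access: consecutive elements of a row are nrow doubles apart.
double row_sum(const DenseMatrix& m, std::size_t r) {
    double total = 0;
    for (std::size_t c = 0; c < m.ncol; ++c) total += m.values[r + c * m.nrow];
    return total;
}

// Visit every element once, one column at a time.
double sum_by_columns(const DenseMatrix& m) {
    double total = 0;
    for (std::size_t c = 0; c < m.ncol; ++c) total += column_sum(m, c);
    return total;
}

// Visit every element once, one row at a time.
double sum_by_rows(const DenseMatrix& m) {
    double total = 0;
    for (std::size_t r = 0; r < m.nrow; ++r) total += row_sum(m, r);
    return total;
}

// Wall-clock time of a callable, in milliseconds.
template <class F>
double time_ms(F&& f) {
    auto t0 = std::chrono::steady_clock::now();
    f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Calling `time_ms([&] { sum_by_rows(m); })` and `time_ms([&] { sum_by_columns(m); })` on a large `DenseMatrix m` gives two timings in milliseconds. Both traversals touch the same elements, so any difference comes from the memory access pattern alone.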
- Install beachmat from Bioconductor.
- Enter `timings` and run `R CMD INSTALL --clean package`. This requires installation of RcppArmadillo and RcppEigen.
- `timings/` contains scripts for timings (in milliseconds) for accessing data from different matrix representations.
- `timings/chunking/` contains scripts for timing rechunking, as well as checking the chunk cache logic.
- `memory/` contains scripts for memory usage for different matrix representations.
- `miscellaneous` contains scripts to compare timings to R, and to verify the no-copy access method of RcppArmadillo and RcppEigen.
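As a rough guide to why memory usage differs between representations, the sketch below estimates the payload size of a dense double-precision matrix and of a compressed sparse column matrix (the layout of `dgCMatrix` in the Matrix package: one double and one 32-bit row index per nonzero entry, plus `ncol + 1` 32-bit column pointers). This is an assumption-laden back-of-envelope calculation, not the measurement performed by the scripts in `memory/`. It ignores R object headers and attributes, and `dense_bytes` and `csc_bytes` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Payload of a dense matrix: one double per element.
std::size_t dense_bytes(std::size_t nrow, std::size_t ncol) {
    return nrow * ncol * sizeof(double);
}

// Payload of a compressed sparse column matrix:
// a value and a row index per nonzero, plus ncol + 1 column pointers.
std::size_t csc_bytes(std::size_t ncol, std::size_t nnz) {
    return nnz * (sizeof(double) + sizeof(std::int32_t))
         + (ncol + 1) * sizeof(std::int32_t);
}
```

With 8-byte doubles, a 1000 x 100 matrix with 5000 nonzero entries needs about 800 kB dense but about 60 kB in this sparse layout. Ignoring the column pointers, the sparse layout is the smaller of the two whenever fewer than about two-thirds of the entries are nonzero.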
Enter `real/zeisel` and download the count matrix for the Zeisel data set.
- Execute the `zeisel_time.R` script to generate timings (in milliseconds) for matrix access to this data. This will also determine memory usage for each matrix representation.
- Execute the `detection_stats.R` script to generate timings (in milliseconds) for computing various cell- or gene-based statistics from this data.
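The statistics computed by `detection_stats.R` are not listed here. As an assumed example of a cell-based and a gene-based statistic, the C++ sketch below counts nonzero entries per column and per row of a compressed sparse column matrix, taking rows to be genes and columns to be cells (the usual convention for count matrices, but an assumption about this data). It does not use the beachmat API; `SparseColumns`, `detected_per_column` and `detected_per_row` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical compressed sparse column matrix: column c owns entries
// p[c] .. p[c + 1] - 1 of `i` (row indices) and `x` (values).
struct SparseColumns {
    std::size_t nrow, ncol;
    std::vector<std::size_t> p;
    std::vector<std::size_t> i;
    std::vector<double> x;
};

// Per-column (assumed: per-cell) count of nonzero entries.
// Each column's entries are contiguous, so this is a single ordered pass.
std::vector<int> detected_per_column(const SparseColumns& m) {
    std::vector<int> out(m.ncol, 0);
    for (std::size_t c = 0; c < m.ncol; ++c) {
        for (std::size_t k = m.p[c]; k < m.p[c + 1]; ++k) {
            if (m.x[k] != 0) ++out[c];  // skip explicitly stored zeros
        }
    }
    return out;
}

// Per-row (assumed: per-gene) count of nonzero entries.
// Rows are not contiguous in this layout, so counts are scattered by row index.
std::vector<int> detected_per_row(const SparseColumns& m) {
    std::vector<int> out(m.nrow, 0);
    for (std::size_t k = 0; k < m.x.size(); ++k) {
        if (m.x[k] != 0) ++out[m.i[k]];
    }
    return out;
}
```

Both functions make one pass over the stored entries. The per-row version avoids extracting each row separately, which would be expensive in a column-oriented layout.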
Enter `real/10X` and install TENxBrainData.
Read the `README.md` file for the order in which to evaluate the various Rmarkdown scripts.