slowikj/seqR

Final k-mer frequency result format

slowikj opened this issue · 1 comment

Currently, an Rcpp::IntegerMatrix is returned, although the matrix will be sparse in most cases.

simple_triplet_matrix is an alternative solution. However, it also requires generating the full dense matrix first, which wastes both CPU time and RAM.

Another option is to use Rcpp::Modules to expose a custom sparse data structure.

A prototype using RcppModules is implemented on a separate branch, feature/rcpp-modules.
Because of the additional complexity and limitations of RcppModules (for example, unlike R objects, C++ classes exported via RcppModules can't be serialized and deserialized between sessions; see http://dirk.eddelbuettel.com/code/rcpp/Rcpp-modules.pdf), I decided not to use this feature.

It turns out that a simple_triplet_matrix can be created without first generating the full matrix, by calling the constructor with four arguments: row indices, column indices, values, and dimnames.
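
To illustrate the idea, here is a minimal sketch in plain C++ (standard library only, no Rcpp) of collecting k-mer counts directly as triplets, so that no dense matrix is ever allocated. It is not the seqR implementation: the `TripletCounts` struct and the `count_kmers_sparse` function are hypothetical names made up for this example, and contiguous k-mers over plain strings are assumed. The resulting vectors correspond to the row indices, column indices, values, and column dimnames that would then be handed to the simple_triplet_matrix constructor on the R side.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical container mirroring the triplet layout:
// one (i, j, v) entry per non-zero cell of the count matrix.
struct TripletCounts {
    std::vector<int> i;                 // 1-based row indices (sequence number)
    std::vector<int> j;                 // 1-based column indices (k-mer number)
    std::vector<int> v;                 // non-zero counts
    int nrow = 0;                       // number of sequences
    int ncol = 0;                       // number of distinct k-mers seen
    std::vector<std::string> colnames;  // k-mer string for each column
};

// Counts contiguous k-mers per sequence and stores only non-zero counts.
TripletCounts count_kmers_sparse(const std::vector<std::string>& sequences,
                                 std::size_t k) {
    assert(k > 0);
    TripletCounts out;
    out.nrow = static_cast<int>(sequences.size());
    std::map<std::string, int> column_of;  // k-mer -> 1-based column index

    for (std::size_t s = 0; s < sequences.size(); ++s) {
        const std::string& seq = sequences[s];
        std::map<int, int> row_counts;  // column index -> count in this sequence

        for (std::size_t pos = 0; pos + k <= seq.size(); ++pos) {
            const std::string kmer = seq.substr(pos, k);
            auto it = column_of.find(kmer);
            if (it == column_of.end()) {
                // First occurrence of this k-mer: assign the next column.
                out.colnames.push_back(kmer);
                it = column_of.emplace(
                    kmer, static_cast<int>(out.colnames.size())).first;
            }
            ++row_counts[it->second];
        }

        // Emit one triplet per non-zero cell of this row.
        for (const auto& entry : row_counts) {
            out.i.push_back(static_cast<int>(s) + 1);
            out.j.push_back(entry.first);
            out.v.push_back(entry.second);
        }
    }
    out.ncol = static_cast<int>(out.colnames.size());
    return out;
}
```

For example, calling `count_kmers_sparse({"ACGA", "ACAC"}, 2)` yields five triplets for a 2 x 4 matrix, instead of eight dense cells. Memory use grows with the number of non-zero counts rather than with the full matrix size, which is the point of skipping the dense intermediate step.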