R bindings to fast_matrix_market for reading and writing .mtx files

fastMatMR

About

fastMatMR provides R bindings for reading and writing Matrix Market files using the high-performance fast_matrix_market C++ library (version 1.7.4).

Why?

Matrix Market files are crucial to much of the data-science ecosystem. The fastMatMR package focuses on high-performance read and write operations for Matrix Market files, serving as a key tool for data extraction in computational and data science pipelines.

The target audience is primarily data scientists and researchers developing numerical methods who may wish to test their methods on the standard NIST (National Institute of Standards and Technology) Matrix Market data, which supports:

comparative studies of algorithms for numerical linear algebra, featuring nearly 500 sparse matrices from a variety of applications, as well as matrix generation tools and services.

Additionally, supporting the Matrix Market file format makes it easier to interface R analyses with those in Python (e.g. SciPy uses the same underlying C++ library). These files can also be used with the Tensor Algebra Compiler (TACO).
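For readers unfamiliar with the format, here is a minimal sketch of a Matrix Market file in coordinate (sparse) layout, written by hand from base R and read back with `Matrix::readMM`. It does not use fastMatMR at all; the matrix values and the temporary file path are illustrative assumptions, chosen to match the 2 x 2 matrix used in the Quick Example.

```r
# A coordinate-format Matrix Market file has three parts:
#   1. a header line declaring the object, layout, field and symmetry,
#   2. a size line: rows, columns, number of stored entries,
#   3. one "row column value" line per stored entry (indices are 1-based).
mtx_lines <- c(
  "%%MatrixMarket matrix coordinate real general",
  "2 2 3",
  "1 1 1",
  "1 2 3",
  "2 2 2"
)

# Illustrative temporary path, not a file shipped with any package.
path <- tempfile(fileext = ".mtx")
writeLines(mtx_lines, path)

# Read the file with the Matrix package and densify it for inspection.
sp <- Matrix::readMM(path)
dense <- as.matrix(sp)
print(dense)
```

Because the file is plain text with a self-describing header, the same file can be passed to any reader that implements the format.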

Features

  • Extended Support: fastMatMR supports standard R vectors and matrices, as well as sparse objects from the Matrix package.

  • Performance: The package is a thin wrapper around one of the fastest C++ libraries for reading and writing .mtx files.

  • Correctness: Unlike Matrix, roundtripping with NA and NaN values works by coercing to NaN instead of to arbitrarily high numbers.

We have vignettes for both read and write operations to demonstrate the performance claims.

Alternatives and statement of need

  • The Matrix package allows reading and writing sparse matrices in the .mtx (matrix market) format.
    • However, for .mtx files, it can only handle sparse matrices for writing and reading.
    • Round-tripping (writing and subsequently reading) data with NA and NaN values produces arbitrarily high numbers instead of preserving NaN or handling NA.

Installation

CRAN

For the latest CRAN version:

install.packages("fastMatMR")

R-Universe

For the latest development version of fastMatMR:

install.packages("fastMatMR",
                 repos = "https://ropensci.r-universe.dev")

Development Git

For the latest commit, one can use:

# install.packages("devtools")
devtools::install_github("ropensci/fastMatMR")

Quick Example

library(fastMatMR)
spmat <- Matrix::Matrix(c(1, 0, 3, 2), nrow = 2, sparse = TRUE)
write_fmm(spmat, "sparse.mtx")
fmm_to_sparse_Matrix("sparse.mtx")

The resulting .mtx file is language agnostic and can, for example, be read back in Python:

pip install fast_matrix_market
python -c 'import fast_matrix_market as fmm; print(fmm.read_array_or_coo("sparse.mtx"))'
((array([1., 3., 2.]), (array([0, 0, 1], dtype=int32), array([0, 1, 1], dtype=int32))), (2, 2))
python -c 'import fast_matrix_market as fmm; print(fmm.read_array("sparse.mtx"))'
array([[1., 3.],
       [0., 2.]])

Similarly, fastMatMR supports writing and reading from other R objects (e.g. standard R vectors and matrices), as seen in the getting started vignette.
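As a hedged sketch of the dense case, the snippet below writes a base R matrix with `write_fmm` and reads it back. The reader name `fmm_to_mat` is an assumption here, by analogy with `fmm_to_sparse_Matrix` above; please check the getting started vignette for the exact reader functions. The temporary file path is illustrative.

```r
library(fastMatMR)

# A small dense base R matrix (filled column-wise).
dense <- matrix(c(1, 2, 3, 4), nrow = 2)

# Illustrative temporary path for the round trip.
dense_path <- tempfile(fileext = ".mtx")

# write_fmm is the generic writer shown in the Quick Example.
write_fmm(dense, dense_path)

# ASSUMPTION: fmm_to_mat is the dense-matrix reader; confirm in the vignette.
dense_back <- fmm_to_mat(dense_path)
print(dense_back)
```

If the round trip behaves as described, `dense_back` should hold the same values as `dense`.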

Contributing

Contributions are very welcome. Please see the Contribution Guide and our Code of Conduct.

License

This project is licensed under the MIT License.

Logo

The logo was generated via a non-commercial use prompt on hotpot.ai, in both blue and green, as a riff on the NIST Matrix Market logo. The text was added in presentation software (WPS Presentation). Hexagonal cropping was accomplished in a hexb compatible design using hexsticker.