Converter matrix for a range of array formats (backends) in Python, focusing on sparse arrays.
This library is targeted at projects that want to support a wide range of array formats as input, output or for calculations. All array libraries already support format detection, creation and export from and to various formats, but they do so with different APIs, different sets of formats and different sets of supported features, such as dtypes, shapes and device classes.
As an example, efficient conversion from sparse.COO to cupyx.scipy.sparse.coo_matrix can be done via cupyx.scipy.sparse.coo_matrix(sparse.COO.to_scipy_sparse()).
However, both scipy.sparse.coo_matrix and cupyx.scipy.sparse.coo_matrix only support 2D arrays. On top of that, cupyx.scipy.sparse.coo_matrix only supports floating point dtypes and bool.
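The following is a minimal sketch of the kind of preparation such a conversion needs, using only NumPy and SciPy on the CPU. The helper names `to_2d_float_coo` and `from_2d_coo` are hypothetical and not part of this project; the choice to flatten all axes after the first and to cast integers to float64 is an assumption made for illustration.

```python
import math

import numpy as np
import scipy.sparse as sp


def to_2d_float_coo(arr):
    """Pack an nD array into a 2D scipy.sparse.coo_matrix (hypothetical helper).

    Returns the matrix together with the original shape so that the
    conversion can be reversed later.
    """
    arr = np.asarray(arr)
    original_shape = arr.shape
    # Assumption: integer data is cast to float64 because the target
    # format only accepts floating point dtypes and bool.
    if arr.dtype.kind in "iu":
        arr = arr.astype(np.float64)
    # Keep the first axis and flatten all remaining axes into one.
    rows = arr.shape[0] if arr.ndim > 0 else 1
    cols = math.prod(arr.shape[1:])
    flat = arr.reshape((rows, cols))
    return sp.coo_matrix(flat), original_shape


def from_2d_coo(mat, original_shape):
    """Restore a dense array with the original nD shape."""
    return mat.toarray().reshape(original_shape)


data = np.zeros((2, 3, 4), dtype=np.int32)
data[1, 2, 3] = 5

mat, shape = to_2d_float_coo(data)
restored = from_2d_coo(mat, shape)
```

Note that the restored array keeps the float dtype; casting back to the original dtype would be a separate step.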
This project creates a unified API for all conversions between the supported formats and takes care of details such as using an efficient intermediate format, reshaping and dtype conversion.
- Supports Python 3.6 - 3.10
- Defines constants for format identifiers
- Various sets to group formats into categories:
  - Dense vs sparse
  - CPU vs CuPy-based
  - nD vs 2D backends
- Efficiently detect format of arrays, including support for subclasses
- Get converter function for a pair of formats
- Convert to a target format
- Find most efficient conversion pair for a range of possible inputs and/or outputs
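
To illustrate the ideas in the list above, here is a toy converter matrix written with the standard library only. It is a sketch of the general technique, not this project's implementation: the format constants, the cost values and the functions `detect_format`, `lookup_converter`, `convert` and `best_pair` are all hypothetical, and built-in containers stand in for real array formats.

```python
from array import array
from itertools import product

# Hypothetical format identifiers; built-in containers stand in for arrays.
LIST = "list"
TUPLE = "tuple"
ARRAY = "array"
FORMATS = (LIST, TUPLE, ARRAY)

_BASE_TYPES = {list: LIST, tuple: TUPLE, array: ARRAY}
_detect_cache = {}


def detect_format(obj):
    """Return the format identifier of obj, accepting subclasses."""
    cls = type(obj)
    if cls not in _detect_cache:
        # Walk the method resolution order so subclasses are recognized.
        for base in cls.__mro__:
            if base in _BASE_TYPES:
                _detect_cache[cls] = _BASE_TYPES[base]
                break
        else:
            raise TypeError(f"Unsupported type {cls!r}")
    return _detect_cache[cls]


def _to_array(seq):
    return array("d", seq)


# The converter matrix: (source, target) -> (function, made-up relative cost).
CONVERTERS = {
    (LIST, TUPLE): (tuple, 1),
    (TUPLE, LIST): (list, 1),
    (LIST, ARRAY): (_to_array, 2),
    (TUPLE, ARRAY): (_to_array, 2),
    (ARRAY, LIST): (list, 2),
    (ARRAY, TUPLE): (tuple, 2),
}
for fmt in FORMATS:
    # Converting a format to itself is free and returns the input unchanged.
    CONVERTERS[(fmt, fmt)] = (lambda x: x, 0)


def lookup_converter(source, target):
    """Get the converter function for a pair of formats."""
    return CONVERTERS[(source, target)][0]


def convert(obj, target):
    """Detect the format of obj and convert it to the target format."""
    return lookup_converter(detect_format(obj), target)(obj)


def best_pair(sources, targets):
    """Pick the (source, target) combination with the lowest cost."""
    return min(product(sources, targets), key=lambda pair: CONVERTERS[pair][1])


class MyList(list):
    """Subclass used to show that detection follows inheritance."""


detected = detect_format(MyList([1, 2]))
as_list = convert((1, 2), LIST)
pair = best_pair([LIST, TUPLE], [ARRAY, TUPLE])
```

Caching the detected format per class keeps repeated detection cheap, and storing a cost next to each converter is what makes it possible to choose among several acceptable input and output formats.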
numpy.ndarray
numpy.matrix -- to support result of aggregation operations on scipy.sparse matrices
cupy.ndarray
sparse.COO
sparse.GCXS
sparse.DOK
scipy.sparse.coo_matrix
scipy.sparse.csr_matrix
scipy.sparse.csc_matrix
cupyx.scipy.sparse.coo_matrix
cupyx.scipy.sparse.csr_matrix
cupyx.scipy.sparse.csc_matrix
- cupyx.sparse formats with dtype bool
- PyTorch arrays
- SciPy sparse arrays as opposed to SciPy sparse matrices.
This project is developed primarily for sparse data support in LiberTEM. For that reason it includes the backend CUDA, which denotes a NumPy array that is intended for execution on a CUDA device.