This crate does in Rust what Python's msgpack-numpy does, and a lot faster. It serializes and deserializes NumPy scalars and arrays to and from the MessagePack format, using the same serialized format as its Python counterpart, so the two can interoperate. This enables processing NumPy arrays in a separate Rust service through IPC, or saving Machine Learning results to disk (best paired with compression).
- It supports `bool`, `u8`, `i8`, `u16`, `i16`, `f16` (through the `half` crate), `u32`, `i32`, `f32`, `u64`, `i64`, `f64`.
- No support for arrays with complex numbers (`'c'`), byte strings (`'S'`), unicode strings (`'U'`), or other non-primitive types as elements. No support for structured/tuple data types (`'V'`), or object-type data that need pickling (`'O'`) (ref).
- However, during deserialization, we allow unsupported types to be deserialized as the `Unsupported` variant. This ensures deserialization can continue and the supported portions of data can be used.
- Scalars and arrays are represented as separate types, each an enum with one variant per element type. They come with convenient conversion methods (backed by the `num-traits` crate) to the desired target primitive types. Example: `f16`, `f32`, `f64` can all be converted to `f64`, or to `f16` with loss. This allows flexibility during deserialization, without explicit pattern matching and conditional conversion. It is similar to NumPy's `.astype(np.float64)` / `.astype(np.float16)`. Notably, `bool` is convertible to numeric types as `(0, 1)`, but not from numeric types using these methods. Of course, you can do your own conversion after matching on the `Bool` variant.
- Arrays use the `ndarray` crate, and have dynamic shapes. This enables users to leverage Rust's numeric ecosystem for the deserialized arrays.
- Array handling using `CowNDArray` can be zero-copy when array buffers in the serialized slice are well aligned, although MessagePack doesn't guarantee this.
- It depends on `serde`. In addition, it makes sense to use a correct MessagePack implementation, such as `rmp-serde`, which is used in the examples below, although it doesn't need to be a dependency, due to `serde`'s design.
There hasn't been consensus on a good format that is both flexible and efficient for serializing NumPy arrays. They are unique in that they are blocks of bytes in nature, but also have numeric types and shapes. Programmers working on Machine Learning problems found MessagePack to have interesting properties. It is compact with a type system, and has a wide range of language support. The package msgpack-numpy provides de-/serialization for NumPy arrays, standalone or enclosed in arbitrary organizational depths, to be sent over the network, or saved to disk, in a compact format.
If you are looking for a more production-oriented, performant format, you might consider Apache Arrow, Parquet, or Protocol Buffers. However, these formats are not as flexible as MessagePack when you need to store intermediate Machine Learning results. In practice, MessagePack with NumPy array support can be quite a good choice for many of these use cases.
This Rust version aims to provide a faster alternative to the Python version, with the same serialized format as its Python counterpart so the two can interoperate. You could use this as a building block for your own Machine Learning pipeline in Rust, or as a way to communicate between Python and Rust.
use std::fs::File;
use std::io::Read;

use msgpack_numpy::NDArray;

fn main() {
    // Read the whole serialized file into memory.
    let filepath = "tests/data/ndarray_bool.msgpack";
    let mut file = File::open(filepath).unwrap();
    let mut buf = Vec::new();
    file.read_to_end(&mut buf).unwrap();

    // Deserialize into the owned array enum.
    let deserialized: NDArray = rmp_serde::from_slice(&buf).unwrap();

    // Inspect the element type by matching on the variant.
    match &deserialized {
        NDArray::Bool(array) => {
            println!("{:?}", array);
        }
        _ => panic!("Expected NDArray::Bool"),
    }

    // returns an Option, None if conversion is not possible
    let arr = deserialized.into_u8_array().unwrap();
    println!("{:?}", arr);
}
Please see more in `examples/`.
All benchmarks were done with 1 CPU core on an Ubuntu 22.04 instance. CPUs: Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz. The Rust version was compiled in release mode. We are only benchmarking the serialization and deserialization of arrays, in memory. See `benches/` for the benchmark code.
This applies to the owned `NDArray`.
Array Type | Array Size | Arrays | Operation | Python (ms) | Rust (ms) | Speedup |
---|---|---|---|---|---|---|
f32 | 1000 | 10000 | Serialize | 56.4 | 17.1 | 3.3x |
f32 | 1000 | 10000 | Deserialize | 26.1 | 18.9 | 1.4x |
f32 | 100 | 100000 | Serialize | 226.1 | 27.1 | 8.3x |
f32 | 100 | 100000 | Deserialize | 199.3 | 50.5 | 3.9x |
f16 | 1000 | 10000 | Serialize | 33.5 | 4.0 | 8.5x |
f16 | 1000 | 10000 | Deserialize | 21.2 | 5.2 | 4.1x |
f16 | 100 | 100000 | Serialize | 198.9 | 12.1 | 16.5x |
f16 | 100 | 100000 | Deserialize | 195.2 | 29.5 | 6.6x |
The Rust implementation shows significant performance improvements over Python in all cases, with particularly dramatic speedups for small array serialization. The Python version's de-/serialization logic is written in C through NumPy, but small arrays reduce this benefit because each array is a Python object. Notably, the Python version deserializes faster than serializing, while the Rust version serializes faster than deserializing. This range of array sizes is typical for Machine Learning use cases, such as feature embeddings, so Rust will be able to help out when performance is needed.
For the above arrays, the array buffers always seem to be misaligned during deserialization, so we can't just borrow the data from the serialized slice as the targeted typed array, but instead pay for extra allocation. This is because the MessagePack format doesn't guarantee alignment.
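To make the alignment constraint concrete, here is a minimal standalone sketch of the check involved. The helper names `is_aligned_for` and `aligned_offsets` are hypothetical and not part of this crate. The sketch assumes a 2-byte element such as `f16` has 2-byte alignment (`u16` stands in for it here) and that `f32` has 4-byte alignment, which holds on mainstream targets. It counts how many starting offsets inside a byte buffer would allow a typed view.

```rust
use std::mem::align_of;

/// Hypothetical helper: true if `bytes` starts at an address aligned for `T`.
fn is_aligned_for<T>(bytes: &[u8]) -> bool {
    (bytes.as_ptr() as usize) % align_of::<T>() == 0
}

/// Count how many of the first `n` byte offsets into `buf` are aligned for `T`.
fn aligned_offsets<T>(buf: &[u8], n: usize) -> usize {
    (0..n)
        .filter(|&off| is_aligned_for::<T>(&buf[off..]))
        .count()
}

fn main() {
    let buf = vec![0u8; 64];
    // u16 stands in for a 2-byte float type (assumption: same alignment).
    println!("2-byte elements: {} of 16 offsets aligned", aligned_offsets::<u16>(&buf, 16));
    println!("4-byte elements: {} of 16 offsets aligned", aligned_offsets::<f32>(&buf, 16));
}
```

Where an array payload starts inside a MessagePack message depends on the headers and fields written before it, so the deserializer has to perform a check of this kind at run time rather than assume alignment.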
In most cases, however, there is a good chance of alignment, and we can borrow the array buffer data directly when that happens. This is demonstrated in the following benchmarks. For demonstration, we use `CowNDArray` with arrays of shape `(1024, 2048)`, 10 arrays each time.
Data Type | Operation | Python (ms) | Rust (ms) | Speedup |
---|---|---|---|---|
f16 | Serialize | 42.8 | 23.4 | 1.8x |
f16 | Deserialize (`NDArray`) | 21.6 | 20.4 | 1.1x |
f16 | Deserialize (`CowNDArray`) | - | 10.5 | 2.1x |
f32 | Serialize | 87.8 | 43.5 | 2.0x |
f32 | Deserialize (`NDArray`) | 44.2 | 41.4 | 1.1x |
f32 | Deserialize (`CowNDArray`) | - | 34.5 | 1.3x |
Deserialization time went down! The chance of good alignment is about 1/2 for `f16` and about 1/4 for `f32`. The amortized cost of allocation is now lower, and we can see the benefit of zero-copy deserialization. The shortcoming is that `CowNDArray` only supports `rmp_serde::from_slice` (consuming from a slice that is fully in memory), not `rmp_serde::from_read` (consuming from a reader in a streaming way). So you need to keep the serialized bytes around for as long as the array is in use (the compiler will check).
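The borrow-when-aligned, copy-otherwise idea can be sketched with the standard library alone. The function `f32s_from_le_bytes` below is a hypothetical illustration, not this crate's implementation; it assumes the payload holds little-endian `f32` values and uses `std::borrow::Cow` to return either a borrowed view or an owned copy.

```rust
use std::borrow::Cow;
use std::mem::{align_of, size_of};

/// Hypothetical helper: view `bytes` as little-endian f32 values,
/// borrowing when the buffer is suitably aligned and copying otherwise.
fn f32s_from_le_bytes(bytes: &[u8]) -> Cow<'_, [f32]> {
    assert!(bytes.len() % size_of::<f32>() == 0, "length must be a multiple of 4");
    let aligned = (bytes.as_ptr() as usize) % align_of::<f32>() == 0;
    if aligned && cfg!(target_endian = "little") {
        // SAFETY: the pointer is aligned for f32, the length is a whole number
        // of f32 values, every bit pattern is a valid f32, and the returned
        // slice borrows from `bytes`, so it cannot outlive the buffer.
        let floats = unsafe {
            std::slice::from_raw_parts(
                bytes.as_ptr() as *const f32,
                bytes.len() / size_of::<f32>(),
            )
        };
        Cow::Borrowed(floats)
    } else {
        // Misaligned (or big-endian host): pay for an allocation and a copy.
        let copied: Vec<f32> = bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect();
        Cow::Owned(copied)
    }
}

fn main() {
    let values = [1.0f32, 2.5, -3.0];
    // Place the same payload after 0 or 1 padding bytes to vary its alignment.
    for pad in 0..2 {
        let mut buf = vec![0u8; pad];
        for v in values.iter() {
            buf.extend_from_slice(&v.to_le_bytes());
        }
        let view = f32s_from_le_bytes(&buf[pad..]);
        let kind = match &view {
            Cow::Borrowed(_) => "borrowed",
            Cow::Owned(_) => "copied",
        };
        println!("pad {}: {} {:?}", pad, kind, view);
    }
}
```

Because the borrowed case ties the result's lifetime to the input slice, dropping the buffer while the view is still in use is rejected at compile time. The same kind of constraint is why a borrowing type needs a slice that stays in memory rather than a streaming reader.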
If you really want complete zero-copy deserialization, you should try some other format, like Apache Arrow.
There is no good reason to serialize using `Scalar`, because you end up representing primitive types with a lot of metadata. This type exists for compatibility reasons: it helps deserialize scalars that were already serialized this way.
This crate uses types from `ndarray` in its public API. `ndarray` is re-exported in the crate root so that you do not need to specify it as a direct dependency.
Furthermore, this crate is compatible with multiple versions of `ndarray` and therefore depends on a range of semver-incompatible versions, currently `>=0.15, <0.17`. Cargo does not automatically choose a single version of `ndarray` by itself if you depend directly or indirectly on anything but that exact range. In other words, this crate will get `0.16.1` as its own, separate dependency, even if you pin `ndarray` to `0.15.6` in your own project. This might come as a surprise, and you will get compilation errors like:
= note: `ArrayBase<CowRepr<'_, f32>, Dim<IxDynImpl>>` and `ArrayBase<CowRepr<'_, f32>, Dim<IxDynImpl>>` have similar names, but are actually distinct types
note: `ArrayBase<CowRepr<'_, f32>, Dim<IxDynImpl>>` is defined in crate `ndarray`
--> /home/ubuntu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ndarray-0.15.6/src/lib.rs:1268:1
|
1268 | pub struct ArrayBase<S, D>
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
note: `ArrayBase<CowRepr<'_, f32>, Dim<IxDynImpl>>` is defined in crate `ndarray`
--> /home/ubuntu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ndarray-0.16.1/src/lib.rs:1280:1
|
1280 | pub struct ArrayBase<S, D>
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
= note: perhaps two different versions of crate `ndarray` are being used?
It can therefore be necessary to manually unify these dependencies. For example, if you specify the following dependencies
msgpack-numpy = "0.1.3"
ndarray = "0.15.6"
this will currently depend on both version `0.15.6` and `0.16.1` of `ndarray` by default even though `0.15.6` is within the range `>=0.15, <0.17`. To fix this, you can run
cargo update --package ndarray:0.16.1 --precise 0.15.6
to achieve a single dependency on version `0.15.6` of `ndarray`. Check your lock file to verify that this worked.
This project is licensed under the MIT license.