Is it sound to use a slice of numpy::PyReadonlyArray inside pyo3::allow_threads()?
Guitheg opened this issue · 1 comments
Greetings,
My goal is to implement an efficient (and safe) way to wrap a Rust function, using numpy::PyReadonlyArray, with zero-copy and GIL release during computation.
My question is general, but here is a concrete example from my EMA function binding:
```rust
use crate::indicators::ema::core_ema;
use numpy::{PyArray1, PyArrayMethods};
use pyo3::pyfunction;

#[pyfunction(signature = (data, window_size = 14, alpha = None))]
pub(crate) fn ema<'py>(
    py: pyo3::Python<'py>,
    data: numpy::PyReadonlyArray1<'py, f64>,
    window_size: usize,
    alpha: Option<f64>,
) -> pyo3::PyResult<pyo3::Py<numpy::PyArray1<f64>>> {
    // Zero-copy view of the input; this fails if the array is not contiguous.
    let slice = data.as_slice()?;
    // The output has the same length as the input.
    let len = slice.len();
    let py_array_out = PyArray1::<f64>::zeros(py, [len], false);
    // The output array was created just above, so nothing else references it yet.
    let py_array_ptr = unsafe { py_array_out.as_slice_mut()? };
    // Only plain slices and scalars cross into the closure that runs without the GIL.
    py.allow_threads(|| core_ema(slice, window_size, alpha.into(), py_array_ptr))
        .map_err(|e| pyo3::exceptions::PyValueError::new_err(format!("{:?}", e)))?;
    Ok(py_array_out.into())
}
```

I want to release the GIL to enable multithreading, and I wonder how to do it safely, or whether it is possible at all.
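For context, the kernel called inside allow_threads only ever receives plain slices and scalars. The real core_ema is not shown in this issue, so the following is a minimal sketch of what such a kernel could look like. The names `core_ema_sketch` and `EmaError`, the default smoothing factor, and the seeding of the first output value are all assumptions made for illustration.

```rust
/// Hypothetical error type; the real error returned by `core_ema` is not shown.
#[derive(Debug, PartialEq)]
pub enum EmaError {
    LengthMismatch,
    InvalidWindow,
}

/// Hypothetical stand-in for `core_ema`. It takes only slices and scalars,
/// holds no Python objects, and therefore does not need the GIL.
pub fn core_ema_sketch(
    data: &[f64],
    window_size: usize,
    alpha: Option<f64>,
    out: &mut [f64],
) -> Result<(), EmaError> {
    if data.len() != out.len() {
        return Err(EmaError::LengthMismatch);
    }
    if window_size == 0 {
        return Err(EmaError::InvalidWindow);
    }
    if data.is_empty() {
        return Ok(());
    }
    // Assumed default smoothing factor: 2 / (window_size + 1).
    let alpha = alpha.unwrap_or(2.0 / (window_size as f64 + 1.0));
    // Assumed seeding: the first output equals the first input.
    out[0] = data[0];
    let mut prev = data[0];
    for (o, &x) in out[1..].iter_mut().zip(&data[1..]) {
        prev = alpha * x + (1.0 - alpha) * prev;
        *o = prev;
    }
    Ok(())
}
```

Because the function never touches a Python object, the only remaining soundness question is whether the memory behind `data` can change while the kernel reads it.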
The documentation of numpy::PyReadonlyArray says:
An instance of this type ensures that there are no instances of PyReadwriteArray, i.e. that only shared references into the interior of the array can be created safely
Additionally, the documentation of the numpy::borrow module says:
The aim of this module is to ensure that safe Rust code is unable to violate these requirements on its own. We cannot prevent unchecked code - this includes unsafe Rust, Python or other native code like C or Fortran - from violating them. Therefore the responsibility to avoid this lies with the author of that code instead of the compiler
So, in my understanding, I get a reference to input data that is owned by Python and can, in theory, be changed from the outside (e.g. by another Python thread). My conclusion would therefore be 'no', it is not sound. But I'm not sure whether there is a workaround or a good practice, or whether I simply need to avoid using allow_threads with PyReadonlyArray1.
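One conservative workaround, if outside mutation has to be ruled out entirely, would be to give up zero-copy and take an owned snapshot of the input before releasing the GIL. The sketch below shows the idea in plain Rust. `mean_of_snapshot` is a hypothetical helper, not part of pyo3 or numpy, and the mean is only a placeholder for the real computation. In the binding above, the copy would presumably be taken from `data.as_slice()?` before calling allow_threads, and the closure would then read only the copy.

```rust
/// Hypothetical helper: copy first, then compute on the copy only.
pub fn mean_of_snapshot(borrowed: &[f64]) -> Option<f64> {
    // Step 1: take an owned copy. In a binding this would happen while the
    // GIL is still held, so Python code on other threads cannot interleave.
    let owned: Vec<f64> = borrowed.to_vec();

    // Step 2: from here on, `borrowed` is never read again. Later changes to
    // the original buffer cannot affect the result.
    if owned.is_empty() {
        return None;
    }
    Some(owned.iter().sum::<f64>() / owned.len() as f64)
}
```

The price is an O(n) copy and the extra memory, which works against the zero-copy goal, so this only makes sense when the guarantee matters more than the copy. Native code that writes to the buffer without holding the GIL could still race with the copy itself.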
Thanks a lot for your time.
As per the documentation you quoted, the general view here is that properly written C code should also respect the read-only flag, so it's OK to assume that the data of a read-only array will not change. But we cannot categorically guarantee this.
I believe it's a common pattern for numpy itself to release the GIL when doing similar operations, so doing the same from Rust code should be OK in practice.