pytest-dev/py

numpy save raises UnicodeDecodeError when file is a `py._path.local.LocalPath`

Embeddave opened this issue · 3 comments

Hi maintainers, I'm using pytest's tmpdir_factory to create a bunch of paths to mocked numpy .npy files (just random number arrays) that I then return from a fixture.

But when I call np.save to save the arrays to the paths built with a directory from tmpdir_factory, I'm getting a UnicodeDecodeError:

*** UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 0: invalid start byte

Turns out this is because my npy_path is a py._path.local.LocalPath instance.

My fixture looks something like this:

from pathlib import Path
import tempfile

import numpy as np
import pytest

rng = np.random.default_rng()

@pytest.fixture()
def fake_arr_path_factory(tmpdir_factory):

    def _fake_arr_path_factory(size=(32, 32, 1), n_paths=10):
        tmpdir = tmpdir_factory.mktemp(basename="fprints")

        npy_paths = []
        for _ in range(n_paths):
            fake_arr = rng.standard_normal(size=size)
            with tempfile.NamedTemporaryFile() as temp:
                # abusing NamedTemporaryFile to generate random file names
                name = Path(temp.name).name
            npy_path = tmpdir / name
            np.save(npy_path, fake_arr)
            npy_paths.append(npy_path)
        return npy_paths

    return _fake_arr_path_factory

So when the returned factory function calls np.save, I get that error:

np.save(npy_path, fake_arr)
*** UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 0: invalid start byte

A simple workaround is to convert the path to str before saving:

np.save(str(npy_path), fake_arr)

I'm not sure how concerned you all are with this, but I didn't find any related issues, so I thought I'd report it in case anyone else runs into the same thing.

I get a related error when calling np.load with a LocalPath as well:

E       TypeError: open() argument 2 must be str, not int
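
A possible explanation for both errors, sketched without numpy or py installed. It assumes (not confirmed in this thread) that np.save and np.load treat any object with a write or read attribute as an open file, and that LocalPath's own write/read methods take a mode argument. WritablePath and treated_as_file_object are hypothetical stand-ins, not real API. The byte 0x93 in the traceback is the first byte of the .npy magic string.

```python
import os
import tempfile

NPY_MAGIC = b"\x93NUMPY"  # leading bytes of every .npy file


class WritablePath:
    """Hypothetical stand-in for a path object that also has write()/read()."""

    def __init__(self, path):
        self.path = path

    def __fspath__(self):
        return self.path

    def write(self, data, mode="w"):
        # Assumed behavior: in text mode, bytes are decoded before writing.
        if "b" not in mode and isinstance(data, bytes):
            data = data.decode("utf-8")
        with open(self.path, mode) as f:
            f.write(data)

    def read(self, mode="r"):
        with open(self.path, mode) as f:
            return f.read()


def treated_as_file_object(obj):
    # Duck-typing check of the kind a save function might use (assumption).
    return hasattr(obj, "write")


path = WritablePath(os.path.join(tempfile.mkdtemp(), "arr.npy"))
print(treated_as_file_object(path))             # True
print(treated_as_file_object(os.fspath(path)))  # False

try:
    path.write(NPY_MAGIC)  # binary header passed to a text-mode write
except UnicodeDecodeError as exc:
    print(type(exc).__name__, exc)

try:
    path.read(6)  # a byte count ends up in the `mode` parameter
except TypeError as exc:
    print(type(exc).__name__, exc)
```

If that assumption holds, converting to str (or os.fspath) works because the resulting string no longer looks like a file object.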

Please migrate to tmp_path_factory. pytest is working on removing LocalPath usage so that we can deprecate it and eventually phase it out.
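
A minimal sketch of what that migration might look like for the fixture in this issue, assuming a numpy version that accepts pathlib.Path objects. tmp_path_factory.mktemp returns a pathlib.Path, so np.save can take it directly. The helper make_fake_arr_paths and the uuid-based file names are illustrative choices, not part of the original report. The explicit .npy suffix matters because np.save appends .npy to file names that lack it, which would otherwise leave the returned paths pointing at files that do not exist.

```python
import uuid
from pathlib import Path

import numpy as np
import pytest

rng = np.random.default_rng()


def make_fake_arr_paths(base_dir, size=(32, 32, 1), n_paths=10):
    """Save n_paths random arrays under base_dir and return their paths."""
    npy_paths = []
    for _ in range(n_paths):
        fake_arr = rng.standard_normal(size=size)
        # Random, collision-resistant file name with an explicit .npy suffix.
        npy_path = Path(base_dir) / f"{uuid.uuid4().hex}.npy"
        np.save(npy_path, fake_arr)
        npy_paths.append(npy_path)
    return npy_paths


@pytest.fixture()
def fake_arr_path_factory(tmp_path_factory):
    def _fake_arr_path_factory(size=(32, 32, 1), n_paths=10):
        # mktemp returns a pathlib.Path rather than a LocalPath.
        tmp_dir = tmp_path_factory.mktemp("fprints")
        return make_fake_arr_paths(tmp_dir, size=size, n_paths=n_paths)

    return _fake_arr_path_factory
```

Keeping the array-writing logic in a plain function means it can also be called outside pytest, for example with a directory from tempfile.mkdtemp().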

I see, thank you for clarifying.
Will do.
Going ahead and closing this.