Some tests fail when using the release tarball

Question

Some tests fail when using the release tarball

iyanmv opened this issue 10 months ago · 5 comments

Summary:

Some tests fail when using the release tarball.

Description:

In order to create a package for a distribution, I download the latest release tarball, build it (or in this case create a wheel package file cmdstanpy-1.2.0-py3-none-any.whl) and then run the tests on an isolated environment that only installs that wheel tarball together with the required dependencies (cmdstan, numpy, pandas, tqdm and stanio).

I'm running the tests with:

PYTHONPATH="test_dir/$_site_packages:$PYTHONPATH" CMDSTAN="/opt/cmdstan" pytest test

And this are the results:

==> Starting check()...
============================= test session starts ==============================
platform linux -- Python 3.11.6, pytest-7.4.4, pluggy-1.3.0
rootdir: /build/python-cmdstanpy/src/cmdstanpy-1.2.0
collected 280 items

test/test_cmdstan_args.py ..................................             [ 12%]
test/test_compiler_opts.py ............                                  [ 16%]
test/test_compliance.py ......                                           [ 18%]
test/test_cxx_installation.py s.ss                                       [ 20%]
test/test_generate_quantities.py .F..F.F..F.F..F...F..F.                 [ 28%]
test/test_install_cmdstan.py ....                                        [ 29%]
test/test_laplace.py ......                                              [ 31%]
test/test_log_prob.py ....                                               [ 33%]
test/test_metadata.py .                                                  [ 33%]
test/test_model.py ............F...............s..                       [ 44%]
test/test_optimize.py ......................                             [ 52%]
test/test_pathfinder.py .....                                            [ 54%]
test/test_runset.py .......                                              [ 56%]
test/test_sample.py ....F.....F.........F........................F.F.F.. [ 75%]
....F                                                                    [ 77%]
test/test_utils.py ..............................sss.................    [ 95%]
test/test_variational.py ..............E                                 [100%]

==================================== ERRORS ====================================

Some tests fail to run because they assume that git is installed and that the directory is a git repo in order to delete compiled models and output results (see cleanup_test_files() from module conftest.py).

___________________ ERROR at teardown of test_serialization ____________________

    @pytest.fixture(scope='session', autouse=True)
    def cleanup_test_files():
        """Remove compiled models and output files after test run."""
        yield
>       subprocess.Popen(
            ['git', 'clean', '-fX', DATAFILES_PATH],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.STDOUT,
        )

test/conftest.py:15: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/lib/python3.11/subprocess.py:1026: in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <Popen: returncode: 255 args: ['git', 'clean', '-fX', '/build/python-cmdstan...>
args = ['git', 'clean', '-fX', '/build/python-cmdstanpy/src/cmdstanpy-1.2.0/test/data']
executable = b'git', preexec_fn = None, close_fds = True, pass_fds = ()
cwd = None, env = None, startupinfo = None, creationflags = 0, shell = False
p2cread = -1, p2cwrite = -1, c2pread = -1, c2pwrite = 25, errread = -1
errwrite = 25, restore_signals = True, gid = None, gids = None, uid = None
umask = -1, start_new_session = False, process_group = -1

    def _execute_child(self, args, executable, preexec_fn, close_fds,
                       pass_fds, cwd, env,
                       startupinfo, creationflags, shell,
                       p2cread, p2cwrite,
                       c2pread, c2pwrite,
                       errread, errwrite,
                       restore_signals,
                       gid, gids, uid, umask,
                       start_new_session, process_group):
        """Execute program (POSIX version)"""
    
        if isinstance(args, (str, bytes)):
            args = [args]
        elif isinstance(args, os.PathLike):
            if shell:
                raise TypeError('path-like args is not allowed when '
                                'shell is true')
            args = [args]
        else:
            args = list(args)
    
        if shell:
            # On Android the default shell is at '/system/bin/sh'.
            unix_shell = ('/system/bin/sh' if
                      hasattr(sys, 'getandroidapilevel') else '/bin/sh')
            args = [unix_shell, "-c"] + args
            if executable:
                args[0] = executable
    
        if executable is None:
            executable = args[0]
    
        sys.audit("subprocess.Popen", executable, args, cwd, env)
    
        if (_USE_POSIX_SPAWN
                and os.path.dirname(executable)
                and preexec_fn is None
                and not close_fds
                and not pass_fds
                and cwd is None
                and (p2cread == -1 or p2cread > 2)
                and (c2pwrite == -1 or c2pwrite > 2)
                and (errwrite == -1 or errwrite > 2)
                and not start_new_session
                and process_group == -1
                and gid is None
                and gids is None
                and uid is None
                and umask < 0):
            self._posix_spawn(args, executable, env, restore_signals,
                              p2cread, p2cwrite,
                              c2pread, c2pwrite,
                              errread, errwrite)
            return
    
        orig_executable = executable
    
        # For transferring possible exec failure from child to parent.
        # Data format: "exception name:hex errno:description"
        # Pickle is not used; it is complex and involves memory allocation.
        errpipe_read, errpipe_write = os.pipe()
        # errpipe_write must not be in the standard io 0, 1, or 2 fd range.
        low_fds_to_close = []
        while errpipe_write < 3:
            low_fds_to_close.append(errpipe_write)
            errpipe_write = os.dup(errpipe_write)
        for low_fd in low_fds_to_close:
            os.close(low_fd)
        try:
            try:
                # We must avoid complex work that could involve
                # malloc or free in the child process to avoid
                # potential deadlocks, thus we do all this here.
                # and pass it to fork_exec()
    
                if env is not None:
                    env_list = []
                    for k, v in env.items():
                        k = os.fsencode(k)
                        if b'=' in k:
                            raise ValueError("illegal environment variable name")
                        env_list.append(k + b'=' + os.fsencode(v))
                else:
                    env_list = None  # Use execv instead of execve.
                executable = os.fsencode(executable)
                if os.path.dirname(executable):
                    executable_list = (executable,)
                else:
                    # This matches the behavior of os._execvpe().
                    executable_list = tuple(
                        os.path.join(os.fsencode(dir), executable)
                        for dir in os.get_exec_path(env))
                fds_to_keep = set(pass_fds)
                fds_to_keep.add(errpipe_write)
                self.pid = _fork_exec(
                        args, executable_list,
                        close_fds, tuple(sorted(map(int, fds_to_keep))),
                        cwd, env_list,
                        p2cread, p2cwrite, c2pread, c2pwrite,
                        errread, errwrite,
                        errpipe_read, errpipe_write,
                        restore_signals, start_new_session,
                        process_group, gid, gids, uid, umask,
                        preexec_fn, _USE_VFORK)
                self._child_created = True
            finally:
                # be sure the FD is closed no matter what
                os.close(errpipe_write)
    
            self._close_pipe_fds(p2cread, p2cwrite,
                                 c2pread, c2pwrite,
                                 errread, errwrite)
    
            # Wait for exec to fail or succeed; possibly raising an
            # exception (limited in size)
            errpipe_data = bytearray()
            while True:
                part = os.read(errpipe_read, 50000)
                errpipe_data += part
                if not part or len(errpipe_data) > 50000:
                    break
        finally:
            # be sure the FD is closed no matter what
            os.close(errpipe_read)
    
        if errpipe_data:
            try:
                pid, sts = os.waitpid(self.pid, 0)
                if pid == self.pid:
                    self._handle_exitstatus(sts)
                else:
                    self.returncode = sys.maxsize
            except ChildProcessError:
                pass
    
            try:
                exception_name, hex_errno, err_msg = (
                        errpipe_data.split(b':', 2))
                # The encoding here should match the encoding
                # written in by the subprocess implementations
                # like _posixsubprocess
                err_msg = err_msg.decode()
            except ValueError:
                exception_name = b'SubprocessError'
                hex_errno = b'0'
                err_msg = 'Bad exception data from child: {!r}'.format(
                              bytes(errpipe_data))
            child_exception_type = getattr(
                    builtins, exception_name.decode('ascii'),
                    SubprocessError)
            if issubclass(child_exception_type, OSError) and hex_errno:
                errno_num = int(hex_errno, 16)
                child_exec_never_called = (err_msg == "noexec")
                if child_exec_never_called:
                    err_msg = ""
                    # The error must be from chdir(cwd).
                    err_filename = cwd
                else:
                    err_filename = orig_executable
                if errno_num != 0:
                    err_msg = os.strerror(errno_num)
>               raise child_exception_type(errno_num, err_msg, err_filename)
E               FileNotFoundError: [Errno 2] No such file or directory: 'git'

/usr/lib/python3.11/subprocess.py:1950: FileNotFoundError
----------------------------- Captured stderr call -----------------------------
13:25:54 - cmdstanpy - INFO - Chain [1] start processing
13:25:54 - cmdstanpy - INFO - Chain [1] done processing
------------------------------ Captured log call -------------------------------
DEBUG    cmdstanpy:compilation.py:415 found newer exe file, not recompiling
DEBUG    cmdstanpy:model.py:2041 idx 0
DEBUG    cmdstanpy:model.py:2042 running CmdStan, num_threads: 1
DEBUG    cmdstanpy:model.py:2054 CmdStan args: ['/build/python-cmdstanpy/src/cmdstanpy-1.2.0/test/data/variational/eta_should_be_big', 'random', 'seed=999999', 'output', 'file=/tmp/tmpuie7rvkx/eta_should_be_bign_77t1tj/eta_should_be_big-20240118132554.csv', 'method=variational', 'algorithm=meanfield', 'adapt', 'engaged=1']
INFO     cmdstanpy:model.py:2057 Chain [1] start processing
INFO     cmdstanpy:model.py:2114 Chain [1] done processing
=================================== FAILURES ===================================
_____________________________ test_pd_xr_agreement _____________________________

    def test_pd_xr_agreement():
        # fitted_params sample - list of filenames
        goodfiles_path = os.path.join(DATAFILES_PATH, 'runset-good', 'bern')
        csv_files = []
        for i in range(4):
            csv_files.append('{}-{}.csv'.format(goodfiles_path, i + 1))
    
        # gq_model
        stan = os.path.join(DATAFILES_PATH, 'bernoulli_ppc.stan')
        model = CmdStanModel(stan_file=stan)
        jdata = os.path.join(DATAFILES_PATH, 'bernoulli.data.json')
    
        bern_gqs = model.generate_quantities(data=jdata, previous_fit=csv_files)
    
        draws_pd = bern_gqs.draws_pd(inc_sample=True)
>       draws_xr = bern_gqs.draws_xr(inc_sample=True)

Others fail because they assume xarray is installed, when it's not marked as a test dependency (only optional).

To solve the second issue I will simply mark xarray as a dependency instead of optional dependency. For the first, I could write a patch but I first want to hear from you if it makes sense. My idea is to modify the function cleanup_test_files() to check if git is installed and if DATAFILES_PATH is a git repo. If both are true, then the current code can be used. Otherwise, nothing is done.

Additional Information:

For context, this is the (provisional) PKGBUILD that I wrote:

# Maintainer: Iyán Méndez Veiga <me (at) iyanmv (dot) com>
pkgname=python-cmdstanpy
_name=${pkgname#python-}
pkgver=1.2.0
pkgrel=1
pkgdesc="A lightweight interface to Stan"
arch=('any')
url="https://github.com/stan-dev/cmdstanpy"
license=('BSD-3-Clause')
depends=(
    'cmdstan'
    'python-numpy'
    'python-pandas'
    'python-tqdm'
    'python-stanio'
    'python-xarray'
)
makedepends=(
    'python-build'
    'python-installer'
    'python-wheel'
)
checkdepends=('python-pytest')
source=("${_name}-${pkgver}.tar.gz::https://github.com/stan-dev/${_name}/archive/refs/tags/v${pkgver}.tar.gz")
b2sums=('34cd1fff7e93b78f3395662040ed3a17697cbdf0f478fd4279ddb7b3704e9887d44d99040059009a6999c61d6ee234105f1404fcdef27a1a3847164d595c73bb')

build() {
    cd "${_name}-${pkgver}"
    python -m build --wheel --no-isolation
}

check() {
    local _site_packages=$(python -c "import site; print(site.getsitepackages()[0])")
    cd "${_name}-${pkgver}"
    python -m installer --destdir=test_dir dist/*.whl
    PYTHONPATH="test_dir/$_site_packages:$PYTHONPATH" CMDSTAN="/opt/cmdstan" pytest test
}

package() {
    cd "${_name}-${pkgver}"
    python -m installer --destdir="$pkgdir" dist/*.whl
    install -D -m644 LICENSE.md "${pkgdir}/usr/share/licenses/${pkgname}/LICENSE"
    # Delete useless python scripts
    rm -rf "${pkgdir}/usr/bin"
}

Current Version:

cmdstan 2.34.0 (with patch stan-dev/cmdstan#1239 included)
cmdstanpy 1.2.0

Answer 1 · 2024-01-18T12:54:47.000Z

Here I attach the logs from pytest (removing the ones caused by missing xarray)
test_errors.txt

Answer 2 · 2024-01-18T14:43:48.000Z

Hi @iyanmv -

The dependencies on both xarray and git in testing would be nice to make more clear. Your suggestions both sound good.

The other issues look like they are either the ones resolved by #723 (which will be released in CmdStanPy 1.2.1 after the CmdStan release dust settles, probably next week) or are caused by your CmdStan source files being in a location that does not have read/write access

Answer 3 · 2024-01-18T15:44:42.000Z

Indeed that patch fixes one test, but not the rest. I think you are right that most of them are permission issues.

To be honest, I don't know how to package this without making too many changes. Are you aware of any distro offering this in their repos? I think the simplest solution for now is to copy the CmdStan directory to a writable temp path just for the tests.

Answer 4 · 2024-01-18T15:49:24.000Z

I know very little about distro packaging, unfortunately. It is a restriction of CmdStan's current make-based build system that CmdStan needs to be able to write to its own directory, which I know makes several packaging standards unhappy. Your suggestion would probably fix the tests, but anyone actually trying to use CmdStanPy would get similar errors in their own code

Out of tree builds are one of the items on the wishlist for a new build system (probably based on CMake or something similar) but there's no timeline and not a ton of free effort to spare there, unfortunately. The make-based system works just barely enough that a replacement isn't mandatory, so it languishes on

Answer 5 · 2024-01-19T14:41:45.000Z

Since it sounds like all the failures are accounted for, I'm going to go ahead and close this issue. Sorry it wasn't a bit more satisfyingly resolved!