psf/pyperf

Allow setting updated/new env var *values* on Runner.

ericsnowcurrently opened this issue · 2 comments

In _utils.py the create_environ() helper produces a dict that can be passed as the "env" arg to subprocess.run(), etc. It is used in Master.spawn_worker() (in _master.py). Currently it takes an "inherit_environ" arg (Bool) that corresponds to the "--inherit-environ" flag in the Runner CLI (see Runner.init() in _runner.py). This results in a potentially problematic situation.

The Problem

Let's say you have a benchmark script that internally relies on some environment variable that is defined relative to the commandline args given to the script. This environment variable may be set already or it might not. Regardless, you will be setting it to some new value. To make this work you need to do something like the following:

    # This is a concrete example.
    os.environ["MY_VAR"] = "spam" if runner.args.my_flag else "eggs"
    if runner.args.inherit_environ is None:
        runner.args.inherit_environ = ["MY_VAR"]
    else:
        runner.args.inherit_environ.append("MY_VAR")

However, in some cases you can't leave the env var set (or maybe the env var could cause pyperf to break). Plus things are more complicated if you have more than one such env var.

The Solution

Consequently, in a benchmark script it would be nice to be able to give the actual env var pairs to Runner rather than doing the dance above. Here are some possible approaches to solve the problem:

  • allow Runner.args.inherit_environ to be a dict
    • create_environ() would be updated to do the right thing
    • this is probably too messy to be part of pyperf's public API
  • add something like Runner.env_vars to allow benchmarks to explicitly set env var values to be used in workers
    • create_environ() would grow a new "env" arg or "inherit_environ" would updated as above
  • add Runner.add_env_var(name, value=INHERIT)
    • this is like either of the two above, but probably a better public API for the functionality

FWIW, I ended up using a small helper as a stopgap. (I had to set a different PYTHONPATH value, along with others.) It looks something like this:

def bench_command_env(runner, name, command, env):
    if runner.args.inherit_environ:
        runner.args.inherit_environ.extend(env)
    else:
        runner.args.inherit_environ = list(env)

    before = dict(os.environ)
    os.environ.update(env)
    # At this point any use of env vars defined in "env" should be considered tainted.
    try:
        return runner.bench_command(name, command)
    finally:
        os.environ.clear()
        os.environ.extend(before)

@ericsnowcurrently cc @vstinner
Thank you, Eric!
I recently added a new feature --copy-env which will copy all os.environ.
(Not released yet, plan to release at this week ;)) https://github.com/psf/pyperf/releases/tag/2.2.0
61555e8

I know this feature may not cover all your cases, but please take a look at the new feature.
And also I think that your suggestion is also reasonable, I want to listen to other opinions from @vstinner and @pablogsal