smarr/ReBench

Making environment handling more user-friendly

smarr opened this issue · 3 comments

smarr commented

Since #174, we are very strict in environment variables.

As per c5a2157 which is this line: https://github.com/smarr/ReBench/blob/master/rebench/model/exp_run_details.py#L59 the environment is per default empty.

In practice this means, it is completely empty.
There's no PATH, no nothing.

This means, the configuration file needs to contain everything, including the environment configuration.

This is in many ways unexpected and tedious, and causes a lot of unexpected debugging.

The big question for me is, how should we resolve this.

I see the following possible approaches:

  1. add a new flag to always use the system's environment
  2. use the -D flag that disables denoise to also disable the restricting of the environment

Other improvements:

  • make ReBench errors more helpful, currently the environment is not printed but only the working directory and the command
  • explicitly point out that the PATH might be missing or incomplete if a command fails with an error that indicates a missing binary
  1. add a new flag to always use the system's environment

So long as it's a flag and not the default, sure, that would definitely see some use. Security isn't a major concern for the average user, I think it's fine to have an option to circumvent the environment removal.

  1. use the -D flag that disables denoise to also disable the restricting of the environment

Dunno. Bit unintuitive... And I'm not too keen on mixing denoise with the environment in such a way, when really you may want to disable one and not the other.

make ReBench errors more helpful, currently the environment is not printed but only the working directory and the command

+1 for printing the environment (at least with the debug flag)

explicitly point out that the PATH might be missing or incomplete if a command fails with an error that indicates a missing binary

Definitely that one, although it would only work for missing binaries and not so easily for more obscure errors that can be caused by the unexpected lack of an env. Still, I like it.

Alternatively, would there be any value in running experiments by default with no environment EXCEPT a default PATH? I.e. /bin:/usr/bin:/usr/sbin:/usr/local/bin . All directories where you need to have extra permissions to create files in. It's not as secure as there being no PATH like it currently is, but if we're assuming there's already harmful binaries on the machine that were put in those directories, you've already got a big problem.

Maybe my thinking is backwards, but I would think it's an acceptable compromise. Hell, running sudo bash has a default PATH in its env, it would be odd otherwise.

Definitely needs to explicitly mention "Running with default PATH "the default path"" if you go for that.

smarr commented

I am not too concerned about security really. We throw security out of the window with denoise in many ways already.

So, it's mostly about reproducibility.

And yes, a default PATH that's the same everywhere seems sensible.
The one question then is how to treat the env: {PATH: ''} setting, would think we still need to replace the PATH. Not sure I am prepared to implement env: {PATH: '/foo/:$PATH'} and variants though...

Oh wow, I always assumed it was security reasons since that's usually why people tiptoe around the value of PATH. Makes sense that it's reproducibility in this context.

Not sure I follow the setting bit, but shouldn't Rebench assume that any provided env variable with no value ought not to be set at all? If env: {PATH: ''} is provided, then shouldn't its behaviour be to unset PATH if it was in the env, or do nothing otherwise?

Not sure I am prepared to implement env: {PATH: '/foo/:$PATH'} and variants though...

I'm really not sure that feature is needed. Kinda goes against your idea of reproducibility since it opens you up to many a weird bug ("oh look I thought my binary was in /foo/ but turns out it wasn't, so instead of rebench reporting that I messed up, it's been using the wrong version of my binary this whole time instead of failing")

The only benefit I see is that we assume there's a default PATH, then you can extend the default PATH slightly more easily by typing /foo/:$PATH instead of /foo/:/bin:/usr/bin:/usr/sbin:/usr/local/bin (which isn't crazy value)