bazelbuild/rules_python

pypi libraries installed to system python are implicitly available to builds

alecbz opened this issue · 19 comments

It looks like any libraries installed to my system python's site-packages are available to bazel, whether or not they are expressed as a dependency to bazel.

This seems like an issue from a reproducible build standpoint. I.e.: it'd be easy to forget to include an external library in bazel, and create implicit dependencies for others using your BUILD files or for any par files or similar archives you build.

I believe this is a problem with the Bazel core rules, and maybe google/subpar too. @damienmg @duggelz

It is a consequence of us trying to make it easy for people to use Bazel on their system. Specifying --experimental_use_strict_env should do the trick.

Hmm, after watching this last night, I'm not so sure it is that simple (unless you guys reimplemented virtualenv?).

@damienmg Can you move this to the main repo or assign this to the appropriate Bazel owner?

@damienmg --experimental_use_strict_env doesn't seem to exist AFAICT? My bazel version is:

$ bazel version
Build label: 0.6.1-homebrew
Build target: bazel-out/darwin_x86_64-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Oct 6 02:36:58 2017 (1507257418)
Build timestamp: 1507257418
Build timestamp as int: 1507257418

Is it a new flag? I also don't see it in the source though? https://github.com/bazelbuild/bazel/search?utf8=%E2%9C%93&q=experimental_use_strict_env&type= (unless there's a different repo where flags get defined?)

I believe he may have meant "--experimental_strict_action_env" rather than "--experimental_use_strict_env"?

It is listed in the CLI reference.

drigz commented

With 0.7.0, --experimental_strict_action_env doesn't work for me, but activating an empty virtualenv before bazel run does:

> echo > WORKSPACE
> cat BUILD.bazel 
py_binary(
    name = "main",
    srcs = ["main.py"],
)
> cat main.py 
import google
print google.__path__
> bazel run :main
[SNIP: build output]
['/usr/local/lib/python2.7/dist-packages/google']
> bazel run --experimental_strict_action_env :main
[SNIP: build output]
['/usr/local/lib/python2.7/dist-packages/google']
> virtualenv -p python2 empty_env
> source empty_env/bin/activate
(empty_env) > bazel run :main
[SNIP: build output]
Traceback (most recent call last):
  File "/usr/local/google/home/rodrigoq/.cache/bazel/_bazel_rodrigoq/bca9229df5db49ddbafab09d12924513/execroot/__main__/bazel-out/local-fastbuild/bin/main.runfiles/__main__/main.py", line 1, in <module>
    import google
  File "/usr/local/buildtools/current/sitecustomize/sitecustomize.py", line 152, in SetupPathsAndImport
    return real_import(name, globals, locals, fromlist, level)
ImportError: No module named google
ERROR: Non-zero return code '1' from command: Process exited with status 1

evanj commented

My recollection is that PEX removes site-packages from sys.path in an attempt to avoid this problem. In theory, it only uses the "standard library" from the Python interpreter, and assumes the PEX zip contains all other required Python libraries. See minimum_sys_path for the weirdness it does: https://github.com/pantsbuild/pex/blob/39bcfaaa1e64c7edd57c8514c8086e624dbd1c56/pex/pex.py#L191

I suspect subpar could probably include a similar mechanism which would probably resolve this.
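To make the idea concrete, here is a minimal sketch of filtering third-party directories out of a path list. It is not PEX's actual `minimum_sys_path` logic and not anything subpar ships; the helper name `without_site_dirs` and the set of directory names are assumptions chosen for illustration.

```python
import os

# Directory names that conventionally hold third-party packages.
# "dist-packages" is the Debian/Ubuntu variant seen in the output above.
SITE_DIR_NAMES = {"site-packages", "dist-packages"}


def without_site_dirs(paths):
    """Return the entries of `paths` that do not live under a site directory."""
    kept = []
    for entry in paths:
        # Split the normalized path into components and look for a site dir.
        parts = set(os.path.normpath(entry).split(os.sep))
        if parts & SITE_DIR_NAMES:
            continue
        kept.append(entry)
    return kept


example = [
    "/usr/lib/python2.7",
    "/usr/local/lib/python2.7/dist-packages",
    "/usr/local/lib/python2.7/dist-packages/google",
]
filtered = without_site_dirs(example)
```

A launcher could, in principle, assign the result back to `sys.path` before importing user code. Modules already imported by `site` would remain in `sys.modules`, so a real implementation would need to handle more than this.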

As a longer-term fix, these rules could start creating virtualenv trees, instead of or alongside the runfiles trees, which isolate a Python interpreter and its standard library from other third-party packages.
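As a rough illustration of what an isolated environment looks like, the standard library `venv` module can create one that excludes system site-packages. This is only a sketch of the isolation property, not a proposal for how the rules would build such trees; the function name `create_isolated_env` is hypothetical.

```python
import os
import tempfile
import venv


def create_isolated_env(env_dir):
    """Create a virtual environment that cannot see system site-packages.

    Returns the path to the environment's pyvenv.cfg file.
    """
    # with_pip=False avoids needing ensurepip; nothing is installed.
    builder = venv.EnvBuilder(system_site_packages=False, with_pip=False)
    builder.create(env_dir)
    return os.path.join(env_dir, "pyvenv.cfg")


with tempfile.TemporaryDirectory() as tmp:
    cfg_path = create_isolated_env(os.path.join(tmp, "env"))
    with open(cfg_path) as f:
        cfg_text = f.read()
```

The generated `pyvenv.cfg` records `include-system-site-packages = false`, which is what keeps the interpreter in that environment from picking up globally installed libraries.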

Would it be any easier to check at build-time for any undeclared dependencies, instead of modifying the artifact to not have access to them at run-time?

I think pants does a build-time check like this via the lint goal: it creates a virtualenv and imports each file, which in turn runs all of that file's imports (though I imagine any imports that aren't at file level would get missed). That is in addition to pexes not having access to undeclared dependencies, per @evanj's comment.
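A build-time check along these lines could be sketched with static parsing instead of importing inside a virtualenv. This swaps in a different technique from the one described for pants, and the function names, the `declared` set, and the `stdlib` set are all assumptions for illustration. Like the import-based approach, it only sees file-level imports.

```python
import ast


def top_level_imports(source):
    """Return the top-level package names imported at file level in `source`."""
    tree = ast.parse(source)
    names = set()
    for node in tree.body:  # module-level statements only
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports, which refer to the current package.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def undeclared_imports(source, declared, stdlib):
    """Return imported packages that are neither declared deps nor stdlib."""
    return top_level_imports(source) - set(declared) - set(stdlib)


source = "import os\nimport google.protobuf\nfrom requests import get\n"
missing = undeclared_imports(source, declared={"requests"}, stdlib={"os"})
```

Here `missing` contains `google`, the one package that is imported but not declared. Note that `ast.parse` expects source valid for the interpreter running the check, so Python 2 files such as the `main.py` above would need a Python 2 interpreter or a different parser.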

@duggelz @AlecBenzer it looks like this is possible using Python toolchains, see https://gist.github.com/NathanHowell/5cf4a353a8dd3a1025e682c4707d5bac for an example

@pstradomski yeah I can do that, I have a couple other things I want to add to the poetry rules too so maybe I'll drop it in that repo later this week

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!

Don't close. This is still a problem when using the non-hermetic default Python toolchain. Once toolchains are better supported here, we can document the problem with the default toolchain, direct users to a sane way to create a hermetic toolchain, and then close this issue.

It is possible to avoid loading site-packages using python -S but we would have to upstream a change like that to the bazelbuild python stub script.
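For reference, the effect of `-S` can be observed from a small script. This is a sketch that only inspects the interpreter flag; it does not touch the Bazel stub script, and the helper name `no_site_flag` is hypothetical.

```python
import subprocess
import sys


def no_site_flag(extra_args):
    """Run a child interpreter with `extra_args` and return sys.flags.no_site."""
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", "import sys; print(sys.flags.no_site)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


# With -S the interpreter skips importing the site module on startup,
# so site-packages directories are not appended to sys.path.
flag_with_s = no_site_flag(["-S"])
```

With `-S`, the flag is set and the `site` module is not imported at startup. Entries that come from `PYTHONPATH` are unaffected by `-S`, so this alone would not cover every source of leaked dependencies.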

What is the status on this issue?

This is very stale. Closing.