bazelbuild/rules_python

"Cross compilation" of py_binary/py_image/py_library targets

nimish opened this issue · 22 comments

Hi,

I have a py_binary that depends on a Python pip library (grpcio) that bundles a native extension. This means that to create a Linux container I'd need the pip_import rule to download the manylinux wheel, not the host one (macOS in my case).

Is there a way to force this? Otherwise py_image will happily just bundle up wheels with darwin native libs. py_binary will also only make host-runnable things.
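
For readers unfamiliar with why the host wheel is the wrong one: a wheel's file name ends in a platform tag, and that tag is what distinguishes the macOS build from the manylinux build. Below is a minimal sketch that splits a wheel file name into its tags and checks whether it could run in a Linux container. The file names are illustrative, and the helper names `parse_wheel_filename` and `runs_on_linux` are hypothetical, not part of rules_python or pip.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Five parts normally, six when an optional build tag follows the version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)


def runs_on_linux(filename: str) -> bool:
    """Return True if any platform tag is Linux-compatible or platform-independent."""
    tags = parse_wheel_filename(filename).platform_tag.split(".")
    linux_prefixes = ("manylinux", "musllinux", "linux_")
    return any(tag == "any" or tag.startswith(linux_prefixes) for tag in tags)


# Illustrative file names for the same package built for two platforms.
host_wheel = "grpcio-1.30.0-cp38-cp38-macosx_10_9_x86_64.whl"
target_wheel = "grpcio-1.30.0-cp38-cp38-manylinux2010_x86_64.whl"

print(parse_wheel_filename(host_wheel).platform_tag)
print(runs_on_linux(host_wheel), runs_on_linux(target_wheel))
```

A check like this only inspects the file name; it does not verify the native libraries inside the wheel.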

ali5h commented

use https://github.com/ali5h/rules_pip/ and use pip_install(["--platform=linux_x86_64"])
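
The `--platform` flag above looks like pip's own option of the same name. As a rough sketch of what a comparable plain-pip invocation might look like outside Bazel, the helper below builds a `pip download` command line for a non-host platform. The function name `pip_download_command`, the file paths, and the `manylinux2014_x86_64` tag are assumptions for illustration; how rules_pip forwards its arguments internally is not shown in this thread. pip does require `--only-binary=:all:` (or `--no-deps`) whenever `--platform` is used, since it cannot build an sdist for a foreign platform.

```python
import sys
from typing import List


def pip_download_command(
    requirements_file: str,
    dest_dir: str,
    platform_tag: str,
    python_version: str,
) -> List[str]:
    """Build a `pip download` argv that fetches wheels for a non-host platform."""
    return [
        sys.executable, "-m", "pip", "download",
        "--requirement", requirements_file,
        "--dest", dest_dir,
        "--platform", platform_tag,
        "--python-version", python_version,
        # Wheels only: pip refuses to restrict the platform otherwise.
        "--only-binary=:all:",
    ]


# Hypothetical paths and target values; nothing is executed here.
cmd = pip_download_command("requirements.txt", "wheels", "manylinux2014_x86_64", "3.8")
print(" ".join(cmd[1:]))
```

The list can be passed to `subprocess.run(cmd, check=True)` to actually perform the download.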

@ali5h this works (thanks!) but is there integration into the bazel platform selection functions? I don't want separate targets for Linux and Mac.

E.g. Something that works with https://docs.bazel.build/versions/master/platforms.html

ali5h commented

you can define multiple piplib repos for different platforms and use select to pick the correct one
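
A sketch of what that suggestion could look like: the small Python helper below prints the `select()` expression one might paste into a `deps` attribute. The repository names `pip_linux` and `pip_macos`, the `@repo//package` label layout, and the helper `select_dep` are all assumptions for illustration; the real label format depends on the pip rules in use. The `@platforms//os:linux` and `@platforms//os:macos` keys are the standard Bazel platform constraint values.

```python
from typing import Dict


def select_dep(package: str, repos_by_constraint: Dict[str, str]) -> str:
    """Render a Starlark select() choosing a per-platform pip repo for one package."""
    lines = ["select({"]
    for constraint, repo in sorted(repos_by_constraint.items()):
        # Assumed label layout: one target per package at the repo root.
        lines.append(f'    "{constraint}": ["@{repo}//{package}"],')
    lines.append("})")
    return "\n".join(lines)


snippet = select_dep(
    "grpcio",
    {
        "@platforms//os:linux": "pip_linux",
        "@platforms//os:macos": "pip_macos",
    },
)
print(snippet)
```

With this shape there is a single py_binary target, and the branch is chosen from the target platform at analysis time rather than by having separate Linux and Mac targets.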

Is that documented and supported anywhere? That would be the ideal case, for bazel to automatically pick up the right pip repo for the right target platform.

E: should it not just be built in to the py_binary/py_library rule, to select the right target platform libs automatically?

Would really, really like to see this as well.

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!

Help

Seems #510 and #531 are addressing the same issue? I'd love to have that functionality, so I'm excited to see folks interested enough to open PRs 😄

Hopefully @alexeagle and @meastham can combine forces to get the awesome functionality over the line! 🚀

Friendly ping to @alexeagle and @meastham, are either of you able to take another look at your pull requests? I'd love to have that functionality 😃

I wonder if pipenv could be used to facilitate generating cross platform dependency graphs. Does anyone have experience with the tool?

Sorry for the long delay on this! I did some investigation on some of the open questions, I'll see if we can find a path to getting this merged.

I don't think the PRs are addressing exactly the same issue. #531 allows having different requirements files for different platforms, but does not allow downloading wheels for a platform other than the host platform. #510 allows downloading wheels for arbitrary platforms ("cross compiling"), but requires every platform to share one requirements file, which has potential problems. Conceptually they could be compatible, but the implementations would need to be modified to harmonize the way platform selection is done.

I'm all for the wheel-only approach taken in #510. I don't want native compilation happening at all in the loading phase, as it can lead to hard-to-diagnose cache misses between machines (in my experience).

Thinking out loud here. If adding a wheel-only option, can we remove pip from the build-time process entirely? Instead of using pip download, why not just use bazel's built-in http tools to download the wheel? Instead of using pip-compile to generate a locked requirements.txt, write something that generates a .bzl file with compiled dependencies in a more bazel-centric fashion. Maybe resolvers from poetry or pipenv are used for this, which apparently support multi-platform resolve. Dependencies between libraries - including platform specific - could be explicit in the generated file and defined using bazel's own select or whatever.

I'm sure there are gotchas here (and significant work), but maybe detaching from pip could open up new avenues.
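
To make the idea concrete, here is a minimal sketch of the "generate a .bzl file" step: given an already-resolved list of wheels, it renders a Starlark macro that fetches each one with Bazel's `http_file` rule instead of pip. The resolution itself (poetry, pipenv, or anything else) is out of scope here. The `LockedWheel` type, the `pypi_wheels` macro name, the repository name, the URL, and the checksum are all hypothetical placeholders.

```python
from typing import Iterable, NamedTuple


class LockedWheel(NamedTuple):
    repo_name: str  # Bazel repository name to create
    url: str        # direct download URL of the wheel
    sha256: str     # expected checksum of the file


def render_wheels_bzl(wheels: Iterable[LockedWheel]) -> str:
    """Render a .bzl file whose macro declares one http_file per locked wheel."""
    wheels = list(wheels)
    out = [
        'load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_file")',
        "",
        "def pypi_wheels():",
    ]
    if not wheels:
        out.append("    pass")
    for wheel in wheels:
        filename = wheel.url.rsplit("/", 1)[-1]
        out += [
            "    http_file(",
            f'        name = "{wheel.repo_name}",',
            f'        urls = ["{wheel.url}"],',
            f'        sha256 = "{wheel.sha256}",',
            # Keep the original file name so the platform tags stay visible.
            f'        downloaded_file_path = "{filename}",',
            "    )",
        ]
    return "\n".join(out) + "\n"


# Placeholder entry: the URL and checksum are not real.
example = LockedWheel(
    repo_name="pypi_grpcio_linux_x86_64",
    url="https://example.invalid/grpcio-1.30.0-cp38-cp38-manylinux2010_x86_64.whl",
    sha256="0" * 64,
)
bzl_text = render_wheels_bzl([example])
print(bzl_text)
```

Because the downloads are ordinary repository rules with pinned checksums, nothing is compiled in the loading phase, which is the property asked for above.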

I think it should also be possible to compile python/C++ sources into wheels, however those need to happen in actions so they are debuggable and so that the target platform can be used.

Ping on this @alexeagle and @meastham. #510 is exactly what I need today and having #531 would be a strong nice to have. Any progress here? Thanks!

Hey @gopher-maker,

I'm just blocked on getting some guidance from a repository owner on how to resolve the outstanding issues with #510 (not a complaint btw, I'm sure everybody is quite busy!).

FWIW we've been using it for a fairly large Python codebase without significant problems for about 9 months now, so if you're feeling adventurous you can use it already. It looks like it now needs some non-trivial rebasing work; I'll see if I can get to that this week.

Any leads on this yet? I am running into a similar issue trying to use Mujoco in a bazel workspace.

@f0rmiga has been working on something related to this at a client, I don't have any update sorry.

There's a current effort that @philsc is writing up in a doc, with collaboration from @jvolkman. I wrote a resolver that downloads Python packages using http_file. It's similar to how Gazelle does it for Go third-party deps. I'll start a draft PR in the coming days to have it maintained in rules_python. It will change the current workflow for working with wheels in Bazel quite a bit, and I don't have all the answers yet, so I don't foresee it being in a release soon.

Ah, thanks for the answers folks! I will try to find some other workaround.

pvcnt commented

@f0rmiga Could you please post the link to the PR when it's open in this thread? I'm very much interested in this also!

I am looking to fix this for bzlmod in #1643 at least for the whl downloading part, I'll leave this ticket to track the sdist cross-compilation.

Remaining things to stabilize this feature for bzlmod:

  • Fix #1996.
  • Instead of sha, use the sanitized full wheel name. Potentially lift some code from rules_pycross.
  • Use hub_name as the prefix for the whl_library repos. This means that we reuse the same whl_library for multiple python versions if that is configured.
  • Graduate the following flags from experimental:
    • experimental_index_url -> index_url and default it to https://pypi.org/simple. Users could still set it to an empty string to fall back to the old behaviour.
    • experimental_index_url_overrides. This potentially should go to the pip.override tag class?
    • experimental_extra_index_urls
    • experimental_target_platforms. This should potentially be dropped from the public API and only the whl_library should have it.
  • Move around code doing all of the plumbing for PyPI to //python/private/pypi so that CODEOWNERS could be easily set.
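
As an illustration of the two naming items in the list above (a sanitized full wheel name instead of a sha, prefixed by the hub name), here is one possible shape for such a function. This is a sketch under assumptions, not the scheme rules_python or rules_pycross actually uses; the helper name `whl_repo_name` and the exact sanitization rule are hypothetical.

```python
import re


def whl_repo_name(hub_name: str, wheel_filename: str) -> str:
    """Derive a repository name from the hub name and the full wheel file name."""
    stem = wheel_filename
    if stem.endswith(".whl"):
        stem = stem[: -len(".whl")]
    # Collapse every run of characters outside [a-z0-9] into one underscore.
    sanitized = re.sub(r"[^a-z0-9]+", "_", stem.lower()).strip("_")
    return f"{hub_name}_{sanitized}"


name = whl_repo_name("pip", "grpcio-1.30.0-cp38-cp38-manylinux2010_x86_64.whl")
print(name)
```

Unlike a sha-based name, a name built this way stays readable in error messages and encodes the platform tags directly.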

Out of scope:

  • With #260 done, #1708 could probably be done as well, but I'll keep it out of scope of #260.
  • Supporting pdm, uv, poetry lock files. Whilst the output of parse_requirements function is generic enough to support all of these package managers, supporting them is out of scope.