pypa/wheel

arm64-only package selects `universal2` as the arch tag on macOS 11

mathstuf opened this issue · 23 comments

I have a wheel I'm trying to build with C++ code in it. It contains only arm64 binaries for the non-Python code, so universal2 is the wrong arch tag to use. Is there a way to suppress this and instead use arm64?

This code seems to be poorly structured and is probably the root cause. universal2 is arm64 and x86_64, not "not x86_64" as this code currently does. _mac_binary_formats looks to make a similar assumption.

jcfr commented

@henryiii When you have a chance, I would be curious to get your perspective on this ?

@mathstuf,

To my knowledge, the code you refer to in packaging is the one used to know which platforms tag are acceptable to be installed (i.e. the other way around).

is your CPython a universal2 one ?
If so, the default platform tag will always be universal2 when building a platform specific wheel.
You can export the following environment variable to get an arm64 wheel when building from a universal2 CPython:

export _PYTHON_HOST_PLATFORM="macosx-11.0-arm64"
export ARCHFLAGS="-arch arm64"

Edit: I guess you're not using a "raw" setuptools build system which would have build with ARCHFLAGS="-arch x86_64 -arch arm64" ?

Thanks for getting to this, @mayeut! Still working though a messaging backlog.

Maybe you are using Scikit-Build? I don't think we've worked through the details there yet, ideally it should respect the settings that control setuptools, but for now, maybe the info in pypa/cibuildwheel#582 is helpful?

scikit-build with --plat-name successfully builds macosx-11.0-arm64 wheels for itk.

is your CPython a universal2 one ?

Yes.

If so, the default platform tag will always be universal2 when building a platform specific wheel.

Hrm, that's unfortunate.

You can export the following environment variable to get an arm64 wheel when building from a universal2 CPython:

Is this documented anywhere? I assume _PYTHON_HOST_PLATFORM is all that matters then.

I guess you're not using a "raw" setuptools build system which would have build with ARCHFLAGS="-arch x86_64 -arch arm64" ?

This project (VTK) isn't doing universal builds yet (I mean it could, but the per-wheel size would be quite large by that point).

Maybe you are using Scikit-Build? I don't think we've worked through the details there yet, ideally it should respect the settings that control setuptools, but for now, maybe the info in pypa/cibuildwheel#582 is helpful?

No, it's a CMake build which generates the setup.py that is used to make the wheel (driving the build from setup.py is not maintainable because what gets made depends on the CMake flags being passed in, but only the CMake code really knows this). Basically, the project needs to be configured before setup() can be called, then some information extracted from the configure and then setup can be called. I find that it's just easier to generate the setup.py through the CMake which already knows all the relevant information.

scikit-build with --plat-name successfully builds macosx-11.0-arm64 wheels for itk.

I'd really rather override just the architecture and not have to guess at what the rest of the platform name would be otherwise.

This has come up often enough that I've started implementing #161. The question is, should allow any arbitrary tags or should they be checked against the list of supported tags?

This could land in the next release but I need some kind of feedback!

The question is, should allow any arbitrary tags or should they be checked against the list of supported tags?

@brettcannon or @pfmoore might have opinions here? Hard for me to say without seeing the interface - I'm mostly interested in building wheels with the standard tools (build, for example), so I don't have the option to pass custom flags through to build via setup.py. I'm more interested in #407, which would allow "special" cases to be manually handled. (And I'll work on implementing that in roughly a week or so, my family was sick last week so behind currently).

Scikit-build 0.12 should fully support targeting different platforms, and the competitor example, pybind/cmake_example, should be updated to do so, too.

In that case I will defer my work on #161 and look into implementing #407 instead.

@brettcannon or @pfmoore might have opinions here?

Sorry, I have no opinion here. I'm not a Mac user, personally.

In that case I will defer my work on #161 and look into implementing #407 instead.

For implementing #407, be sure to look at https://github.com/scikit-build/cmake-python-distributions/blob/master/scripts/convert_to_generic_platform_wheel.py and/or https://github.com/scikit-build/ninja-python-distributions/blob/master/scripts/convert_to_generic_platform_wheel.py (which should be basically the same). The interface is different, but the core work is likely similar. If you are going to work on it, let me know - I'll try to ping you when I start working on it, as well - it would be good to have a basic mutex here. :)

Is there a way to detect that the compiled files are ARM64 only? All this mac specific stuff was contributed by others so I'm pretty much in the dark here myself.

Generally, ARCHFLAGS should be set correctly, and if so, I think the wheels come out with the correct names? If you are using CMake, you should either use scikit-build 0.12, which does this correctly, or use something like this if doing it by hand: https://github.com/pybind/cmake_example/blob/f6539807e764707f2ee2b9d7ba0c8d017d685853/setup.py#L93-L97

The same tooling that used to check for FAT binaries should work for the new arch, too, see https://stackoverflow.com/questions/1085137/how-do-i-determine-the-target-architecture-of-static-library-a-on-mac-os-x

My case is that I have libraries built completely outside of setup.py's view because the build system is too complicated to have setuptools drive the build (because it wants to know what packages/modules will exist before the build starts). Instead, the build generates the setup.py and tells it what modules will exist once the build completes. We also do not build universal binaries because the wheels are big enough as is and a universal build would just double it.

In addition (though not in my case), some bits may be x86_64-only as some modules cannot be built for arm64 yet (e.g., due to the lack of a Fortran compiler meaning numpy and scipy are not yet available).

Basically, I've been able to hack it up using _PYTHON_HOST_PLATFORM which I'm not a fan of since it means I have to sync up more than just the arch override there (namely syncing the minimum macOS target version).

This has come up often enough that I've started implementing #161. The question is, should allow any arbitrary tags or should they be checked against the list of supported tags?

I think it depends on the user experience you want. If you want to prevent potential typos, then check. But obviously complete flexibility regardless of what the tools we can all think of expect says let people type whatever.

I would personally check and if it turns out not to be flexible enough then stop doing that.

My case is that I have libraries built completely outside of setup.py's view because the build system is too complicated

This is exactly the sort of thing #407 would be great for, I think.

I'm probably partially off tomorrow then mentoring for a workshop the rest of the week, so I can probably start on #407 next week unless @agronholm has already started.

I haven't started yet; I'm busy with another project currently.

Have to do cross-compilation, hope to deliver by the morning.

macos12 python 3.11 x86_64-only wheel is incorrectly marked as universal2:
https://github.com/auxten/chdb/actions/runs/4606253723/jobs/8139321261

As we can see from the uploaded pypi wheel, the whl only contains an x86_64 so file,
https://files.pythonhosted.org/packages/0c/81/ff3a4ffd2c1efb5fd23d0d7bfd70918db961dd8f0832cd1b47ffb97428d0/chdb-0.3.0-cp311-cp311-macosx_10_15_universal2.whl

and even after set

        setup(
            options={
                'bdist_wheel': {'universal': False},
            },
        )

also didn't work
https://github.com/auxten/chdb/blob/2ba882caa73f4175e9557476b8637b1f471b505f/setup.py#L158

"universal" in bdist_wheel refers to pure Python wheels that support Python 2 and Python 3 at the same time, it's a historical option that has nothing to do with universal2, Apple's way of shipping x86 & arm binaries.

I'd recommend looking at https://github.com/pybind/python_example - you based this on a very old version of this example (including the incorrect statement that Python 3.6 has has_flag, that was never added), long before it supported universal2, etc. I'm wary of the manual get_python_ext_suffix and custom handling you are doing - you are probably grabbing the extension suffix from Python, which is a universal build.

I'd also use setuptools_scm instead of rolling your own git tag system.

I'd also recommend https://github.com/pypa/cibuildwheel if you want to build redistributable wheels, you'll be stuck with CPython's compiled macOS version, for example, if you roll wheel building yourself.

You can retag yourself as a quick fix using the new 0.40 wheel tags command.

thanks @henryiii.
Because chdb is trying to combine clickhouse and pybind, the get_python_ext_suffix is just used to let clickhouse compile system get the correct so file name which is .cpython-311-darwin.so, and it works correctly.
The has_flag and cpp_flag is not used in the setup.py.
I'm using cibuildwheel on macos11. but cibuildwheel caused clickhouse build issue on macos12 and py311, so I dropped it.

It seems the grpc project is also facing similar issue, they have a dirty fix here: https://github.com/grpc/grpc/blob/2845a248d6e1f617bbe0ffcd851cb1818725cda3/tools/run_tests/artifacts/build_artifact_python.sh#L160
and didn't fix the arch info in dist-info/WHEEL

wheel tags --platform-tag=macosx_11_0_arm64 --remove <old_wheel_name>

is a "correct" fix that handles the dist-info. It's a bit of a hack; I think it's because you aren't handling the compiling correctly via setuptools, but it's a correct hack.