scikit-build/scikit-build-core

Allow "wheel.packages" to take a dictionary instead of an array?

bennyrowland opened this issue · 6 comments

My company has a collection of Python packages that are distributed as namespace packages under a single master package name (for reasons, don't ask). Some of those packages are mainly C++ with some Python bindings and I would like to use a source layout that looks something like:

src/
    python/
        python_code.py
        python_bindings.cpp
        CMakeLists.txt
    cpp_code.cpp
    CMakeLists.txt

I can use the current wheel.packages option but then I have to add in folders for the master package and the package name inside the python folder (src/python/$COMPANY_NAME/$PACKAGE_NAME/) which feels like a lot of folders for no very good reason. If we allowed wheel.packages to be a dictionary, then presumably I could specify something like:

[tool.scikit-build.wheel.packages]
$COMPANY_NAME.$PACKAGE_NAME = "src/python"

or inline as

[tool.scikit-build]
wheel.packages = {$COMPANY_NAME.$PACKAGE_NAME = "src/python"}

I haven't really looked at how this part of the code is implemented, so I don't know how hard it would be to make this change. It falls under a "nice to have" rather than "really important" label for me, so I would have a go if it is mainly a matter of adapting the input from the toml file into the internal representation, but if it requires a wholesale rewriting of the internal path machinery then probably not worth it. @henryiii, perhaps you could comment on that?

I don't quite follow why this doesn't work with wheel.packages as a list. Can you clarify what the site-packages folder should look like after the install, and where the Python and binding files sit in the source layout? Could it be that you want wheel.packages to support globbing, like src/python/*/*/? (Also, if $PACKAGE_NAME refers to the Python package, you can stop one layer below it.)

What I want is for my repo to have files like src/python/__init__.py which get installed into site-packages/my_company/my_package/__init__.py, so that I don't have to make the path in my repo be src/python/my_company/my_package/__init__.py (which is how I understand the current version to work).
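The mapping being asked for can be sketched as a small path rewrite. This is only an illustration of the intended behaviour, not scikit-build-core code: the helper `installed_path` and the `PACKAGES` dictionary are hypothetical names, and the dictionary shape is the one proposed in this issue.

```python
from __future__ import annotations

from pathlib import PurePosixPath

# Hypothetical mapping in the shape proposed in this issue:
# dotted package name -> source directory in the repo.
PACKAGES = {"my_company.my_package": "src/python"}


def installed_path(source_file: str, packages: dict[str, str]) -> PurePosixPath | None:
    """Return the path under site-packages for source_file, or None if unmapped."""
    src = PurePosixPath(source_file)
    for package, source_dir in packages.items():
        try:
            relative = src.relative_to(source_dir)
        except ValueError:
            continue  # source_file is not under this source directory
        # "my_company.my_package" becomes my_company/my_package/
        return PurePosixPath(*package.split(".")) / relative
    return None


print(installed_path("src/python/__init__.py", PACKAGES))
# my_company/my_package/__init__.py
```

Files outside `src/python`, such as `src/cpp_code.cpp`, are not mapped and return `None`.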

Ohh, now I understand. It seems quite unintuitive to navigate, and I am not sure whether tools like pytest-cov can handle such a structure, but in principle it should be doable. Are there any other backends that support such a layout that we can use as a reference? I think hatchling does not support it.

This is supported by setuptools (https://setuptools.pypa.io/en/latest/userguide/package_discovery.html#package-discovery-and-namespace-packages) and coverage.py already supports rewriting paths to map between repo copies of files and installed copies (for example for tox/nox environments).

It is not how I would advocate doing things for a traditional Python package, but as I say, in this case where I have what is mainly a C++ repo with some Python bits hanging off the side, I really want the src/python folders to keep the Python component separate and well organised, but adding in the extra folders to replicate the whole package structure feels a bit inelegant, and means longer paths, more typing/clicking to navigate around etc. Not a big deal, but irritating enough that it felt worth raising this issue :-). Note that I am not proposing rearranging anything internally (which would make it difficult to navigate), these two folders (company_name and package_name) would only ever have one subfolder in them, so just chopping them out of the hierarchy shouldn't be too confusing.

I would not use the inline form, as TOML 1.0 doesn't allow multiline inline tables.

I think it's doable, though there are several things that probably need care, like making sure editable installs map correctly. Before jumping in, though, I'd like to know what hatchling supports here - if you could do it via force-include, for example, maybe supporting that would make sense.

Hatchling supports this via force-include, so I think we should add that before considering this.