pypa/wheel

bdist_wheel makes absolute data_files relative to site-packages

agronholm opened this issue · 81 comments

Originally reported by: Marcus Smith (Bitbucket: qwcode, GitHub: qwcode)


bdist_wheel doesn't handle absolute paths in the "data_files" keyword like standard setuptools installs do, which honor it as absolute (which seems to match the examples in the distutils docs)

when using absolute paths, the data ends up in the packaged wheel at the top level, and gets installed relative to site-packages (along with the project packages)

so, bdist_wheel is re-interpreting distutils' "data_files" differently. maybe better for wheel to fail to build projects with absolute data_files, than to just reinterpret it.


Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


I.e. it's either that the wheel spec has to grow to cover absolute data_files (I don't see how it could handle them now; putting them into {distribution}-{version}.data doesn't help because that's relative to sys.prefix), or bdist_wheel just needs to fail to build in that case.

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


btw, relative "data_files" paths are handled as expected and end up in the "*.data" dir in the packaged wheel.

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


I don't think we should allow absolute paths.

Original comment by Daniel Holth (Bitbucket: dholth, GitHub: dholth):


Absolute paths need to be allowed but it may be acceptable to restrict to absolute paths within the sdist.

There's a place in setuptools where certain kinds of paths cause errors and I run into it from time to time. I don't remember the details atm, only that it would be much easier to use if it did allow absolute paths.

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


Why does it have to be allowed? If bdist_wheel and sdist were consistent, that would be one thing, but they're not and can't be at the current time, so it seems wrong for wheels to build absolute paths and then place them into site-packages

Original comment by Daniel Holth (Bitbucket: dholth, GitHub: dholth):


I could be thinking about setuptools' /other/ bug ;-)

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


I don't see any reason why absolute paths have to be allowed. I think they are a bad design in general, everything should be rooted in sys.prefix. It's not a very good thing for a Wheel to be able to override /etc/hosts for instance.

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


btw, there's a metadata issue open for whether wheel would grow the ability to handle platform-specific paths (including absolute I guess) https://bitbucket.org/pypa/pypi-metadata-formats/issue/13/add-a-new-subdirectory-to-allow-wheels-to

for me, this issue isn't about that discussion.

it's about the oddity of placing absolute paths into site-packages

since wheel has no ability to properly place absolute files currently, it shouldn't build projects that declare them

Original comment by Daniel Holth (Bitbucket: dholth, GitHub: dholth):


packagename-1.0.data/data/ is currently a way to place absolute files. This is an accidental feature but I don't have any particular beef with it.

They are absolute relative to the root of the virtualenv :-) Or if no virtualenv is in use, probably /

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


That's not what absolute means, that's a relative path. An absolute file is one that will install to /this/exact/path/even/in/a/virtualenv

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


so take this setup.py which defines an absolute data files at "/opt/data_file": https://gist.github.com/qwcode/9144129
(and assuming there is a "data_file" relative to it)

build an sdist and wheel and then install each, and see where "data_file" goes.

  • for the sdist: /opt/data_file
  • for the wheel: ../site-packages/opt/data_file

on the other hand, relative data files get packaged into *.data/data and get installed relative to sys.prefix
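To make the three cases concrete, here is a rough model of the behavior reported in this thread. It is a sketch, not code from wheel or setuptools; `predicted_location`, the prefix, and the site-packages path are all hypothetical.

```python
import posixpath


def predicted_location(target_dir, filename, prefix, site_packages, from_wheel):
    """Return where a data_files entry lands, per the behavior reported above."""
    if not posixpath.isabs(target_dir):
        # Relative directories are resolved against sys.prefix in both cases.
        return posixpath.join(prefix, target_dir, filename)
    if from_wheel:
        # The wheel drops the leading "/" and installs next to the packages.
        return posixpath.join(site_packages, target_dir.lstrip("/"), filename)
    # A plain setup.py install honors the absolute path as written.
    return posixpath.join(target_dir, filename)


prefix = "/home/user/venv"  # hypothetical sys.prefix
site = prefix + "/lib/python3/site-packages"  # hypothetical site-packages
print(predicted_location("/opt", "data_file", prefix, site, from_wheel=False))
print(predicted_location("/opt", "data_file", prefix, site, from_wheel=True))
print(predicted_location("share/demo", "data_file", prefix, site, from_wheel=True))
```

Only the absolute case differs between the two install methods, which is the inconsistency this issue is about.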

Original comment by Michael Hoglan (Bitbucket: mhoglan, GitHub: mhoglan):


Graphite does a similar thing, not specifically their data files, but the lib files are specified in an absolute location (/opt/graphite/webapp) in the setup.cfg, and it results in the files being under site-packages/opt/graphite/... when you build a wheel and install it in a virtualenv.

When building from source, I would specify --install-options to change those locations to be relative to the virtualenv, but that does not seem possible to pass those options into pip wheel.

Removing the prefix / lib configurations in the setup.cfg causes the wheel and source installs to behave the same (everything ends up in site-packages). Altering the wheel and getting rid of the /opt/graphite/webapp at the top level achieves the same thing (since it would have assumed a prefix of . at install).

btw, I would say it's not good practice for a module to be specifying absolute paths... not virtualenv friendly, and I would hope projects would fix that. I see this as more of having to work with projects that are not defined cleanly. And probably allowing there to be consistency between a src install and a wheel install.

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


> btw, I would say it's not good practice for a module to be specifying absolute paths... not virtualenv friendly, and I would hope projects would fix that.

Agreed!

> I see this as more of having to work with projects that are not defined cleanly.

Well, actually the current problem is to work with package installers and virtualenvs that are defined cleanly!

Problem is that you may be able to put a data file somewhere using setup(data_files=xx) -- but can you determine where it went from your application instance!?

That's the main problem I'm facing with setuptools right now... when using setuptools, all paths for the data_files kwarg are relative to sys.prefix, but when installing in a virtualenv, they're not..

Original comment by Keerthan Jaic (Bitbucket: jck2, GitHub: jck2):


Is there a uniform way to find (relative) package data which works irrespective of whether the package is installed globally or in a venv?

Original comment by Joo Tsao (Bitbucket: nuwa, GitHub: nuwa):


need support setup(data_files=/opt/xxx)

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


For reference, this bug is essentially the same as #120
And since pip 7.0.0 all packages are now wheeled before install, meaning that this bug and #120 are getting prime exposure in several packages.
See pypa/pip#2874

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


@jck2 the simplest way for me is to only use package data effectively stored in a package directory side by side with the python code that needs them and never use data files.
Once you have this, dirname and __file__ will let you navigate to these data file locations relative to your python code location. Since the data is always in the same place relative to the calling code, it no longer matters whether you are installed globally, in a venv, or anywhere else.

As a simple example of this approach:
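A minimal sketch of that approach, not the code originally linked here. It assumes a hypothetical `data/` directory shipped inside the package next to the module; `data_path` is an illustrative name.

```python
import os

# Directory that contains this module; package data lives beside the code.
try:
    HERE = os.path.dirname(os.path.abspath(__file__))
except NameError:  # e.g. when pasted into an interactive session
    HERE = os.getcwd()


def data_path(*parts):
    """Build a path to a file shipped inside the package directory."""
    return os.path.join(HERE, "data", *parts)


print(data_path("example.txt"))
```

The files themselves would still have to be declared as package data so that they get included in the sdist and the wheel.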

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


My work-around for pip 7.0 (because pip automatically creates wheels from sdists) is to include this in setup.py:

import sys
# Fail on purpose when pip tries to build a wheel from this sdist.
if 'bdist_wheel' in sys.argv:
    raise RuntimeError("This setup.py does not support wheels")

Pip will automatically skip the .whl packaging and run the normal sdist installation.

Why on earth the decision was made to have an unfinished packaging system deploy, by default, things that weren't intended for it is beyond me :( People who've made sdist installations, released them, and tested them, can create their .whl files themselves... this new bdist_wheel call prolongs the installation process and creates new unexpected behavior.

Original comment by Benjamin Reedlunn (Bitbucket: breedlun, GitHub: breedlun):


I just released my first python package, and it is affected by this issue. I would like to avoid absolute paths, as suggested, but I do not know the proper way. Can someone give me a hand?

Here is a link to my stack overflow question that goes into more detail.

Original comment by joe_code (Bitbucket: joe_code, GitHub: Unknown):


This is a real problem for me as well.

My setup.py script works as expected with regard to data_files that use an absolute path and honors them when I do 'python setup.py install' however when I do 'python setup.py bdist_wheel' and then pip install my wheel the data_files that I specified with an absolute path and were correctly installed using a straight setup.py install ARE NOT installed correctly from the wheel and wind up relative to site-packages. I.e. site-packages/usr/lib/blah/blah

If I want to install a file outside of site-packages (say to an arbitrary place on the filesystem) I should be able to do that. The behaviour is inconsistent. I'd really like to see this fixed because right now I can't use wheels and that's exactly what I want to use.

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


@joe_code - I can recommend finding a workaround, not using setup.py's setup(). Ultimately, that's what we did, and to be honest and despite my previous harsh rhetoric in this thread, it's nice to get rid of data_files and have a Python project that works inside virtual environments again and can be distributed with Wheel :)

Original comment by joe_code (Bitbucket: joe_code, GitHub: Unknown):


Hey Benjamin, thanks for your reply. Could you elaborate a little bit on your solution please?

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


In our case, we could factor out most of the files in /usr/share and turn them into "package data". The remaining files are now handled by OS installers (for instance Debian packages, pkg for Mac, setup.exe for Windows, etc.).

In case you don't want to create OS installers, you can have a "run first" approach for your application for which you do if not os.path.exists, possibly adding a file with your project's version in. The disadvantage is uninstallation.
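A minimal sketch of such a "run first" step, under assumptions: the marker file name, `ensure_data_installed`, and the directories are all hypothetical, and temporary directories stand in for the real source and target locations.

```python
import os
import shutil
import tempfile

VERSION = "1.0"  # hypothetical project version


def ensure_data_installed(source_dir, target_dir, version=VERSION):
    """Copy bundled data to target_dir unless this version is already there."""
    marker = os.path.join(target_dir, ".installed-version")
    if os.path.exists(marker):
        with open(marker) as fh:
            if fh.read().strip() == version:
                return False  # nothing to do
    os.makedirs(target_dir, exist_ok=True)
    for name in os.listdir(source_dir):
        src_file = os.path.join(source_dir, name)
        if os.path.isfile(src_file):
            shutil.copy2(src_file, target_dir)
    with open(marker, "w") as fh:
        fh.write(version)
    return True


# Demo with temporary directories standing in for the real locations.
src = tempfile.mkdtemp()
dst = os.path.join(tempfile.mkdtemp(), "share", "myapp")
with open(os.path.join(src, "settings.ini"), "w") as fh:
    fh.write("[main]\n")
print(ensure_data_installed(src, dst))  # first run copies the files
print(ensure_data_installed(src, dst))  # second run finds the marker
```

As noted above, nothing here helps with uninstallation: pip does not know about files copied this way.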

Original comment by Erik Bray (Bitbucket: embray, GitHub: embray):


I think it's a real problem that this decision has effectively broken a use case that many packages have relied on--in some cases for bad reasons, but in other cases for good reasons.

Although I personally feel like the reasoning behind the breakage has some merit, breaking things without offering some kind of guidance on how best to handle outside-Python resource files has created yet another sore point against Python packaging that has been raised by some of my colleagues, and it's a valid complaint.

I think the argument "well we shouldn't just allow installing files to arbitrary system locations" is well meaning but ultimately spurious. It's true that, depending on what install_data gets set to, the paths which can be installed to are somewhat limited, making it hard, say, to overwrite /etc/hosts. Yet pip will also happily overwrite executables in /usr/bin, for example, which I think is awful and it shouldn't. So really you're making a security-related argument that falls apart because there's actually no promise of security when installing a wheel system-wide (outside a virtualenv). Meanwhile it's possible to hand-craft wheels with files in the .data directory that can be installed almost anywhere within /usr at the very least.

I think a better approach would be to not make arbitrary decisions for software developers who know what they're doing, and where necessary protect users (and developers who don't know what they're doing) by not allowing pip to overwrite files that already exist on their system (especially for "data files").

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


Allowing absolute paths breaks the isolation of virtual environments.

Original comment by Erik Bray (Bitbucket: embray, GitHub: embray):


So treat absolute paths as relative to the root of a virtualenv when installing in a virtualenv, and don't break their semantics on system installs.

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


@embray but is pip aware of being in a virtualenv at all?

Original comment by Erik Bray (Bitbucket: embray, GitHub: embray):


Yes, it has to be--especially to be able to deal with the nuances between virtualenvs with and without "global site-packages".
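For illustration, a sketch of how an installer could detect a virtualenv and re-root absolute paths as suggested above. This is not pip's implementation; `in_virtualenv` and `resolve_target` are hypothetical helpers, and the check relies on `sys.base_prefix` (venv) and `sys.real_prefix` (older virtualenv releases).

```python
import os
import sys


def in_virtualenv():
    """Best-effort check: venv sets base_prefix, old virtualenv set real_prefix."""
    base = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base != sys.prefix


def resolve_target(path, virtualenv, prefix):
    """Re-root an absolute path under prefix when inside a virtualenv."""
    if os.path.isabs(path) and virtualenv:
        # Strip the drive and leading separators so the path can be re-rooted.
        relative = os.path.splitdrive(path)[1].lstrip("/\\")
        return os.path.join(prefix, relative)
    return path


print(in_virtualenv())
print(resolve_target("/etc/myapp/config.ini", in_virtualenv(), sys.prefix))
```

Outside a virtualenv the absolute path is returned unchanged, which keeps the system-install semantics described above.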

Original comment by Benno Fünfstück (Bitbucket: bennofs, GitHub: bennofs):


Is there a way right now to install some file into site-packages that works with both setuptools and bdist_wheel? For example, if I want to install a native library that is later loaded by my application. Or should I not use site-packages for that?

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


@bennofs Yes there is a way: use package data.

> if I want to install a native library that is later loaded by my application

I use this here for instance: libarchive native shared objects stored in these "package data" dirs:

... are then loaded here by ctypes:

Original comment by gerard56 (Bitbucket: gerard56, GitHub: gerard56):


pombredanne NA:

this seems a bit complex, also
https://docs.python.org/2/distutils/setupscript.html?highlight=package_data
states that for package data files:
'the files are expected to be part of the package in the source directories'
so using a barely documented parameter to work around that seems a not very good thing.
(I can't find a mention of using include_package_data on said page of python.org; however there is a threatening warning in python code that using this parameter badly can lead to an infinite recursion, see setuptools/command/sdist.py ...).
As there is a simpler way (see below) why do it like you advise ?

As for the original problem, the said documentation states for the data_files:

'If directory is a relative path, it is interpreted relative to the installation prefix'
Nothing is said explicitly about what the software should do if the directory is an absolute path...

In fact what it is doing is that really the destination path of data_files is always relative at least when using pip wheel.

Not very intuitively, when a relative path is specified, the end result is installation higher in the file hierarchy

'/mypackage', 'myfile.txt' -> installation under <site/dist package>/mypackage

'/', 'mygoofylibrary.dll' -> installation under <site/dist package> (site if user, dist if admin)

If I understand correctly, this is what Benno Fünfstück was asking for.

'etc/myconfig', 'exemple-config.txt' -> installation under
/opt/local/etc/myconfig (or /usr/local/etc/myconfig or something like that...) if run as admin
.local/etc/myconfig if run as user

Here the python documentation is very misleading since the given example:

setup(...,
      data_files=[('bitmaps', ['bm/b1.gif', 'bm/b2.gif']),
                  ('config', ['cfg/data.cfg']),
                  ('/etc/init.d', ['init-script'])]
     )
will never work: nothing will go into the system /etc/init.d directory.
As has been said here, the file init-script will end up under <site/dist package>/etc/init.d

And this is indeed a very good thing.
If the sysadmin wants to put something in the /etc/init.d directory, the sysadmin should decide it, not having the decision taken by a python non-system package installed blindly. In most cases the /etc directory should stay managed by the computer packaging system.

So is there a consensus on this yet? Are changes required to the wheel codebase?

Don't know about consensus, but as far as I can see, the current wheel spec doesn't mention absolute paths, and offers no way to create wheels that unpack files to locations outside the package tree. The fact that setuptools allows data_files with absolute names to be specified doesn't alter this. Maybe we could warn (or error?) if bdist_wheel is asked to include a datafile with an absolute pathname. But otherwise, I don't see there's anything else to do as things stand.

If we wanted to do anything more then a new spec would be needed. PEP 491 shows how this might look, but as far as I know has never been approved or implemented. To be honest, though, the fact that absolute install locations break usage with virtualenvs, and are inherently platform specific, as well as being basically useless on Windows, means that I don't really have any interest in them personally.

Judging from that, it seems like the best approach is to abort the creation of the wheel entirely with a descriptive error message? How do I detect cases like this?
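One possible way to detect such cases, as a sketch only: walk the `data_files` list and collect every target directory that `os.path.isabs` reports as absolute. `absolute_data_dirs` is a hypothetical helper, not an existing wheel or setuptools function, and the result depends on the platform the build runs on.

```python
import os


def absolute_data_dirs(data_files):
    """Return the target directories in data_files that are absolute paths."""
    found = []
    for entry in data_files or []:
        # distutils also accepts a bare filename string; it has no directory.
        if isinstance(entry, str):
            continue
        directory, _files = entry
        if os.path.isabs(directory):
            found.append(directory)
    return found


example = [
    ("bitmaps", ["bm/b1.gif", "bm/b2.gif"]),
    ("/etc/init.d", ["init-script"]),
]
bad = absolute_data_dirs(example)
if bad:
    print("absolute data_files directories:", bad)
```

A build command could then warn or raise an error when the returned list is not empty.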

I'm concerned about Wheels supporting absolute paths because it affects application deployment.

For example, on *nix systems /etc is generally where configuration files for applications end up. While I don't really like this convention, it is going to confuse users who are used to it.

Also never mind installing such wheels on Windows :/

@qwcode Was your complaint about those files being installed relative to site-packages in the virtualenv, rather than the virtualenv root, or that you wanted them to ignore virtualenv isolation and get installed to their absolute locations?

My complaint was about them being installed via site-packages.

If there's a "proper" way to bundle python applications for deployment on systems, I would love to know. The documentation I've read covers very little on actions such as post-install actions, uninstallation, etc.

As it is, the behavior on how data_files works even for relative directories is different even between source and binary distributions!

Using pip install, all relative paths declared in data_files get installed under sys.prefix; with pip install --user mywheel.whl, for instance, they might end up in ~/.local/shared/<relative/path>. The only way I found is to try loading a well-known data file and retry on error, to find out whether I have to load from sys.prefix or from resource_filename(Requirement.parse("mypackage"), "relative/path")
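A sketch of that kind of probing, assuming the data can only be under `sys.prefix` or the per-user base directory. `candidate_prefixes` and `find_data_file` are hypothetical names, and the temporary directories in the demo merely stand in for the real locations.

```python
import os
import site
import sys
import tempfile


def candidate_prefixes():
    """Prefixes under which pip may have placed relative data_files."""
    bases = [sys.prefix, site.getuserbase()]
    return [base for base in bases if base]


def find_data_file(relative_path, bases=None):
    """Return the first existing copy of relative_path, or None."""
    for base in bases if bases is not None else candidate_prefixes():
        candidate = os.path.join(base, relative_path)
        if os.path.exists(candidate):
            return candidate
    return None


# Demo: two temporary directories stand in for sys.prefix and the user base.
system_prefix = tempfile.mkdtemp()
user_base = tempfile.mkdtemp()
os.makedirs(os.path.join(user_base, "share", "mypackage"))
wanted = os.path.join("share", "mypackage", "data.txt")
open(os.path.join(user_base, wanted), "w").close()
print(find_data_file(wanted, [system_prefix, user_base]))
```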

Sorry if this is semi off topic, but is there any way to intentionally get data_files installed into site-packages?

In my virtualenv for instance it's lib/python3.5/site-packages. I expect that in the future I might end up with python3.6 so I shouldn't hard code lib/python3.5 as the output directory for data_files.

I ended up on this bug looking for a way to achieve installation of data files in lib/python<version>/site-packages. It seems that if I just prefix all my output paths with / they show up where I would like. Since this is marked as a bug, it sounds like maybe I shouldn't count on this "feature" being available in the future.

Is there a better way to get them installed in site-packages, if this is not the correct way?

Hi. I use this ugly code. It is really a pity not to have a simple method to get data in one shot. You "register" data/* in your setup.py; to retrieve it you should be able to do something like get_my_pkg_data("data/myfile"), plus all the variations to support getting a file descriptor or a directory listing, ... and it would work no matter whether the package is not installed (during development) or installed with pip install or pip install --user

@cheshirekow, @gsemet Have you considered the importlib.resources module, or for Python < 3.7, the setuptools resource manager API?

> @cheshirekow, @gsemet Have you considered the importlib.resources module, or for Python < 3.7, the setuptools resource manager API?

Well what I'm trying to do is include with the package some extension modules (and some library dependencies) that are built by an external build system... so what I wanted was to get the extension modules somewhere on the pythonpath and their dependencies within a subdirectory of that location (so that RPATH works).

I don't think importlib.resources or setuptools resource manager can help here.

I could use package_data but then I have to either pollute the source tree, or copy everything into a "staging" area both of which I would prefer to avoid.

Please look at my link in my previous comment. Of course I use pkg_resource and it still does not do the job. I always see « use pkg_resource » but never a real code that actually works in real life.

> Well what I'm trying to do is include with the package some extension modules (and some library dependencies) that are built by an external build system... so what I wanted was to get the extension modules somewhere on the pythonpath and their dependencies within a subdirectory of that location

This sounds like something milksnake does. Surely the packages under site-packages are on the PYTHONPATH, yes? What am I missing?

> Please look at my link in my previous comment. Of course I use pkg_resource and it still does not do the job. I always see « use pkg_resource » but never a real code that actually works in real life.

I still don't understand what the problem is that you're having. Do you not put your data inside your python packages? Because then they're certainly retrievable using pkg_resources.

Yes during execution. But they are not during development (ex: pip install -e . in a virtual env)

> Yes during execution. But they are not during development (ex: pip install -e . in a virtual env)

Why not? It should work exactly the same way regardless of where the package is located, so long as it's on the PYTHONPATH.

Can you look at the code in Guake I had to write? If you find a way to get the image working in the dev environment, in the PyPI package (pip install), AND in the distribution installation (make install), I would be glad to accept your merge request!

> Surely the packages under site-packages are on the PYTHONPATH, yes? What am I missing?

That's correct, it is on the python path, and that is why I want my .so files to show up there. However, the only way I can tell to get them there is to use / as the directory in data_files. And this bug report suggests that this is unintended behavior. For me it is useful behavior, but if it is unintended I should probably not rely on it right?

> However, the only way I can tell to get them there is to use / as the directory in data_files

Do you mean that if you install a wheel as root without a virtualenv, then you want your data files to be installed to /lib etc., directly under the root directory that is?

At any rate, is somebody here unsatisfied with how wheel packages data_files in a .whl file? Do you feel that it goes against the spec?

@agronholm The problem is not how wheel does it, the problem is that behavior differs between wheel and setuptools installation methods.

This is also why what @cheshirekow is doing is a bad idea: try installing your package not using wheels (pip will do this if the wheel package is not installed), and it'll fail: when not using wheels, a path like /xxxxx will install directly into the root directory, but when using wheels, it installs it into the site-packages dir.

@agronholm I believe reliably installing into site-packages is not easily possible. There are two hacky solutions I can think of, but no guarantee that they work all the time:

  • use the / variant for wheels (detecting wheel build in setup.py is also only possible with hacks, such as inspecting sys.argv) and use absolute paths for non-wheel builds
  • try to generate the correct relative path such that sys.prefix + rel_path ends up in site-packages. I am not sure if that's even possible in all cases, not sure what configurations python allows (for example: could you configure python such that its libdir is in /lib/python... while the prefix is /usr?)
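The second idea could be sketched roughly as follows, using `sysconfig` to ask the running interpreter where pure-Python packages go. `site_packages_relative_to_prefix` is a hypothetical helper, and the result only describes the interpreter that builds the package, which may differ from the one that installs it.

```python
import os
import sys
import sysconfig


def site_packages_relative_to_prefix():
    """Return site-packages expressed relative to sys.prefix, or None."""
    purelib = sysconfig.get_path("purelib")
    try:
        return os.path.relpath(purelib, sys.prefix)
    except ValueError:
        # On Windows the two paths can live on different drives.
        return None


rel = site_packages_relative_to_prefix()
print(rel)
```

If the library directory is outside the prefix, the result contains `..` components, so it is worth checking the value before passing it to data_files.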

In any case, here's a few things I think you have to watch out for when testing your solution:

  • different python versions
  • wheel vs no-wheel
  • some distributions handle site-packages differently (for example, compare Arch vs Ubuntu. I think Ubuntu uses dist-packages?)

I tried to do this once in https://github.com/bennofs/capstone/blob/23fe9f36622573c747e2bab6119ff245437bf276/bindings/python/setup.py, but this was too long ago so I can no longer say with confidence that this is the right approach.

I am thinking about making a getpkgdata library that would actually work in all cases. It only needs to detect if the current version is develop environment (pip install -e) or in a distribution package install.

> The problem is not how wheel does it, the problem is that behavior differs between wheel and setuptools installation methods.

Regardless, this was opened as an issue on the wheel bug tracker. Therefore I need to know what I have to do in terms of modifying the wheel code in order to close this issue.

The crux of the problem is, IMHO, the fact that the wheel PEP speaks so vaguely about the .data directory and how it's supposed to work.

For reference: https://www.python.org/dev/peps/pep-0427/#the-data-directory

The reported problem goes all the way back to the invention of virtualenv and eggs. It is possible that wheel breaks it further, but the problem is that the "data" category (*.data/data/**) is installed in different places depending on how the installer is invoked, and we don't store which path that actually was when we install your program.
The way to fix it would be to allow custom folder names in .data/ with interpolated target paths, and a record of where those members were installed that a pkg_resources-type function could access.
Then the packager says "I would like to define /etc/hosts" but if you are installing into a virtualenv the file actually winds up in $virtualenv/etc/hosts or anywhere else depending on how the installer was invoked.
Several people have been interested in this kind of solution but so far not enough to actually write the PEP.

The gist of it seems to be that the behavior of .data is not documented anywhere and (some) people don't like how it works in practice. Correct?

Sounds about right - expanding on https://www.python.org/dev/peps/pep-0427/#the-data-directory, initially to capture current practice, and ultimately to document what people want to happen, in a way that gives installers enough detail to implement unambiguously, would be good.

https://www.python.org/dev/peps/pep-0491 made a start in that direction, but I don't believe it was ever finalised or approved - @dholth is that correct?

So IIUC, the data directory in wheels has never worked in a useful way, and even if it did work it wouldn't be any more useful than putting the data files into the package directory. Also, data-in-package-dir is what pkg_resources and similar tools have standardized on.

If that's right then I would deprecate and remove data, not try to fix it. There's nothing here to fix.

Where would you put things like man pages then?

If you need to package things like man pages, make a proper distribution package. IMHO those have no place in a Python package.

I agree. But I do think we need to clarify (somewhere) what does constitute a "proper Python package". People are using the packaging toolset for all sorts of things that often go beyond the description of "a basic Python library" (for example, pip itself is not a library, rather it's a command line application, but it's distributed as a wheel).

My informal view on what constitutes a "valid Python package" is:

  1. Must support installation in any Python environment, including the system Python, user-site, and a virtualenv.
  2. Must not need installation of any files in absolute locations, or OS-specific locations. It's perfectly OK to look for such files at runtime, but don't ship files that should be pre-installed.
  3. Must not require integration with OS services (e.g. installation as a system service, integration with system documentation services like manpages, registration with system package managers or in a system registry, ...)

If you fail any of these criteria (or can't work out some compromise of your own, like asking users to run a post-install script manually) then the Python packaging toolset isn't what you want. Of course, many people will use it anyway because the alternatives aren't that straightforward. But they might have to find their own fixes for things we don't support.

I'm happy if that's not actually what we choose to take as a definition - for example, if someone wanted to extend the wheel spec to support (in a suitably cross-platform way) absolute paths and/or locations for things like manpages, then I'd be fine with that. But until that happens, the above is my rule of thumb on what's in scope and supported for packaging.

> 2. Must not need installation of any files in absolute locations,

What about relative location, but not inside of site-package? I.e we need to install something into $VIRTUAL_ENV/xxx. What's the best way here? Still come up with the install script for user to run manually? I need to support installation into virtual environment, I don't need to support any other installation method, because of the package specifics.

@sashkab: Assuming that would work at installation time, what would the code to be able to use those resources at runtime look like?

The status quo discourages Python as an application development language, and I think that is a shame. setup.py didn't start out as a library management tool, but it became that during the decade when web development was the most important domain for Python, and no one noticed the broken setup.py features. If we respect the authors, packages should be able to contain anything that their authors want to put in. The strict separation between packaging and installation that we get from wheel also gives the person who uses that package complete control of how it gets installed.
I think in many cases it is more likely that the prospective Python application developer, writing a mostly-cross-platform application, will choose a different programming language whose dominant packaging tool better supports applications rather than try to make a distribution-specific package.
@pfmoore is correct that without further work a "valid Python package" has those enumerated properties.

> 1. Must support installation in any Python environment, including the system Python, user-site, and a virtualenv.

I just want to note that the other 2 requirements pretty much follow from this one. If we want to support man pages, the way to do that is to extend the definition of a "Python environment" to include a man-pages directory, which would require figuring out what that means in all of these cases.

Wheels are a high-level representation of a Python package, abstracted over the specific details of the installation environment. If you want a high-level representation of an arbitrary application, that's just a different thing, and wheels are not well-suited to that problem. There are many other tools that are designed to solve that problem, like rpms, debs, MacOS/Windows application installers, etc.

If you have data you want to access at runtime, then we already have a standard and well-supported solution (it even works for packages installed in zips!): https://docs.python.org/3.7/library/importlib.html#module-importlib.resources
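A self-contained sketch of reading package data through `importlib.resources`. It builds a throwaway package in a temporary directory so that it can run anywhere; `demo_pkg` and `read_package_text` are hypothetical names, and `resources.files()` requires Python 3.9 or later.

```python
import os
import sys
import tempfile
from importlib import resources  # resources.files() needs Python 3.9+

# Build a throwaway package on disk so the example is self-contained.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")  # hypothetical package name
os.makedirs(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "greeting.txt"), "w", encoding="utf-8") as fh:
    fh.write("hello from package data\n")
sys.path.insert(0, root)


def read_package_text(package, name):
    """Read a text resource that ships inside an importable package."""
    return resources.files(package).joinpath(name).read_text(encoding="utf-8")


print(read_package_text("demo_pkg", "greeting.txt"))
```

The lookup goes through the import system, so the same call works no matter where the package was installed.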

I've suggested that if a wheel contained a package-1.0.data/docs/ directory, the installer could place those files into e.g. $virtualenv/share/docs/$packagename-$packageversion by default. Imagine that plus a few more categories.
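As a rough illustration of that suggestion (no such mechanism exists in the wheel spec today), an installer could keep a table of category templates and expand them per package. `CATEGORY_TEMPLATES` and `category_target` are hypothetical, and only the "docs" entry comes from the comment above.

```python
import posixpath

# Hypothetical category table; only "docs" comes from the suggestion above.
CATEGORY_TEMPLATES = {
    "docs": "{prefix}/share/docs/{name}-{version}",
}


def category_target(category, prefix, name, version):
    """Return the default install directory for a .data/<category>/ folder."""
    template = CATEGORY_TEMPLATES[category]
    return posixpath.normpath(
        template.format(prefix=prefix, name=name, version=version)
    )


print(category_target("docs", "/home/user/venv", "package", "1.0"))
```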

Indeed. If someone wanted to flesh out that proposal, put it into the form of a PEP/standard and get it approved and then implemented in the various tools, then that would probably cover a lot of the use cases I've seen mentioned in the past. Of course, no-one has yet volunteered to champion the suggestion. It really needs someone with an actual stake in the issue to step up, or it's going to forever sit behind other priorities.

> So IIUC, the data directory in wheels has never worked in a useful way,

Wrong! The data directory (and data_files in setup.py) is useful in several ways. For example, it can be used to install Jupyter files such as Jupyter kernel specs or Jupyter notebook extensions (example). And I see nothing wrong with installing man pages or documentation using data.

> What about relative location, but not inside of site-package?

That's exactly the use case that data_files solves.

> it wouldn't be any more useful than putting the data files into the package directory. Also, data-in-package-dir is what pkg_resources and similar tools have standardized on.

You are confusing two different use cases for package_data and data_files.

package_data is useful for data files used by the package itself (or possibly other Python tools looking there).

data_files on the other hand is useful for data files used by other software (which may not even be written in Python).
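To illustrate the distinction, a sketch of the keyword arguments a setup.py might pass to `setuptools.setup()`. All project, package, and file names are hypothetical; the `share/jupyter/kernels/...` target follows the usual Jupyter kernel spec location under the prefix.

```python
# Keyword arguments for setuptools.setup(); all names here are hypothetical.
SETUP_KWARGS = {
    "name": "demo-kernel",
    "version": "1.0",
    "packages": ["demo_kernel"],
    # Read by the package itself, installed inside site-packages/demo_kernel/.
    "package_data": {"demo_kernel": ["templates/*.txt"]},
    # Read by other software, installed relative to the prefix (sys.prefix).
    "data_files": [
        ("share/jupyter/kernels/demo", ["kernelspec/kernel.json"]),
    ],
}

if __name__ == "__main__":
    # In a real setup.py this would be: setuptools.setup(**SETUP_KWARGS)
    for directory, files in SETUP_KWARGS["data_files"]:
        print(directory, files)
```

Both kinds of paths are relative, so neither runs into the absolute-path problem discussed in this issue.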

what's this other software, that has nothing to do with Python, but it understands about Python environment layouts, including the data directory that even most Python software doesn't understand, but that doesn't know how to find package_data?

> understands about Python environment layouts

"environments" are not specific to Python at all. Most open source software packages have a concept of installation prefix, analogous to sys.prefix. Conda for example installs everything (Python packages but also other packages) in a common prefix.

> what's this other software

Jupyter packages are a good example. While many Jupyter kernels are written using Python, that is not a requirement: it is possible to implement the Jupyter protocol without Python. So they decided to use data_files for that, which makes it work the same way for Python packages and non-Python packages.

And the man pages example is also a good one (even though I personally don't know any Python package which installs a man page).

The consensus (?) seems to be that this needs a new standard and that wheel itself is currently not doing anything wrong. If someone wants this to be reopened, be specific about what changes are required for the wheel project. Otherwise a new issue could be opened when a new standard emerges that requires implementation here.