Find modules should mainly require <Pkg>_DIR to be set
schuhschuh opened this issue · 9 comments
When the CMake find_package
command runs in CONFIG mode, it adds a cache variable named <Pkg>_DIR
which points to the location of the package configuration file. This works of course only for packages that provide such CMake configuration file.
Same should be the case for custom Find modules. Usually it would be sufficient if a user sets this variable to the prefix where the package is installed. All other paths should be derived from it and when cached, be declared as advanced variables unless a file/directory was not found. As soon as it is set, the variable can be hidden (again) by making it advanced. This reduces the number of non-advanced variables a user is required to set for all dependencies to be properly discovered. It also leaves a more common "experience" for a user, no matter if a custom Find module is used or not. All they have to do is make sure that the "_DIR" variables are set properly.
Environment variables such as MATLABDIR
(commonly set on systems where MATLAB is installed) or MATLAB_DIR
could also be considered to initialise the <Pkg>_DIR
variables.
Here's my comment to #33:
The purpose is to make use of a good practice in how to write Find package modules and to reduce the number of variables and standardise the experience for a user for all packages. As CMake is very flexible and provides little guidelines on how to write a good Find package module, many people have done it differently.
I claim that many Find modules you will find are not very user friendly for those not proficient with the internals/use of CMake. Even CMake's Find modules and find_package itself lack in this respect.
Why do I have to set a bunch of variables for each project (e.g., include path AND location of each single library), just because I installed the package in a non-custom directory ? Why is it not enough to just tell CMake where my installation is located instead ? Next, when I want to change from one installed version to another installed in a different location, why do I have to change all those variables manually. Why is it not enough to change the location of the installation ?
To standardise the finding of dependencies was part of the CMake BASIS project. In my experience with the research software we published, this has made it easier for non-CMake experts to build our code.
For every dependency, I want the user to only have to specify the installation directory via a <Pkg>_DIR
variable. All other CMake or the custom Find modules should be able to derive from it. Only if that fails, the users needs to modify the advanced settings. Also, unlike CMake's default behavior, users often fail to realise they have to specify the path to the package Config file, which indeed is much less intuitive then specifying the installation prefix (top-level directory such as /usr/local
). This can be solved by setting some CMake variables properly. Done by CMake BASIS.
Using a <Pkg>_DIR
variable as main setting is in line with CMake's find_package CONFIG mode. It should be transparent to the user whether a dependency is discovered via Config file or Find module.
Eigen now ships with a CMake config file
I have mentioned before that this doesn't help because I don't even want to configure and install Eigen. It is a header-only library. I doubt the source tree of Eigen already contains a usable CMake package Config file.
I am not convinced that shadowing the behaviour of standard CMake find_package is a good solution.
I don't think it has anything to do with shadowing. It is about having users not to be familiar with CMake and it's find_package command behavior. And as mentioned above, with a simple variable setting, the <Pkg>_DIR
variable can be set to the installation directory.
Maybe it would be good to take advantage of this refactoring period to use Basis in the IRTK build process.
@schuhschuh do you think you could give us a crashcourse on that / help with the migration? I guess you're already using it in your fork/branches?
I have started to work on this with Stefan in his GitLab repository, but it has never reached a point where all my development could take place. I haven't done it myself because it is only worth the effort if there is commitment towards using CMake BASIS for the official IRTK version.
As from what I see, it would have a few advantages:
- Collection of (additional) Find modules commonly used in our communities that can be extended as part of CMake BASIS and try to address herein mentioned shortcomings of CMake.
- Save use of glob expressions for discovering target source files.
- Support for project modularisation (i.e., the IRTK Packages[/Modules]).
- Support for IRTK Package template instantiation for a quick start of writing a new package.
- Support for project documentation/manual generation (html + PDF) from ReStructuredText.
- Support of MATLAB Compiler and MATLAB MEX-file targets for (experimental) IRTK Packages.
And, features I would more likely take time to finally realise when actually used:
- Transparent super-build of dependencies if software not found on the users system
- Slim down of CMake BASIS such that a project can use only the CMake modules e.g. via Git submodule integration if other components are not needed/desired
Another feature of CMake BASIS which may be controversial and need not be adopted:
- Ability to write pipeline scripts which call IRTK tools using the build target name instead of the actual executable name (which could differ between Unix and Windows). It also does not require a setting of the PATH environment variable which helps to avoid conflicts between installed versions of IRTK.
@jopasserat @schuhschuh I am happy to discuss usage of BASIS at some point but I feel we are drifting away from the initial issue that Andreas is raising, which is to use <Pkg>_DIR
consistently in our find modules.
My answer is no for the following reasons:
- The variable
<Pkg>_DIR
is used by config-mode and is expected to point to the location of the cmake config files. Changing its behaviour for non config-mode is counter-intuitive. Worst it is confusing if the library in question eventually decide to ship a cmake config file from one version to another (like with Eigen 3.2 -> 3.3).
As CMake is very flexible and provides little guidelines on how to write a good Find package module, many people have done it differently.
- And I don't intend to maintain a diff for our external files due to a short-coming of CMake. Plus, if you can install a library in a non-standard location, there is a high chance you can figure out how to pass that location to the relevant cmake find package. Not worth maintaining a diff with the upstream files for that matter.
Why is it not enough to just tell CMake where my installation is located instead ?
- Because the location is subject to interpretation. A header-only library like Eigen? header+libs like NifTICLib? header + optional libs like Boost? Plus the range of different prefixes for the header and libs locations...
I can understand why CMake did not bother trying to formalize this, they can't go on a quest to fix the install scripts of the whole dev community. And neither would I. pkgconfig
was supposed to be the solution to this and its adoption is still low.
I apologize for the dry tone but I really feel we are bikeshedding here.
Because the location is subject to interpretation. A header-only library like Eigen? header+libs like NifTICLib? header + optional libs like Boost? Plus the range of different prefixes for the header and libs locations...
You're missing the point. It is not necessary that all packages follow the same installation scheme. It is about the Find modules written specifically for a package that should encapsulate that knowledge.
The most common location every software has in common is the installation prefix (i.e. top-level directory of the installation).
The variable _DIR is used by config-mode [only]...
At least for CMake before version 3, this definitely was not the case. Many of the Find modules used such variable themselves.
...and is expected to point to the location of the cmake config files. Changing its behaviour for non config-mode is counter-intuitive.
I couldn't disagree more. First of all, you assume that a user knows that such variable is only meant to be a path where a Config.cmake file is located. A user shouldn't know such thing, which isn't even true. CMake has no such mechanism of requiring that a variable of such name is only used like this.
I apologize for the dry tone but I really feel we are bikeshedding here.
Sure, one can consider it petty details. But it's the petty details for something as a project structure and build configuration how easy and reliable it is to build it from source code for non-software developers. IMHO the important aspect of packaging (the refactored) IRTK was it's build configuration (i.e., CMake stuff). I know it's not the most exciting for a software developer, but it can make a real difference for the user.
Look at the current setup. When I configure IRTK, I am overwhelmed with dozens of CMake variables in my CMake GUI, hiding all the actually relevant things. Instead of a single variable for each package that is intuitive (not counter-intuitive as you say). Why would you think as somebody not using CMake for your own development that "IRTK_DIR" is supposed to point to "/usr/local/lib/cmake/IRTK" instead of "/usr/local", the CMAKE_INSTALL_PREFIX I had used to install it ? Why do I, as a user that tries to install some software have to know where each dependency hides it's CMake package Config file ? Or let's say I install software in /opt/<pkg>-<version>
which is in accordance with the Linux FHS. Then wouldn't it be more intuitive to set <Pkg>_DIR
to /opt/<pkg>-<version>
? I don't even want to know where the <Pkg>Config.cmake
file is located within this directory tree. This I consider knowledge only the developer of that package or the software that uses it needs to know.
The variable _DIR is used by config-mode and is expected to point to the location of the cmake config files. Changing its behaviour for non config-mode is counter-intuitive.
That this statement is inexact and CMake itself is not consistent with respect to this can be easily seen from the following text you can look up in the documentation of find_package:
The set of installation prefixes is constructed using the following steps. [...]
1. [...]
2. Search paths specified in cmake-specific environment variables. [...]
<package>_DIR
[...]
From this you should conclude that CMake interprets an environment variable such as IRTK_DIR
to be the installation prefix (e.g., /opt/irtk-2.0.0
) of the IRTK package, not the location of it's CMake Config file (e.g., /opt/irtk-2.0.0/lib/cmake/IRTK
).
In CMake BASIS, the use is consistent and intuitive for non-CMake users. Still, those familiar with CMake can as well point <Pkg>_DIR
directly to the location of the package Config file.
You have me lost complete attention to your points on the following occasions:
Sure, one can consider it petty details.
My point, priority should be put elsewhere.
A user shouldn't know such thing.
He or she still has to know where this variable should point to regardless of the naming convention you are discussing.
I know it's not the most exciting for a software developer
I am not sure what this implies and I am not certain I want to know. Moving on.
When I configure IRTK, I am overwhelmed with dozens of CMake variables in my CMake GUI
So this is really what this issue is all about ? The number of variables is not pleasing to your eyes ?
And last but definitely not least the piece of documentation you decided to partially quote. This whole section is meant to highlight usage of inhibition variables to let the user tweak the search paths for CMake config files / find modules:
The set of installation prefixes is constructed using the following steps. If NO_DEFAULT_PATH is specified all NO_* options are enabled.
An then each step highlighting which specific variable should be enabled:
1. Search paths specified in cmake-specific cache variables. These are intended to be used on the command line with a -DVAR=value. This can be skipped if NO_CMAKE_PATH is passed:
[...]
2. Search paths specified in cmake-specific environment variables. These are intended to be set in the user’s shell configuration. This can be skipped if NO_CMAKE_ENVIRONMENT_PATH is passed:
[...]
The relevant piece of documentation you should be quoting is at the beginning and is the following:
Config mode attempts to locate a configuration file provided by the package to be found. A cache entry called <package>_DIR is created to hold the directory containing the file.
I have no further interest in discussing this issue and would appreciate if contribution efforts were directed elsewhere. Thanks.
So this is really what this issue is all about ? The number of variables is not pleasing to your eyes ?
Don't come like this. It's not about me or how pleasing I find it. Keep the user in mind. I have made some experience with users with publicly available research software as well as other researchers in two labs now. I am certain they would greatly appreciate if we would take such issue more serious.
And last but definitely not least the piece of documentation you decided to partially quote.
Of course I post only the relevant part, to show you that <Pkg>_DIR
variables are not exclusive to point to the location of a Config file. I've shown this and that's all I wanted.
if contribution efforts were directed elsewhere
Fine. I might else well direct them towards my own fork.