pfultz2/cget

Recursive GitHub clones on packages?

thoughton opened this issue · 7 comments

Hi,

I'm trying to create/install a package from a GitHub repo (that I do not own). The repo already uses CMake as its build system, so that's a positive start at least.

However, it seems the repo also uses git submodules, and therefore expects to have been cloned using the --recursive flag so that its submodule dependencies are also cloned.

Is there any way of telling cget to do a full recursive clone of the repo, instead of (I believe) downloading the zip archive (which does not include any submodules)?

If not, do you have any suggestions on an alternative approach I could try?

Thanks!

Cget never calls git clone; it just downloads the source archive. Here are some things you can do:

  1. Use the actual release distribution instead. For example, Boost requires git submodules, but you can download the complete source as part of the release, here. A library should provide source tarballs as part of the release; if it doesn't, you should file a bug report.

  2. Sometimes libraries use submodules as a poor form of dependency management. If this is the case, the library should look for the dependencies when the submodules are missing, which means you can have cget install the dependencies. If it does not, you should file a bug report.

Finally, if neither of those is an option, you can clone or download the missing pieces in cmake. You can create a cmake file (build.cmake) which does something like this:

# Clone the submodules manually
execute_process(COMMAND git clone -b master git@github.com:pfultz2/cget.git ${CMAKE_CURRENT_SOURCE_DIR}/cget)

# Include the original cmake file
include(${CGET_CMAKE_ORIGINAL_SOURCE_FILE})

And install the library with cget install -X build.cmake <url>. Or you can create a recipe to do this, see here for a simple example.
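
For illustration, a somewhat more defensive variant of the build.cmake above might look like the sketch below. The repository URL, the third_party/dep path, and the assumption that the dependency has a CMakeLists.txt at its root are all placeholders, not taken from any real project; CGET_CMAKE_ORIGINAL_SOURCE_FILE is the variable from the snippet above. It uses an https URL so no SSH key is needed, does a shallow clone, and stops the configure step if the clone fails.

```cmake
# build.cmake: fetch a missing submodule, then hand off to the project's own cmake file.
# DEP_URL and DEP_DIR are hypothetical placeholders; substitute the real submodule URL and path.
set(DEP_URL https://github.com/example/dep.git)
set(DEP_DIR ${CMAKE_CURRENT_SOURCE_DIR}/third_party/dep)

# Source archives usually leave the submodule directory empty, so test for a file
# inside it (assumed here to be CMakeLists.txt) rather than for the directory itself.
if(NOT EXISTS ${DEP_DIR}/CMakeLists.txt)
    execute_process(
        COMMAND git clone --depth 1 ${DEP_URL} ${DEP_DIR}
        RESULT_VARIABLE clone_result)
    if(NOT clone_result EQUAL 0)
        message(FATAL_ERROR "Cloning ${DEP_URL} failed: ${clone_result}")
    endif()
endif()

# Include the original cmake file
include(${CGET_CMAKE_ORIGINAL_SOURCE_FILE})
```

It would be installed the same way, with cget install -X build.cmake <url>.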

Hey, thanks for the info!

In this case the include(${CGET_CMAKE_ORIGINAL_SOURCE_FILE}) line was the missing part of the puzzle that let me solve my issue.

I've created a pull-request on the cget-recipes repo for my new recipe here:
pfultz2/cget-recipes#3

What about adding a switch that tells cget to git clone from github instead of downloading their stupid tarballs that don't bundle submodules?

What about adding a switch that tells cget to git clone from github

No, not directly, as cloning is much slower, and when cloning recursively there isn't a simple way to shallow clone the submodules. I will probably add a way to use git URLs (like git://github.com/brunocodutra/metal.git) in the future.

instead of downloading their stupid tarballs that don't bundle submodules?

An open-source project should provide source tarballs for its releases; they don't necessarily need to come from github.

Furthermore, submodules are usually used as a poor form of dependency management, which requires the user to download the dependency multiple times when it is used in multiple projects. This is the case when the user installs both metal and alloy. Instead, cmake-utils should be made installable and found by find_package(cmake-utils) (that is how BCM works). When it's not found, the build can fall back on the submodule. This way cget can install cmake-utils once, and it can be used by both metal and alloy.
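
A rough sketch of that lookup-then-fallback pattern is below. It assumes cmake-utils installs a package config that find_package can locate, and that the submodule lives in a cmake-utils directory containing a file called utils.cmake; both the path and the file name are hypothetical.

```cmake
# Prefer an installed copy of the helpers (for example, one installed by cget).
# QUIET suppresses the warning when no installed copy is found.
find_package(cmake-utils QUIET)

if(NOT cmake-utils_FOUND)
    # Fall back on the submodule checkout. UTILS_DIR and utils.cmake are
    # hypothetical placeholders for wherever the helpers actually live.
    set(UTILS_DIR ${CMAKE_CURRENT_SOURCE_DIR}/cmake-utils)
    if(NOT EXISTS ${UTILS_DIR}/utils.cmake)
        message(FATAL_ERROR
            "cmake-utils is neither installed nor checked out as a submodule")
    endif()
    include(${UTILS_DIR}/utils.cmake)
endif()
```

With this in place, a plain source archive builds as long as cmake-utils has been installed first, and a recursive clone keeps working as before.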

No, not directly, as cloning is much slower, and when cloning recursively there isn't a simple way to shallow clone the submodules.

Right, but the user gets what the user asks for.
To be clear, I propose this to be an explicit command line switch, not an implicit fallback.

An open-source project should provide source tarballs for its releases; they don't necessarily need to come from github.

For standalone header-only libraries I argue that there is no clear definition of a source release; indeed, for Metal it could be argued that metal.hpp is a source release. In fact, the recipes for Metal and Alloy that I propose in pfultz2/cget-recipes#6 and pfultz2/cget-recipes#7 express exactly this.

Furthermore, submodules are usually used as a poor form of dependency management

I agree in general, but you can also think of cmake utilities as a private implementation detail, not a public dependency. For instance, the repository I set up as a submodule only contains undocumented helpers that work for me and I don't want to ship them publicly to users. I also don't want to maintain multiple copies of it as part of different projects, so I do think in this case submodules are fine and github ought to better support them.

Do you happen to know whether one can omit the tarballs automatically generated by Github on releases? I'd love to provide my own that bundles the submodule, but it hardly does any good to upload them besides the ones already listed there.

To be clear, I propose this to be an explicit command line switch, not an implicit fallback.

Right, and supporting git:// URLs will cover this.

For standalone header-only libraries I argue that there is no clear definition of a source release; indeed, for Metal it could be argued that metal.hpp is a source release.

It could be, but it's also nice for libraries (even header-only ones) to provide find_package(Metal) so they can easily be consumed by other cmake users (which previous versions of metal provided).
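
As a minimal sketch of what that could look like for a header-only library (not necessarily how Metal itself does or did it), the following assumes the headers live under an include/ directory; the target, export, and install path names are illustrative choices.

```cmake
cmake_minimum_required(VERSION 3.10)
project(Metal)

# A header-only library has nothing to compile, so an INTERFACE target
# carries only its usage requirements (here, the include directory).
add_library(Metal INTERFACE)
target_include_directories(Metal INTERFACE
    $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/include>
    $<INSTALL_INTERFACE:include>)

# Install the headers and record the target in an export set.
install(DIRECTORY include/ DESTINATION include)
install(TARGETS Metal EXPORT MetalTargets)

# With no dependencies of its own, the exported targets file can double as
# the package config file that find_package(Metal) looks for.
install(EXPORT MetalTargets
    FILE MetalConfig.cmake
    NAMESPACE Metal::
    DESTINATION lib/cmake/Metal)
```

A consumer would then call find_package(Metal REQUIRED) and link against Metal::Metal with target_link_libraries, without needing to know where the headers were installed.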

I agree in general, but you can also think of cmake utilities as a private implementation detail, not a public dependency.

It's a build-only dependency, which can be expressed with cget in a requirements.txt file with --build.

I also don't want to maintain multiple copies of it as part of different projects, so I do think in this case submodules

It's still a dependency even when using submodules. It's just that now the user has to download this dependency multiple times, instead of using a dependency tool to manage it.

Do you happen to know whether one can omit the tarballs automatically generated by Github on releases?

Hmm, I don't know. What happens if you try to upload a file with the same name?

I'd love to provide my own that bundles the submodule, but it hardly does any good to upload them besides the ones already listed there.

Yeah, but having them available, even next to github's broken tarballs, is better than not having them at all.

Thanks for your insight; in fact, your feedback made me rethink this whole submodule nonsense!

It turns out I figured out a way to achieve exactly what I needed, that is, to share cmake implementation details between several projects without the need for submodules, by using subtrees instead. It's great because users need never know anything about the shared implementation details and, because of that, integration with github and hence with cget works out of the box.