koreader/koreader-base

koreader cannot build without network access (seeking feedback for PR)

amjoseph-nixpkgs opened this issue · 20 comments

Summary

I believe that koreader does not provide any way to compile it without network access (if this is incorrect please let me know and I will close this issue). I would like to submit a PR which adds this ability.

Details

I spent quite a bit of time last night trying to modify the build process. It doesn't seem like there is any clean or non-invasive way to do this, so before putting a huge amount of time into this I wanted to check with folks here to find out if people have strong feelings on how this should be done, suggestions, ideas, or religious objections to the idea of even offering this capability.

First Attempt (Not preferred)

To get an idea of the direction I was headed, see here and here. CMake seems to make this incredibly difficult! You'd think that since it offers the ability to auto-download external projects, it would also offer a flag or environment variable that says "just do the download and nothing else" without having to edit CMakeList.txts, but that doesn't seem to be the case.

My strategy there was to add a DOWNLOAD_ONLY environment variable which causes the build process to run normally but skip over all the non-file-downloading parts, and another environment variable ALREADY_DOWNLOADED that (this part not implemented) gets plumbed in to CMake's rudimentary -DEP_UPDATE_DISCONNECTED:BOOL=ON feature -- which unfortunately still requires a huge number of CMakeList.txt edits. The whole approach ended up seeming pretty fragile, since it basically runs the build process twice while trying to trick cmake into skipping certain steps during each of the two runs.

I don't use CMake much, so if I've missed some obvious way to do this please let me know.

Second Attempt (Preferred / Proposed)

After all this I began wondering if it would be better all around to move the downloading out of CMake and just use simple curl invocations from the Makefile, and change the ExternalProject_Add(DOWNLOAD_COMMAND) to something like cat already-downloaded/tarball.tgz. At this point I would prefer to take this approach, but since moving the downloading out of CMake and into a Makefile is a pretty major change I figured I should check here first to see if that approach is acceptable.

Motivation

The koreader package in nixpkgs currently downloads binaries rather than building from source, something which is strongly frowned upon. I would like to have it build from source.

For many very good reasons, some of which involve security, nix blocks any kind of network access from build processes. Packaging must separate fetchers from builders. A builder has no network access, whereas a fetcher (i.e. a FOD) must know the exact hash of its entire result before it begins executing.

In theory we could write fetchers for each one of the 50 projects which are vendored into koreader/base/thirdparty, but something like that would require constant maintenance in nixpkgs and probably break with each koreader release. So, in the nixpkgs community, it is generally preferred to politely ask upstream if one of us (i.e. me) could write a PR to separate the downloading step from the compiling step, and have the PR merged.

Warning: nix is highly addictive. If you haven't learned it yet, please don't start because of me. Once you truly understand it you will find yourself unable to manage software any other way. Turn back now.

CMake is a major PITA, so, err, first of all: good luck ;).

Second, what's the usual approach taken in Nix for stuff relying on CMake ExternalProject?

Third: keep in mind we require an even crappier version of CMake because old Debian (might be time to bump it a bit, can't recall where we discussed this last, but, IIRC, the cut off was still something fairly sucky).

Second, what's the usual approach taken in Nix for stuff relying on CMake ExternalProject?

Out of >80,000 packages there are only 13 uses of ExternalProject_Add() which need explicit intervention.

Three cases are in arcan, which is a good example of what we would have to do to koreader... except instead of three patches and fetchers we're talking about 50 patches and fetchers expressions. I would face strong pressure to work with upstream instead, which is why I came here first.

No other package uses a non-optional downloading ExternalProject_Add() more than once. In the other 10 cases the external project is for some optional feature that upstream didn't offer a way to disable, so we carry a patch which disables the feature.

I would say that using ExternalProject_Add() for non-optional features is very, very unusual. I believe that Debian will reject packages that vendor dependencies like that.

CMake is a major PITA, so, err, first of all: good luck ;).

Yeah, that is part of why I have never learned much about how to use it.

keep in mind we require an even crappier version of CMake because old Debian

You mean that your "minimum supported CMake version" is suckily-low, not that you have a "maximum supported CMake version", right?

At this point I would prefer to take this approach, but since moving the downloading out of CMake and into a Makefile is a pretty major change I figured I should check here first to see if that approach is acceptable.

Heed extremely well why it was moved out of the Makefile to CMake, i.e., changing something in CMakeLists.txt now leads to everything being updated as desired. This works (reasonably) well. Changing to curl doesn't sound like it would.

Wrt CMake, it's currently 3.5, that will become 3.13 or 3.16 in the not too overly distant future but given that I have to move don't count on it too soon.

PS That's 3.5 as in Ubuntu 18.04, Debian Stretch would be 3.7.

Three cases are in arcan, which is a good example of what we would have to do to koreader... except instead of three patches and fetchers we're talking about 50 patches and fetchers expressions. I would face strong pressure to work with upstream instead, which is why I came here first.

You can also replace ExternalProject with something custom. I already did as an experiment but figured I wouldn't pursue it further because since 3.11 it's better.

For more background, see #353

Anyway, I was now in a position to quickly browse through the docs and I think https://cmake.org/cmake/help/latest/module/FetchContent.html should probably do what you want? With stuff like FetchContent_Declare/FetchContent_Populate/FetchContent_MakeAvailable… I don't know exactly, see the docs. Cf. what I wrote in #851, There's a decent chance I intend to switch over to FetchContent regardless.

NB That means merging will be off limits for the time being. But a proper solution that requires a minimum of CMake 3.11 even though it can't be merged right now is preferable to a hackfest, I'd say.

PS If FetchContent_MakeAvailable is the answer, that means the minimum required CMake will become 3.16 (i.e., I see no point in 3.14 if we're going beyond 3.13 as a minimum).

Anyway, I was now in a position to quickly browse through the docs and I think https://cmake.org/cmake/help/latest/module/FetchContent.html should probably do what you want?

Yes, that would be perfect, so long as FETCHCONTENT_FULLY_DISCONNECTED works. Then we can download the tarball and point koreader's build scripts at it using -DFETCHCONTENT_SOURCE_DIR_EXT_PACKAGENAME:STRING=.

Sounds good. Do you have time to put together a poc for a single one to see if it works as desired?

Fyi, minimum's just been upped from 3.5 to 3.10, 3.16 coming soon.

Edit: actually @pazos do you know which version is used by F-Droid?

pazos commented

No idea.

Sounds good. Do you have time to put together a poc

Well, my first attempt at getting CMake to do this (see above) was a major struggle which ended in defeat. If this has to be done inside of CMake and I will be the one writing the PR then I need to invest some time in learning CMake. I am leaving town tomorrow but can look into this when I return on Monday.

for a single one to see if it works as desired?

If I do one I will do them all and submit a PR. 95% of the effort is in getting the first one to work correctly.

minimum's just been upped from 3.5 to 3.10, 3.16 coming soon.

I interpret this to mean that it is okay to use CMake 3.10 features immediately, and that CMake 3.16 features may be used although that might slightly delay the integration of the change. Please let me know if I have misunderstood.

If I do one I will do them all and submit a PR. 95% of the effort is in getting the first one to work correctly.

I meant primarily to avoid wasting time (regardless whose) to make sure it does exactly what we all want. :-)

I interpret this to mean that it is okay to use CMake 3.10 features immediately, and that CMake 3.16 features may be used although that might slightly delay the integration of the change. Please let me know if I have misunderstood.

No, using 3.10 features is definitely not okay. ;-) Go for 3.16. The features we need require at least 3.11, probably at least 3.14. If I find the time I'll have 3.16 rolled out tonight or tomorrow, otherwise it might take a little longer but either way coming soon.

After much struggling and fighting with cmake, I have arrived at three conclusions:

  1. I am not as smart as the people who wrote cmake.
  2. I am not as smart as the people who wrote koreader's build scripts.
  3. Koreader's build scripts already have the ability to build without network access! _ko_write_gitclone_script won't attempt to access the network if the vendored dependency's git repository is already cloned, with the correct commit or tag, in the expected location prior to the start of the build. This is all we need for nixpkgs. Therefore, no changes to koreader are required!

Koreader's build scripts already have the ability to build without network access!

I misunderstood the (nix) problem, because, well, yeah, that's make fetchthirdparty (ahem, as mentioned right at the top of the readme ;-P).

that's make fetchthirdparty

No, you misunderstand. We download the files beforehand and put them in the right place. make fetchthirdparty is never run (or, if it is run, it does nothing).

Here's what it looks like: NixOS/nixpkgs#178320

No, you misunderstand.

Downloading and building are two separate actions conceptually. Whether you reimplement one of those steps from scratch is otherwise immaterial. I thought there was something about make fetchthirdparty that didn't behave exactly as you wanted due to some obscure ExternalProject issue. If I'd known you wanted to reimplement it from scratch I'd have said to just do that; that's all.