Rough estimate - how much disk space will the static server need?
DilumAluthge opened this issue · 16 comments
If I wanted to use the static server option (i.e. just store everything on disk) for the entire General registry, how much disk space would I need, ballpark estimate?
Currently about 16G after “the great purge”. Before the great purge it was about 35G.
That’s a lot less than I would have guessed.
Does that include BinaryBuilder binaries? E.g. if I Pkg.add("MbedTLS"), that includes the MbedTLS binaries that get downloaded during the build step?
Yes, it includes both packages and all artifacts referenced by them. It is compressed though.
It does not include anything that deps/build.jl files download, however. Only artifacts.
Ahhh.
Packages that use BinaryBuilder/BinaryProvider download the binary tarballs during the build step, inside deps/build.jl, right?
So that’s going to be a problem for me. I’d really like to be able to use PkgServer to serve up everything.
Here’s my use case.
I have a server with no Internet connection whatsoever.
However, there is a mechanism for transferring files onto that server.
So my idea was to generate a static PkgServer (on an Internet-connected machine). Then I would transfer all of those static files into my no-Internet server. Then I would run PkgServer on the no-Internet server using those static files. And then I can install Julia packages on my no-Internet server simply by pointing my Julia at my PkgServer.
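A minimal sketch of the last step, pointing a Julia session at a locally running PkgServer. It assumes a Julia version whose Pkg honors the JULIA_PKG_SERVER environment variable (1.4 or later, as far as I know); the address below is a hypothetical placeholder, not something from this thread.

```julia
# Hypothetical address of the PkgServer running on the no-Internet machine.
const OFFLINE_PKG_SERVER = "http://localhost:8000"

# Pkg reads this environment variable to decide which package server to query.
ENV["JULIA_PKG_SERVER"] = OFFLINE_PKG_SERVER

# With the server up and reachable, installation then proceeds as usual:
# using Pkg; Pkg.add("MbedTLS")
```

The same variable can also be exported in the shell before starting Julia, which avoids touching ENV inside the session.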
FWIW, JuliaTeam has the ability to do this. For open source Julia, I think the way forward has to be encouraging everyone to use Artifacts.
I guess this is my confusion: What is the difference between Artifacts and "anything that deps/build.jl files download"?
Several people (@giordano, @KristofferC, @staticfloat) tried to explain it to me on Slack, but I am still confused.
Because Slack is ephemeral, I'll post these helpful explanations by Elliot and Mose:
deps/build.jl files are completely freeform Julia code; it is very difficult to figure out ahead of time what they're going to download, or even what they're going to do. They can straight up call HTTP.download() and do whatever they want. They can download three different files, combine them into a fourth file, and save that on disk.
Artifacts are declared within a TOML file, which is very easy to parse and understand. Determining what artifacts are associated with a package is very easy, and Pkg can scrape this information without running any package code.
BP uses deps/build.jl
BinaryProvider uses a half-way point, a fairly rigid build.jl file. Artifacts do not use build.jl files (although they can be used by build.jl files); they use Artifacts.toml files.
Julia 1.3 has a Pkg that can read and use these Artifacts.toml files.
Read this for more detail: https://julialang.github.io/Pkg.jl/dev/artifacts/
BP will be dead in the future
BinaryProvider has been more-or-less eaten by Pkg; its capabilities are built in, so moving forward everything generated by BB will be installed by Pkg.
this read might help: https://github.com/JuliaLang/www.julialang.org/blob/43f5244c36cc8ec6e1728c697cf5de652b41e8fd/blog/_posts/2019-08-01-artifacts.md
When you say "everything generated by BB will be installed by Pkg", you mean Artifacts?
Yes, artifacts alongside JLL packages
Basically, your binaries and the wrapper code around them, in the form of autogenerated Julia packages, get installed by Pkg at Pkg.add() time, thereby eliminating the need for a deps/build.jl file at all, in most cases.
Woo-hoo. And presumably, PkgServer.jl will be able to serve both Artifacts and JLL packages.
Yes, that's correct. PkgServer is built to allow for installation of JLL packages (they are just normal Julia packages, after all) and the Artifacts that are associated with those Julia packages.
Right, so in summary, we're moving towards making everything that a package needs to download completely declarative via Artifacts.toml files. For most packages this should eliminate all need for deps/build.jl because BB makes it possible to create pre-built platform-specific binaries and use them without any compilation step. The client side of this system is that Pkg can select the right variant of an artifact based on platform details of the client system, but all of the variants are already built; it's just a matter of downloading the right one when the package is installed.
Even if some packages do still need a deps/build.jl step after this, they shouldn't need to download anything during the build step since that can be taken care of by the artifacts step, which runs before the build step. So for a package using artifacts, you can figure out in advance (before running any package code) everything that the package needs to download in order to work correctly.
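To make the "figure out in advance" point concrete, here is a rough sketch that lists every download URL declared in an Artifacts.toml without running any package code. It assumes the TOML standard library (Julia 1.6 or later) and the usual Artifacts.toml layout, where an artifact is either a single table or an array of per-platform tables. The artifact names, hashes, and URLs are made up for illustration, and artifact_urls is a hypothetical helper, not part of Pkg.

```julia
using TOML  # standard library in Julia 1.6 and later

# Hypothetical Artifacts.toml contents; names, hashes, and URLs are placeholders.
const ARTIFACTS_TOML = """
[example_data]
git-tree-sha1 = "0000000000000000000000000000000000000000"

    [[example_data.download]]
    url = "https://example.com/example_data.tar.gz"
    sha256 = "0000000000000000000000000000000000000000000000000000000000000000"

[[example_lib]]
arch = "x86_64"
os = "linux"
git-tree-sha1 = "1111111111111111111111111111111111111111"

    [[example_lib.download]]
    url = "https://example.com/example_lib.x86_64-linux.tar.gz"
    sha256 = "1111111111111111111111111111111111111111111111111111111111111111"

[[example_lib]]
arch = "x86_64"
os = "macos"
git-tree-sha1 = "2222222222222222222222222222222222222222"

    [[example_lib.download]]
    url = "https://example.com/example_lib.x86_64-macos.tar.gz"
    sha256 = "2222222222222222222222222222222222222222222222222222222222222222"
"""

# Collect every download URL declared in the given Artifacts.toml text.
function artifact_urls(toml_text::AbstractString)
    urls = String[]
    for (_, entry) in TOML.parse(toml_text)
        # A platform-independent artifact is one table; a platform-specific
        # artifact is an array of tables, one per platform variant.
        variants = entry isa AbstractVector ? entry : [entry]
        for variant in variants
            for dl in get(variant, "download", [])
                push!(urls, dl["url"])
            end
        end
    end
    return sort!(urls)  # sorted so the result does not depend on Dict ordering
end

urls = artifact_urls(ARTIFACTS_TOML)
foreach(println, urls)
```

A mirroring tool could walk every package in a registry this way and fetch all variants up front, which is the property that deps/build.jl downloads cannot offer.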
we're moving towards making everything that a package needs to download completely declarative via Artifacts.toml files
I'm not sure this is entirely true. The PR for Gtk generates a MutableArtifacts.toml file after running gdk-pixbuf-loaders-cache locally on the user's machine.
I haven't really discussed the mutable artifacts design with @staticfloat but my impression is that they are used to refer to locally generated artifacts, not pre-generated ones that are downloaded.
Yes, they're generated locally by gdk-pixbuf-loaders-cache. I think I overlooked the "needs to download" bit above.
Yep; that MutableArtifacts.toml thing is a workaround until we have better support for "caches" of data, as briefly described in JuliaLang/Pkg.jl#796 (comment)
Can I close this? There doesn't seem to be anything further actionable here.