[FR] Parallelization of downloads
yurikoles opened this issue · 9 comments
Hi,
First thanks for such a great tool!
brew uses curl to download artifacts. That is fine for bottles and formulas, since they are small and are downloaded from the Bintray CDN. But it is not optimal for casks, which are much larger and are downloaded from third-party servers, which are very slow, especially with a single download thread.
I often interrupt slow large downloads and download manually with aria2c using 16 threads, and I end up repeating this many times. Introducing custom downloaders like aria2c, or a standard download-command scheme as in Arch Linux or Gentoo, would be a huge pain in brew or HBC. This question has already been raised a few times, even by me. But could we at least parallelize downloads in this project?
My suggestion is an option that tells cask-upgrade to download all binaries up front, simultaneously, using the same standard curl method, and then proceed to installation once the downloads have finished. Is this possible?
Hey @yurikoles, thanks for raising this.
Currently we use "standard" brew commands to download and install all the casks. Downloading them in parallel would require bypassing that and downloading them before the cask install command is executed. The only "problem" is where to put the downloaded binary so that the cask install command picks it up.
But right now it seems doable to me.
Just a note here: unfortunately we can't run brew cask install commands in parallel (some installs need interactivity, the command output has to be handled somehow, sometimes a password is needed, etc.), so the only way to speed things up here is to "pre-download" all the binaries before running the commands.
brew cask fetch to the rescue! I tried running it in parallel in two terminal tabs and found no conflicts.
So the first easy approach may be just to launch this command in parallel for each cask. But I'm concerned about terminal output: it may overlap, so we may want to hide it, which would also hide the progress. :(
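A minimal sketch of that first approach, assuming plain cask tokens (no slashes) so each one can double as a log file name. The names prefetch_all and FETCH_CMD are hypothetical, introduced only for illustration. FETCH_CMD defaults to the brew cask fetch command mentioned above; newer Homebrew releases spell it brew fetch --cask. Each fetch writes to its own log file, so parallel output cannot overlap, at the cost of hiding live progress:

```shell
# Hypothetical override hook; defaults to the command named in this thread.
FETCH_CMD=${FETCH_CMD:-"brew cask fetch"}

# prefetch_all LOG_DIR CASK...
# Starts one background fetch per cask, then waits for all of them.
# The body runs in a subshell so its helper variables stay local.
prefetch_all() (
  log_dir=$1
  shift
  mkdir -p "$log_dir" || exit 1

  started=""
  for cask in "$@"; do
    # Per-cask log: progress output from parallel jobs cannot interleave.
    $FETCH_CMD "$cask" >"$log_dir/$cask.log" 2>&1 &
    started="$started $!:$cask"   # remember "pid:cask" for the wait loop
  done

  failed=0
  for job in $started; do
    if ! wait "${job%%:*}"; then
      echo "fetch failed: ${job#*:} (see $log_dir/${job#*:}.log)" >&2
      failed=1
    fi
  done
  exit "$failed"
)
```

Something like prefetch_all /tmp/cu-logs firefox vlc would then return non-zero if any fetch failed. The install commands can still run one at a time afterwards, so password prompts and other interactive steps keep working.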
Progress monitoring could be implemented as the combined size of all cask files downloaded so far divided by the total size, which we can get by invoking curl for every cask URL like this:
curl --head --silent --location "$URL" | grep -i "content-length:" | tr -d " \t" | cut -d ':' -f 2
It just needs to be adjusted to take the last line in case of redirect(s).
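One way the redirect adjustment and the progress ratio might look, as a sketch rather than a tested integration. The function names last_content_length and percent_done are hypothetical. The first reads header output on stdin and keeps only the last Content-Length it sees, which is the final response after any redirects; the second turns downloaded and total byte counts into an integer percentage:

```shell
# Reads HTTP response headers (possibly several responses, one per redirect)
# on stdin and prints the Content-Length of the last response that had one.
last_content_length() {
  tr -d '\r' |
    awk -F': *' 'tolower($1) == "content-length" { len = $2 }
                 END { if (len != "") print len }'
}

# percent_done DOWNLOADED_BYTES TOTAL_BYTES
# Prints an integer percentage; prints 0 when the total is unknown (0).
percent_done() {
  if [ "$2" -gt 0 ]; then
    echo $(( $1 * 100 / $2 ))
  else
    echo 0
  fi
}
```

Usage would be along the lines of curl --head --silent --location "$URL" | last_content_length. Some servers omit Content-Length on HEAD requests, in which case nothing is printed and that cask's size has to be treated as unknown.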
brew cask fetch to the rescue!
@yurikoles nice, didn't know about that one. I would say as a first version we could simply remove the progress to see if that works.
I also noticed that there are multiple download strategies, chosen based on the type of the particular cask, so we might "sneak in" aria2c as a new strategy and use it if needed.
But I haven't looked into it too closely, so I'm not 100% sure that's possible without major hacks.
I think this would be better optimized in core Homebrew and not in cu?
That could be. However, we could add our own "downloader" even here, but it's not trivial. I looked into it a while back but didn't have much time to explore it further.
That definitely needs to be implemented by upstream and not in this tap.
For now, I use a one-liner to prefetch artifacts before upgrade:
brew outdated | cut -d' ' -f1 | xargs -P32 -I% brew fetch %
brew outdated lists formulas and casks that have a new version.
cut -d' ' -f1 extracts formula / cask names.
xargs -P32 -I% brew fetch % runs up to 32 parallel fetches with those names.