Fetching complete repo for single image fileset is inefficient
ivan-section-io opened this issue · 2 comments
During server provisioning we see an extra delay of about 1 minute while ~1GB of git data is downloaded from https://images.packet.net/packethost/packet-images.git
As this repo grows, the slowdown will keep getting worse. (This all happens before git checkout, so it excludes the final large-file image download via LFS/caching.)
Contributing to this:
a) This repo contains a lot beyond the final images that servers boot from (idk, maybe build tools?)
b) It fetches all branches (could be a single ref)
c) It fetches all history (could be shallow)
Historically, fetching by commit (uploadpack.allowReachableSHA1InWant) was not well supported. It is now (including on GitHub, I believe), and a shallow single-commit fetch is much quicker. The deploy script could always try a direct commit fetch first, and fall back to fetching all branches if the git service doesn't support it.
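A minimal sketch of that try-then-fall-back idea, assuming the asset directory is already an initialised git repo with an "origin" remote. The function name fetch_image and its arguments are hypothetical and not taken from the OSIE script:

```shell
# Hypothetical helper (not part of OSIE): fetch only the commit or ref that
# is needed, and fall back to a full fetch if the server refuses.
# Usage: fetch_image <repo-dir> <commit-sha-or-ref>
fetch_image() {
    repo_dir=$1
    ref=$2

    # First attempt: shallow fetch of just the requested commit or ref.
    if git -C "$repo_dir" fetch --quiet --depth 1 origin "$ref"; then
        return 0
    fi

    # The direct fetch failed (for example, the server does not allow
    # fetching by commit ID), so fall back to the existing fetch-all.
    echo "direct fetch of $ref failed, fetching all refs" >&2
    git -C "$repo_dir" fetch --quiet origin
}
```

When the ref is a full commit SHA, either path leaves that commit in the local object store, so a later `git -C "$assetdir" checkout "$image_tag"` should behave the same regardless of which fetch succeeded.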
I'm not sure of the exact OSIE script running at the moment, but I'm assuming it's close to:
Line 160 in 1b5ea0e
Example (run from Packet SYD2):
gituri=https://github.com/packethost/packet-images.git
image_tag=82dfba29f7aa462651c2e96521ed24bcad726330
# Assumption: $assetdir is an initialised git repo whose "origin" remote is $gituri
# Existing fetch-all
time git -C "$assetdir" fetch origin
# Receiving objects: 100% (91877/91877), 889.05 MiB | 19.89 MiB/s, done.
# real 0m51.687s
# user 0m23.772s
# sys 0m5.012s
# Shallow fetch of just the image commit
time git -C "$assetdir" fetch --depth 1 origin "${image_tag}"
# remote: Total 9 (delta 0), reused 6 (delta 0), pack-reused 0
# real 0m2.982s
# user 0m0.080s
# sys 0m0.012s
Ticket reference NYDE-2114-IUHD
This appears to have been addressed by 159a68b
(Noticed it in DFW2 during a boot today)