tinkerbell/osie

Fetching complete repo for single image fileset is inefficient

ivan-section-io opened this issue · 2 comments

During server provisioning we see an extra delay of ~1 minute while ~1 GB of git data is downloaded from https://images.packet.net/packethost/packet-images.git

As this repo grows, the slowdown gets worse. (This all happens before git checkout, so it excludes the final download of the large image files via LFS/caching.)

Contributing to this:
a) This repo has a lot of stuff beyond the final images that servers boot from (idk, maybe build tools?)
b) It fetches all branches (could be single ref)
c) It fetches all history (could be shallow)

Historically, fetch by commit (uploadpack.allowReachableSHA1InWant) was not well supported. It is now (including on GitHub, I believe), and a shallow single-commit fetch is much quicker. (The deploy script could always try a direct commit fetch first, and fall back to fetching all branches if the git service doesn't support it.)
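The try-then-fall-back idea above could look roughly like the sketch below. This is only an illustration, not the actual OSIE code: `fetch_image_commit` is a hypothetical helper name, and it assumes `$assetdir` is an existing git work tree with an `origin` remote and that `$image_tag` is a full commit SHA.

```shell
#!/bin/sh
# Hypothetical helper (not taken from OSIE): fetch a single image commit,
# falling back to a full fetch when the server refuses fetch-by-commit.
#   $1: path to an existing git work tree with an "origin" remote
#   $2: full commit SHA of the image to deploy
# Prints "shallow" or "full" to say which path was taken.
fetch_image_commit() {
    assetdir=$1
    image_tag=$2

    # Fast path: request exactly one commit with no history.
    # This fails if the server does not allow fetching by SHA
    # (see uploadpack.allowReachableSHA1InWant).
    if git -C "$assetdir" fetch --depth 1 origin "$image_tag" >/dev/null 2>&1; then
        echo "shallow"
        return 0
    fi

    # Slow path: fetch all branches and history, as the existing script
    # is assumed to do.
    git -C "$assetdir" fetch origin >/dev/null 2>&1 || return 1

    # Make sure the requested commit actually arrived.
    git -C "$assetdir" cat-file -e "${image_tag}^{commit}" || return 1
    echo "full"
}
```

Called as `fetch_image_commit "$assetdir" "$image_tag"`, this keeps the old behaviour as a safety net, so a git service without fetch-by-commit support is just slow rather than broken.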

I'm not sure of the exact OSIE script running at the moment, but I'm assuming it's close to:

git -C $assetdir fetch origin

Example (Run from Packet SYD2)

gituri=https://github.com/packethost/packet-images.git
image_tag=82dfba29f7aa462651c2e96521ed24bcad726330

#Existing fetch-all
time git -C $assetdir fetch origin
#Receiving objects: 100% (91877/91877), 889.05 MiB | 19.89 MiB/s, done.
#real 0m51.687s
#user 0m23.772s
#sys 0m5.012s

#imageid fetch
time git -C $assetdir fetch --depth 1 origin "${image_tag}"
# remote: Total 9 (delta 0), reused 6 (delta 0), pack-reused 0
#real 0m2.982s
#user 0m0.080s
#sys 0m0.012s

Ticket reference NYDE-2114-IUHD

This appears to have been addressed by 159a68b

(Noticed it in DFW2 during a boot today)

Yes, this is fixed, so I will close this issue for now. We are also working on #2, which will make OSIE downloads faster.