Check size of downloadable images to warn users they may run out of disk space
razr opened this issue · 7 comments
I tried to run the Nvidia Orin image.
It turned out there is not enough space inside the docker container to unzip it.
-rw-rw-r-- 1 user user 11156299681 Mar 20 04:51 jp511-orin-nano-sd-card-image.zip
-rw-r--r-- 1 user user 22144876544 May 4 15:18 sd-blob.img
I removed Android and dotnet as follows:
- name: Increase free space
  # Remove Android and dotnet
  run: |
    sudo rm -rf /usr/local/lib/android
    sudo rm -rf /usr/share/dotnet
    df -h
After that, unzip works, but mounting the image fails:
Created loopback device /dev/loop3
/dev/loop3: gpt partitions 2 3 4 5 6 7 8 9 10 11 12 13 14 1
mount: /home/actions/temp/arm-runner/mnt: wrong fs type, bad option, bad superblock on /dev/loop3p2, missing codepage or helper program, or other error.
~/tmp$ fdisk -l sd-blob.img
Disk sd-blob.img: 20.62 GiB, 22144876544 bytes, 43251712 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: E078F7E2-0752-4DF8-8B74-6A5CD5C5AE8B
Device Start End Sectors Size Type
sd-blob.img1 3057664 43233279 40175616 19.2G Linux filesystem
sd-blob.img2 2048 264191 262144 128M Linux filesystem
sd-blob.img3 264192 265727 1536 768K Linux filesystem
sd-blob.img4 266240 331007 64768 31.6M Linux filesystem
sd-blob.img5 331776 593919 262144 128M Linux filesystem
sd-blob.img6 593920 595455 1536 768K Linux filesystem
sd-blob.img7 595968 660735 64768 31.6M Linux filesystem
sd-blob.img8 661504 825343 163840 80M Linux filesystem
sd-blob.img9 825344 826367 1024 512K Linux filesystem
sd-blob.img10 827392 958463 131072 64M EFI System
sd-blob.img11 958464 1122303 163840 80M Linux filesystem
sd-blob.img12 1122304 1123327 1024 512K Linux filesystem
sd-blob.img13 1124352 1255423 131072 64M Linux filesystem
sd-blob.img14 1255424 3056639 1801216 879.5M Linux filesystem
It works if I mount it manually:
sudo mount -v -o offset=$((512 * 3057664)) -t ext4 sd-blob.img ~/tmp/arm-runner/
/tmp$ ls arm-runner
bin boot dev etc home lib lost+found media mnt opt proc README.txt root run sbin snap srv sys tmp usr var
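The offset passed to mount above is the partition's start sector multiplied by the sector size, both taken from the fdisk listing. A minimal sketch of that arithmetic follows; the helper name partition_offset is hypothetical and not part of the action.

```shell
# Hypothetical helper: byte offset of a partition inside a disk image.
# $1 = sector size in bytes, $2 = start sector (both as printed by fdisk -l).
partition_offset() {
    echo $(( $1 * $2 ))
}

# Example with the values from the fdisk output above (partition 1):
#   sudo mount -o offset="$(partition_offset 512 3057664)" -t ext4 sd-blob.img ~/tmp/arm-runner/
```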
Sorry, I missed that I should add rootpartition: 1. With that change it works:
+ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.5 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.5 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
+ uname -a
Linux fv-az216-153 5.15.0-1036-azure #43-Ubuntu SMP Wed Mar 29 16:11:05 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
There is an error at the end:
Zero-filling unused blocks on boot filesystem...
Zero-filling unused blocks on root filesystem...
Resizing root filesystem to minimal size.
e2fsck: No such file or directory while trying to open /dev/loop3p1
Possibly non-existent device?
/dev/loop3p11
resize2fs 1.46.5 (30-Dec-2021)
open: No such file or directory while opening /dev/loop3p1
/dev/loop3p11
Usage: tune2fs [-c max_mounts_count] [-e errors_behavior] [-f] [-g group]
[-i interval[d|m|w]] [-j] [-J journal_options] [-l]
[-m reserved_blocks_percent] [-o [^]mount_options[,...]]
[-r reserved_blocks_count] [-u user] [-C mount_count]
[-L volume_label] [-M last_mounted_dir]
[-O [^]feature[,...]] [-Q quota_options]
[-E extended-option[,...]] [-T last_check_time] [-U UUID]
[-I new_inode_size] [-z undo_file] device
Usage: tune2fs [-c max_mounts_count] [-e errors_behavior] [-f] [-g group]
[-i interval[d|m|w]] [-j] [-J journal_options] [-l]
[-m reserved_blocks_percent] [-o [^]mount_options[,...]]
[-r reserved_blocks_count] [-u user] [-C mount_count]
[-L volume_label] [-M last_mounted_dir]
[-O [^]feature[,...]] [-Q quota_options]
[-E extended-option[,...]] [-T last_check_time] [-U UUID]
[-I new_inode_size] [-z undo_file] device
Resizing rootfs partition.
/home/runner/work/arm-runner-action/arm-runner-action/.//cleanup_image.sh: line 48: * : syntax error: operand expected (error token is "* ")
Do you need to optimize the image afterwards to use it as an artifact? If not, you can try optimize_image: false.
Likewise, I wonder if the boot_partition option can help.
I'm not sure yet, does it make an output artifact significantly smaller?
Do you need a PR for Nvidia? Something like:
build_nvidia_orin:
  runs-on: ubuntu-22.04
  steps:
    - uses: actions/checkout@v3
    - name: Increase free space
      # Remove Android and dotnet
      run: |
        sudo rm -rf /usr/local/lib/android
        sudo rm -rf /usr/share/dotnet
        df -h
    - uses: ./ # pguyot/arm-runner-action@HEAD
      with:
        base_image: https://developer.nvidia.com/downloads/embedded/l4t/r35_release_v3.1/sd_card_b49/jp511-orin-nano-sd-card-image.zip
        rootpartition: 1
        commands: |
          cat /etc/os-release
          uname -a
Otherwise, I'm good with what I have now. Thank you for your support.
Thank you. I'm not sure it's a good idea to add this case to the CI of the action: the image is so large that the job takes about 7 minutes, and we already have a test with an Nvidia image.
Still, I used your example to debug the error message you are seeing, and I can confirm the root cause: you set the root partition, but you didn't set the boot partition, which is 1 by default. Both partitions therefore have the same index, and so far the action doesn't complain about it. I believe you want to have no boot partition, which is possible as shown in this existing test:
https://github.com/pguyot/arm-runner-action/blob/main/.github/workflows/test-partitions.yml
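The index clash described above could be caught before any loop device is touched. The following is only a sketch of such a pre-flight check, not code from the action: the function name check_partitions is hypothetical, and it assumes that an empty boot value means "no boot partition".

```shell
# Hypothetical pre-flight check, not part of the action itself.
# $1 = boot partition index (may be empty), $2 = root partition index.
# Fails when both point at the same partition number.
check_partitions() {
    boot=$1
    root=$2
    if [ -n "$boot" ] && [ "$boot" = "$root" ]; then
        echo "boot and root partition are both set to $boot" >&2
        return 1
    fi
    return 0
}

# Example: the configuration from this issue (boot defaults to 1, root set to 1)
#   check_partitions 1 1 || exit 1
```

With such a check the run would stop with an explicit message instead of failing later in e2fsck and resize2fs.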
Regarding optimizing the image, it depends on whether your pipeline uses the image as an artifact. If it doesn't, you may want to disable optimization of the image, as it's just wasted CPU cycles. I mentioned this because the image is very large, so I wondered if you really made it an artifact. See the following test:
https://github.com/pguyot/arm-runner-action/blob/main/.github/workflows/test-optimize_image.yml
I'm fine with it, I just couldn't find your Nvidia example. It is not in the list of supported images.
I have another comment on the filesystem size issue. At the moment there is no check whether the downloadable image is too big to be unzipped, and the build process just silently dies without any log message.
One way to resolve it could be to check the size of the file before downloading it with wget and compare it with the available space on the filesystem. E.g. in my case, the zipped image is 10G and the unzipped one is 20G, so 30G in total, while the available space is 24G.
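The comparison proposed here can be sketched in a few lines of shell. The function names below are hypothetical; available_bytes assumes GNU df (as found on the Ubuntu runners), and the 24 GiB figure in the example is the available space quoted above.

```shell
# Hypothetical helpers for a disk space check; not part of the action.

# Total bytes needed while both the archive and the extracted image exist.
# $1 = archive size in bytes, $2 = extracted size in bytes.
space_needed() {
    echo $(( $1 + $2 ))
}

# Succeeds only if $1 (needed bytes) fits into $2 (available bytes).
fits_in_space() {
    [ "$1" -le "$2" ]
}

# Available bytes on the filesystem holding $1 (default: current directory).
# Assumes GNU coreutils df, which supports --output and -B1.
available_bytes() {
    df --output=avail -B1 "${1:-.}" | tail -n 1 | tr -d ' '
}

# Example with the sizes from this issue:
#   needed=$(space_needed 11156299681 22144876544)
#   fits_in_space "$needed" "$(available_bytes .)" || echo "not enough disk space" >&2
```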
I tried wget with the --spider option. It works for the Ubuntu image, but not for the Nvidia one:
wget --spider https://cdimage.ubuntu.com/releases/22.04.2/release/ubuntu-22.04.2-preinstalled-server-arm64+raspi.img.xz
Spider mode enabled. Check if remote file exists.
--2023-05-05 12:58:28-- https://cdimage.ubuntu.com/releases/22.04.2/release/ubuntu-22.04.2-preinstalled-server-arm64+raspi.img.xz
Resolving cdimage.ubuntu.com (cdimage.ubuntu.com)... 185.125.190.40, 91.189.91.124, 185.125.190.37, ...
Connecting to cdimage.ubuntu.com (cdimage.ubuntu.com)|185.125.190.40|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1023925356 (976M) [application/x-xz]
Remote file exists.
wget --spider https://developer.nvidia.com/downloads/embedded/l4t/r35_release_v3.1/sd_card_b49/jp511-orin-nano-sd-card-image.zip
Spider mode enabled. Check if remote file exists.
--2023-05-05 13:00:27-- https://developer.nvidia.com/downloads/embedded/l4t/r35_release_v3.1/sd_card_b49/jp511-orin-nano-sd-card-image.zip
Resolving developer.nvidia.com (developer.nvidia.com)... 152.199.20.126
Connecting to developer.nvidia.com (developer.nvidia.com)|152.199.20.126|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
Remote file does not exist -- broken link!!!
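The Length value in the first output comes from the Content-Length response header. A rough sketch of extracting it from raw headers follows; the function name content_length is hypothetical, and feeding it from curl -sIL is an assumption, not something the action does. When the server rejects the header-only request, as the Nvidia URL does here, no such header arrives, the function prints nothing, and the caller has to fall back to another check.

```shell
# Hypothetical helper: read HTTP response headers on stdin and print the
# last Content-Length value seen (the last one matters after redirects).
# Prints nothing if no Content-Length header is present.
content_length() {
    awk 'tolower($1) == "content-length:" { gsub("\r", "", $2); len = $2 }
         END { if (len != "") print len }'
}

# Example (assumes curl is installed; -I asks for headers only, -L follows redirects):
#   size=$(curl -sIL "$url" | content_length)
#   [ -n "$size" ] || echo "size unknown, check after download instead" >&2
```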
But even a check after the download, e.g. whether 2 * (base_image size) exceeds the available space, would help.
Or check the uncompressed size with:
unzip -l jp511-orin-nano-sd-card-image.zip
Length Date Time Name
--------- ---------- ----- ----
22144876544 2023-03-20 04:38 sd-blob.img
--------- -------
22144876544 1 file
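The total on the last line of the listing is the number of bytes the extracted files will occupy. A small sketch of pulling that number out of the unzip -l output follows; the function name unzip_total_bytes is hypothetical.

```shell
# Hypothetical helper: read "unzip -l" output on stdin and print the total
# uncompressed size, i.e. the first field of the last non-empty line.
unzip_total_bytes() {
    awk 'NF { last = $1 } END { print last }'
}

# Example:
#   unzip -l jp511-orin-nano-sd-card-image.zip | unzip_total_bytes
```

The result can then be compared with the free space before running unzip.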
What do you think?
Nvidia images are large and GitHub runners now have less free space available, so the test-partitions workflow was broken. I fixed it by deleting stuff...