oracle/oracle-linux

OL9 Cloud Image Builder

malins opened this issue · 9 comments

malins commented

Hello!

I'm trying to run the cloud image builder on OL9 but it fails with an error.

$ ./bin/build-image.sh --env env.properties
+++ build-image.sh: Parse arguments
+++ build-image.sh: Load environment
+++ build-image.sh: Retrieve installation media OracleLinux-R9-U4-x86_64-dvd.iso
    build-image.sh: downloading https://yum.oracle.com/ISOS/OracleLinux/OL9/u4/x86_64/OracleLinux-R9-U4-x86_64-dvd.iso

+++ build-image.sh: Stage provisioning files
+++ build-image.sh: Stage kickstart file
+++ build-image.sh: Install Oracle Linux
WARNING  --os-type is deprecated and does nothing. Please stop using it.
WARNING  KVM acceleration not available, using 'qemu'

Starting install...
Retrieving 'vmlinuz'                                                                                                                                               |  13 MB  00:00:00     
Retrieving 'initrd.img'                                                                                                                                            | 101 MB  00:00:00     
Allocating 'OL9U4_x86_64-none-b0.qcow2'                                                                                                                            |  15 GB  00:00:00     
Removing disk 'OL9U4_x86_64-none-b0.qcow2'                                                                                                                         |         00:00:00     
ERROR    cannot set CPU affinity on process 36968: Invalid argument
Domain installation does not appear to have been successful.
If it was, you can restart your domain by running:
  virsh --connect qemu:///session start OL9U4_x86_64-none-b0
otherwise, please restart your installation.

The contents of env.properties:

WORKSPACE=/home/user/ws
DISTR=ol9-slim
ISO_URL=https://yum.oracle.com/ISOS/OracleLinux/OL9/u4/x86_64/OracleLinux-R9-U4-x86_64-dvd.iso
ISO_CHECKSUM=77034a4945474cb7c77820bd299cac9a557b8a298a5810c31d63ce404ad13c5e
CLOUD=none

I'm running this inside a virtual machine.

Any ideas what could be the issue here?

Thank you,
Manuel

WARNING KVM acceleration not available, using 'qemu'

It looks like your host VM is not configured to run nested VMs (or the bare metal host doesn't support it).

Is your host VM running on top of KVM, or are you using another hypervisor?

malins commented

My VM is running inside VirtualBox with Nested VT-X enabled.

$ cat /sys/module/kvm_intel/parameters/nested
Y

$ lsmod | grep kvm
kvm_intel             393216  0
kvm                  1142784  1 kvm_intel
irqbypass              16384  1 kvm

$ cat /proc/cpuinfo|grep vmx
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq monitor vmx ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti tpr_shadow flexpriority vpid fsgsbase avx2 invpcid rdseed clflushopt vnmi md_clear flush_l1d arch_capabilities
vmx flags       : vnmi flexpriority tsc_offset vtpr vapic

malins commented

I've used https://github.com/dpw/kvm-hello-world to test KVM on the machine and this works without any problems

Note that by default we have

# Number of CPUs for the build VM
CPU_NUM=4
# Memory allocated to the build VM
MEM_SIZE=8192

If you overcommit the number of CPUs, that might be the reason for the

ERROR    cannot set CPU affinity on process 36968: Invalid argument

you are getting.
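
A quick way to compare the two numbers before starting a build, as a minimal sketch: `cpu_num_fits` is a hypothetical helper, not part of the builder, and it only checks that `CPU_NUM` does not exceed the CPUs reported by `nproc`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of build-image.sh): succeeds when the
# requested vCPU count is no larger than the number of available CPUs.
cpu_num_fits() {
  local requested="$1"
  local available="${2:-$(nproc)}"   # default: CPUs visible to this shell
  [ "$requested" -le "$available" ]
}

# CPU_NUM defaults to 4 here, matching the builder default quoted above.
CPU_NUM="${CPU_NUM:-4}"
if cpu_num_fits "$CPU_NUM"; then
  echo "CPU_NUM=${CPU_NUM} fits within $(nproc) available CPUs"
else
  echo "CPU_NUM=${CPU_NUM} exceeds $(nproc) available CPUs; lower CPU_NUM or add vCPUs"
fi
```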

I am still unsure why you are getting the KVM/QEMU warning
(I am building myself in nested VMs, and it works without issues -- not in VirtualBox though)
I assume the permissions on /dev/kvm are correct (otherwise your sample would not work either)
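
To rule out the permissions question, something like the following could be run as the user who starts the build. It is only a sketch: `kvm_device_status` is a hypothetical helper, and a readable and writable `/dev/kvm` does not by itself guarantee that the installer will use KVM.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reports whether a device node exists and is
# readable and writable by the current user.
kvm_device_status() {
  local dev="${1:-/dev/kvm}"
  if [ ! -e "$dev" ]; then
    echo "missing"
  elif [ -r "$dev" ] && [ -w "$dev" ]; then
    echo "ok"
  else
    echo "no-access"
  fi
}

echo "/dev/kvm: $(kvm_device_status)"
```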

I will try to reproduce...

malins commented

After increasing the number of vCPUs in the VM to 4 and reducing CPU_NUM to 3, the error no longer appears.

But after 90 minutes (INSTALL_WAIT_TIME=90) the script reports that the timeout has been exceeded, while /usr/libexec/qemu-kvm is still active with high CPU usage (already at 113 minutes of CPU time). What is it doing that needs so much computation?

Thank you.

Do you still see the WARNING KVM acceleration not available, using 'qemu' message?
If so, it is running in emulation mode, which can be very slow.
(I assume it is not the case as you have a qemu-kvm process)

To give an idea, in my environment (nested VM in Oracle OCI) the total build time is 10-12 minutes; a bit more than half of that is for the initial install phase.

TBH, I never had great success in terms of performance with nested virt in VirtualBox, but that was a long time ago...

You might want to add SERIAL_CONSOLE=Yes in the config file to get more verbose output. You will see how fast or slow it is and whether it is stuck somewhere (the output might be buffered, so you won't always see messages "in real time").

malins commented

No, I don't see the "KVM acceleration not available" warning anymore.

After setting INSTALL_WAIT_TIME to 200, it finally went through successfully. Maybe I will later investigate why the initial phase takes so long. For now my issue has been solved; thank you for your assistance.
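
For reference, the settings changed or suggested in this thread could be collected as below. This is a sketch pieced together from the comments above, not a verified configuration, and it assumes env.properties is read as shell-style KEY=value assignments with `#` comment lines.

```shell
# Additions to env.properties gathered from this thread (illustrative only).

# Kept below the 4 vCPUs given to the VirtualBox guest.
CPU_NUM=3

# In minutes; a value of 90 timed out in this nested VirtualBox setup.
INSTALL_WAIT_TIME=200

# Suggested above for more verbose install output; optional.
SERIAL_CONSOLE=Yes
```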

Thanks for the feedback.

I am closing this issue, feel free to reopen or open a new one if needed.