canonical/microk8s-core-addons

Kube-OVN addon install fails on some CPUs

Closed · 1 comment

Hi,

Summary

If your CPU lacks the avx2 or avx512 flags, the kube-ovn install fails with errors like:

Feb 19 12:12:44 k8s-01 kernel: traps: ovsdb-tool[3613626] trap invalid opcode ip:7f1ed71bcb48 sp:7ffd45c08b80 error:0 in libopenvswitch-2.17.so.0.0.2[7f1ed71b9000+17e000]
Feb 19 12:12:44 k8s-01 kernel: traps: ovsdb-tool[3613681] trap invalid opcode ip:7fb5453e9b48 sp:7ffd811505b0 error:0 in libopenvswitch-2.17.so.0.0.2[7fb5453e6000+17e000]
Feb 19 12:12:44 k8s-01 kernel: traps: ovsdb-tool[3613697] trap invalid opcode ip:7fdb26c4fb48 sp:7ffd0827d060 error:0 in libopenvswitch-2.17.so.0.0.2[7fdb26c4c000+17e000]
Feb 19 12:12:44 k8s-01 kernel: traps: ovsdb-tool[3613699] trap invalid opcode ip:7f9355fbab48 sp:7fff688677a0 error:0 in libopenvswitch-2.17.so.0.0.2[7f9355fb7000+17e000]
Feb 19 12:12:44 k8s-01 kernel: traps: ovsdb-server[3613706] trap invalid opcode ip:7facf9e85b48 sp:7ffce4314310 error:0 in libopenvswitch-2.17.so.0.0.2[7facf9e82000+17e000]
Feb 19 12:12:44 k8s-01 kernel: traps: ovn-nbctl[3613716] trap invalid opcode ip:7fc015bacb48 sp:7ffd20502770 error:0 in libopenvswitch-2.17.so.0.0.2[7fc015ba9000+17e000]
Feb 19 12:12:44 k8s-01 kernel: traps: ovsdb-tool[3613728] trap invalid opcode ip:7f6b16d0db48 sp:7ffcd3c58f90 error:0 in libopenvswitch-2.17.so.0.0.2[7f6b16d0a000+17e000]
Feb 19 12:12:44 k8s-01 kernel: traps: ovsdb-client[3613734] trap invalid opcode ip:7fa5612b5b48 sp:7ffce50e26c0 error:0 in libopenvswitch-2.17.so.0.0.2[7fa5612b2000+17e000]
Feb 19 12:12:44 k8s-01 kernel: traps: ovsdb-tool[3613754] trap invalid opcode ip:7f30adb39b48 sp:7ffd0c04dd70 error:0 in libopenvswitch-2.17.so.0.0.2[7f30adb36000+17e000]
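
One way to read these trap lines is to subtract the library's load address (the first number in the brackets) from the faulting instruction pointer. The sketch below does that for lines in the format quoted above; the function name `trap_offsets` and the regular expression are illustrative, not part of any MicroK8s or Kube-OVN tooling. For all nine lines in this report the result is the same offset, 0x3b48, inside libopenvswitch-2.17.so.0.0.2, which is consistent with every OVS/OVN binary hitting the same unsupported instruction.

```python
import re

# Matches kernel "invalid opcode" trap lines in the format quoted in this issue.
TRAP_RE = re.compile(
    r"traps: (?P<proc>[\w-]+)\[\d+\] trap invalid opcode "
    r"ip:(?P<ip>[0-9a-f]+) .* in (?P<lib>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+[0-9a-f]+\]"
)


def trap_offsets(log_text):
    """Return (process, library, offset) for each invalid-opcode trap line."""
    results = []
    for line in log_text.splitlines():
        m = TRAP_RE.search(line)
        if m:
            # Offset of the faulting instruction relative to the library mapping.
            offset = int(m["ip"], 16) - int(m["base"], 16)
            results.append((m["proc"], m["lib"], hex(offset)))
    return results


# Two of the lines from the log above, copied verbatim.
sample = """\
Feb 19 12:12:44 k8s-01 kernel: traps: ovsdb-tool[3613626] trap invalid opcode ip:7f1ed71bcb48 sp:7ffd45c08b80 error:0 in libopenvswitch-2.17.so.0.0.2[7f1ed71b9000+17e000]
Feb 19 12:12:44 k8s-01 kernel: traps: ovn-nbctl[3613716] trap invalid opcode ip:7fc015bacb48 sp:7ffd20502770 error:0 in libopenvswitch-2.17.so.0.0.2[7fc015ba9000+17e000]
"""

for entry in trap_offsets(sample):
    print(entry)
```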

This issue is known upstream (kubeovn/kube-ovn#1499), and the suggested solution is to use a specific image tag: kubeovn/kube-ovn:v1.10.0-no-avx512

What Should Happen Instead?

MicroK8s should be able to check for these CPU flags and select the right image to use.
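
A minimal sketch of what such a check might look like, not the addon's actual code. It parses the `flags` line of /proc/cpuinfo text and picks an image tag. Several things here are assumptions: the helper names `cpu_flags` and `select_image` are hypothetical, the default tag `v1.10.0` is inferred from the name of the upstream `-no-avx512` tag, and testing for `avx2` alone is a guess based on the working CPU in this report, which lists `avx2` but no avx512 flag.

```python
def cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Compare the whole key so that the separate 'vmx flags' line is skipped.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def select_image(flags, required=("avx2",), default_tag="v1.10.0",
                 fallback_suffix="-no-avx512"):
    """Pick the kube-ovn image; fall back if any required flag is missing.

    'required' and 'default_tag' are assumptions made for this sketch.
    """
    tag = default_tag
    if not set(required) <= set(flags):
        tag += fallback_suffix
    return "kubeovn/kube-ovn:" + tag


# Shortened flag lines modelled on the two CPUs in this report.
old_cpu = "flags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2 avx\nvmx flags\t: vnmi ept"
new_cpu = "flags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2 avx fma avx2 bmi2"

print(select_image(cpu_flags(old_cpu)))
print(select_image(cpu_flags(new_cpu)))
```

On a real node the input would come from reading /proc/cpuinfo, for example `cpu_flags(open("/proc/cpuinfo").read())`.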

Reproduction Steps

Install MicroK8s (1.28 or 1.29) with the kube-ovn addon enabled on a CPU without these flags:

Non-working CPU

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 45
model name      : Intel(R) Xeon(R) CPU E5-1650 0 @ 3.20GHz
stepping        : 7
microcode       : 0x71a
cpu MHz         : 1200.000
cache size      : 12288 KB
physical id     : 0
siblings        : 12
core id         : 0
cpu cores       : 6
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow flexpriority ept vpid xsaveopt dtherm ida arat pln pts vnmi md_clear flush_l1d
vmx flags       : vnmi preemption_timer invvpid ept_x_only ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest ple
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips        : 6384.74
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

Working CPU

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 142
model name      : Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
stepping        : 12
microcode       : 0xffffffff
cpu MHz         : 1992.005
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 21
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves flush_l1d arch_capabilities
bugs            : spectre_v1 spectre_v2 spec_store_bypass swapgs itlb_multihit srbds mmio_stale_data retbleed eibrs_pbrsb gds
bogomips        : 3984.01
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

If I modify the addon locally to change the image tag and enable it again, everything works as expected. (As a side note, the kube-ovn install should also include firewall rules for the OVN interfaces, as it does for Calico. I can create a separate issue for this if you think it is necessary.)
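
For illustration, the local change could be approximated by rewriting the image reference in the addon's manifest text before re-enabling it. This is only a sketch: the function name `use_no_avx512_image` is hypothetical, the manifest fragment below is made up for the example, and where the addon actually keeps its image reference is not stated in this issue.

```python
import re

# Matches the kube-ovn image reference and captures its tag.
IMAGE_RE = re.compile(r"(kubeovn/kube-ovn:)([\w.-]+)")


def use_no_avx512_image(manifest_text, suffix="-no-avx512"):
    """Append the suffix to every kube-ovn image tag that lacks it."""
    def repl(match):
        tag = match.group(2)
        if tag.endswith(suffix):
            return match.group(0)  # already patched, leave unchanged
        return match.group(1) + tag + suffix
    return IMAGE_RE.sub(repl, manifest_text)


# Hypothetical manifest fragment, for illustration only.
manifest = 'containers:\n  - name: kube-ovn\n    image: "kubeovn/kube-ovn:v1.10.0"\n'
patched = use_no_avx512_image(manifest)
print(patched)
```

Applying the function twice leaves the text unchanged, so it is safe to rerun.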

Can you suggest a fix?

MicroK8s should be able to check for these CPU flags and select the right image to use.

Are you interested in contributing with a fix?

I might be able to provide a fix with some directions on where to start.

Thanks for your issue and PR. You can see the follow-up in #285.