virt-pvm/linux

PVM guest kernel start failed after enabling FUNCTION_TRACER

Hi,

When I enable these two options in the kernel configuration file (.config):
CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y

the L2 PVM guest (L0 KVM, L1 PVM, L2 PVM) fails to start. With both options disabled, it starts normally.

The dmesg log of the L1 PVM host (a cloud virtual machine) shows messages like:

...
[  121.042351] loop0: detected capacity change from 0 to 21102592
[  121.092013] loop0: detected capacity change from 0 to 21102592
[  121.143412] loop0: detected capacity change from 0 to 21102592
[  121.289885] VFIO - User Level meta-driver version: 0.3
[  121.428387] loop0: detected capacity change from 0 to 21102592
[  121.478115] loop0: detected capacity change from 0 to 42074112
[  121.521191] loop0: detected capacity change from 0 to 8388608
[  175.312249] kvm_intel: VMX not supported by CPU 5
...

or, when booting the L2 PVM guest on bare metal (L1 log):

...
[21024.403726] KVM: debugfs: duplicate directory 100691-30
[21024.680447] KVM: debugfs: duplicate directory 100691-30
[21024.966014] KVM: debugfs: duplicate directory 100691-30
[21027.499152] kvm_create_vm_debugfs: 8 callbacks suppressed
[21027.499154] KVM: debugfs: duplicate directory 100691-30
[21027.782950] KVM: debugfs: duplicate directory 100691-30
[21028.063162] KVM: debugfs: duplicate directory 100691-30
[21028.351486] KVM: debugfs: duplicate directory 100691-30
...

I would like to know how well PVM currently supports FUNCTION_TRACER, and whether there are plans to support it.

Thanks

Hi, @ljrcore, sorry for the late reply. I'm not sure why I didn't receive a notification for this issue from GitHub.

I can reproduce it in my environment. The problem is that __fentry__ is called in pvm_update_pgtable() before pvm_relocate_kernel() has performed the relocation. When building as PIE, GCC references __fentry__ through the GOT instead of using a RIP-relative reference. The call therefore tries to access the compile-time address of __fentry__, which fails the hypervisor's check for disallowed address ranges. One fix is to remove the ftrace compile flags for arch/x86/platform/pvh/enlighten.c, as is already done for arch/x86/kernel/head64.c, since the PVH entry code is only used during early boot.

commit eb9e51fcb83b1a3494e2cb86ee9603fdadeefd96 (HEAD)
Author: Hou Wenlong <houwenlong.hwl@antgroup.com>
Date:   Mon Aug 5 14:14:16 2024 +0800

    x86/pvm: Don't profile PVH entry code

    When CONFIG_FUNCTION_TRACER is enabled, there will be a call to
    __fentry__ before performing relocation in pvm_update_pgtable().
    However, GCC will generate a GOT reference for __fentry__ under PIE
    building, causing it to try to access the compile-address of __fentry__,
    leading to booting failure. Since the PVH entry code is only used in
    early booting, the ftrace compile flags for the
    arch/x86/platform/pvh/enlighten.c file could be removed.

    Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
    Link: https://github.com/virt-pvm/linux/issues/11

diff --git a/arch/x86/platform/pvh/Makefile b/arch/x86/platform/pvh/Makefile
index 5dec5067c9fb..943c28304c31 100644
--- a/arch/x86/platform/pvh/Makefile
+++ b/arch/x86/platform/pvh/Makefile
@@ -1,5 +1,9 @@
 # SPDX-License-Identifier: GPL-2.0
 OBJECT_FILES_NON_STANDARD_head.o := y

+ifdef CONFIG_FUNCTION_TRACER
+CFLAGS_REMOVE_enlighten.o = -pg
+endif
+
 obj-$(CONFIG_PVH) += enlighten.o
 obj-$(CONFIG_PVH) += head.o

Actually, I believe I encountered and resolved the problem before, but it seems that I missed it in the patchset. :(

After testing, it works.

Thanks

How about using $(CC_FLAGS_FTRACE) to drop all ftrace flags:

diff --git a/arch/x86/platform/pvh/Makefile b/arch/x86/platform/pvh/Makefile
index 5dec5067c..3cfb3b269 100644
--- a/arch/x86/platform/pvh/Makefile
+++ b/arch/x86/platform/pvh/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 OBJECT_FILES_NON_STANDARD_head.o := y
 
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
 obj-$(CONFIG_PVH) += enlighten.o
 obj-$(CONFIG_PVH) += head.o

LGTM.

This will drop the ftrace flags for all objects in the current directory, although the directory currently contains only boot code. I would prefer to drop them only for enlighten.o for now, to minimize the impact. However, I don't expect any runtime code to be added there in the future, so I'm also okay with your suggestion.
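
For reference, the two proposals can be combined: keep the removal scoped to enlighten.o, but strip $(CC_FLAGS_FTRACE) rather than only -pg. The fragment below is an untested sketch of what arch/x86/platform/pvh/Makefile might look like in that case. It is not taken from either patch in this thread, and it assumes $(CC_FLAGS_FTRACE) expands to the full set of ftrace flags for the configured compiler.

```makefile
# SPDX-License-Identifier: GPL-2.0
OBJECT_FILES_NON_STANDARD_head.o := y

# Sketch only (not from either patch above): remove every ftrace flag,
# but only for enlighten.o, so that any other C object added to this
# directory later would still be traceable.
ifdef CONFIG_FUNCTION_TRACER
CFLAGS_REMOVE_enlighten.o = $(CC_FLAGS_FTRACE)
endif

obj-$(CONFIG_PVH) += enlighten.o
obj-$(CONFIG_PVH) += head.o
```

The per-object CFLAGS_REMOVE_<object>.o form limits the change to one file, while ccflags-remove-y applies to every object built from the directory. With only boot code present today, both should behave the same in practice.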